Filter sensitive Sidekiq arguments
## What does this MR do?
Sidekiq arguments can show up in a few places:
- In Sentry.
- In logs, if `SIDEKIQ_LOG_ARGUMENTS=1` is set: https://docs.gitlab.com/ee/administration/troubleshooting/sidekiq.html#log-arguments-to-sidekiq-jobs
- In `exceptions_json.log`: https://docs.gitlab.com/ee/administration/logs.html#exceptions_jsonlog
This MR applies the same filtering to all of those places. Because Sidekiq arguments are positional, we can't use Rails's parameter filtering as there is no argument 'name'. Instead, we do this:
- By default, only numeric arguments are allowed. All other arguments are replaced with the string '[FILTERED]'.
- We provide some overrides for worker classes to permit arguments to be logged by position if they're known-safe.
Additionally, in some cases we'd send a `jobstr` to Sentry or logs. This was a JSON string representing the job hash (containing the arguments). While we could parse this, filter the arguments, and convert back to JSON, instead we just ignore it - it's not been useful in our case.
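To make the filtering rule above concrete, here is a minimal sketch of the per-argument decision (illustrative only - the method and variable names are made up, and this is not the code added by this MR). The permitted positions are the zero-based indices a worker declares, for example via `loggable_arguments 3`:

```ruby
# Illustrative sketch only, not the actual implementation in this MR.
# `args` are the positional Sidekiq job arguments; `permitted_positions`
# are the zero-based indices a worker has marked as safe to log.
def filter_arguments(args, permitted_positions)
  args.each_with_index.map do |arg, index|
    next arg if arg.is_a?(Numeric)                  # numeric arguments are always allowed
    next arg if permitted_positions.include?(index) # explicitly permitted by the worker

    '[FILTERED]'
  end
end

filter_arguments([1, 0, 'abcde', 'vwxyz'], [3])
# => [1, 0, "[FILTERED]", "vwxyz"]
```

(The logging side also converts arguments to strings, which is why the log output below shows `"1"` and `"0"`.)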
## How do I test this locally?
- Enable Sentry in `config/gitlab.yml`. You can use my project if you like: https://sentry.gitlab.net/gitlab/sean-test/
- Apply this patch. It simply adds a new worker, `ErrorWorker`, and configures it so the last argument (position 3, `not_secret`) is permitted to be logged:

  ```diff
  diff --git a/app/workers/error_worker.rb b/app/workers/error_worker.rb
  new file mode 100644
  index 00000000000..3ac9e5ba22b
  --- /dev/null
  +++ b/app/workers/error_worker.rb
  @@ -0,0 +1,20 @@
  +# frozen_string_literal: true
  +
  +class ErrorWorker
  +  include ApplicationWorker
  +
  +  sidekiq_options queue: :merge, retry: false
  +  loggable_arguments 3
  +
  +  def perform(foo, bar, secret, not_secret)
  +    foo / bar
  +  rescue => e
  +    # This will:
  +    # 1. Log to exceptions_json.log
  +    # 2. Send to Sentry
  +    # 3. Re-raise and then:
  +    #    1. Log to sidekiq_json.log
  +    #    2. Send to Sentry again
  +    Gitlab::ErrorTracking.track_and_raise_exception(e)
  +  end
  +end
  ```
- Restart `rails-background-jobs`.
- Run `tail -f log/exceptions_json.log | grep 'vwxyz'` and `gdk tail rails-background-jobs | grep 'vwxyz'`.
- Run `bundle exec rails r "ErrorWorker.perform_async(1, 0, 'abcde', 'vwxyz')"`.
You should see `abcde` filtered from the logs:
```shell
$ tail -f log/exceptions_json.log | grep 'vwxyz'
{"severity":"ERROR","time":"2020-06-11T09:41:54.319Z","correlation_id":"fc92b1900151b86ad7e5d06f34ada9a1","extra.sidekiq":{"class":"ErrorWorker","args":["1","0","[FILTERED]","vwxyz"],"retry":false,"queue":"merge","backtrace":true,"jid":"a4b23fb337180a9ccbab4616","created_at":1591868513.138161,"correlation_id":"fc92b1900151b86ad7e5d06f34ada9a1","enqueued_at":1591868513.1604931},"exception.class":"ZeroDivisionError","exception.message":"divided by 0","exception.backtrace":["app/workers/error_worker.rb:10:in `/'","app/workers/error_worker.rb:10:in `perform'","lib/gitlab/with_request_store.rb:17:in `enabling_request_store'","lib/gitlab/with_request_store.rb:10:in `with_request_store'","lib/gitlab/sidekiq_daemon/monitor.rb:49:in `within_job'"]}

$ gdk tail rails-background-jobs | grep 'vwxyz'
(in /Users/smcgivern/Code/gdk)
2020-06-11_09:41:53.33645 rails-background-jobs : {"severity":"INFO","time":"2020-06-11T09:41:53.336Z","class":"ErrorWorker","args":["1","0","[FILTERED]","vwxyz"],"retry":0,"queue":"merge","backtrace":true,"jid":"a4b23fb337180a9ccbab4616","created_at":"2020-06-11T09:41:53.138Z","correlation_id":"fc92b1900151b86ad7e5d06f34ada9a1","enqueued_at":"2020-06-11T09:41:53.160Z","pid":79168,"message":"ErrorWorker JID-a4b23fb337180a9ccbab4616: start","job_status":"start","scheduling_latency_s":0.175625}
2020-06-11_09:41:54.34315 rails-background-jobs : {"severity":"WARN","time":"2020-06-11T09:41:54.342Z","class":"ErrorWorker","args":["1","0","[FILTERED]","vwxyz"],"retry":0,"queue":"merge","backtrace":true,"jid":"a4b23fb337180a9ccbab4616","created_at":"2020-06-11T09:41:53.138Z","correlation_id":"fc92b1900151b86ad7e5d06f34ada9a1","enqueued_at":"2020-06-11T09:41:53.160Z","pid":79168,"message":"ErrorWorker JID-a4b23fb337180a9ccbab4616: fail: 1.006461 sec","job_status":"fail","scheduling_latency_s":0.175625,"db_count":0,"db_write_count":0,"db_cached_count":0,"duration_s":1.006461,"cpu_s":0.278141,"completed_at":"2020-06-11T09:41:54.342Z","error_message":"divided by 0","error_class":"ZeroDivisionError","db_duration_s":0.004778}
2020-06-11_09:41:54.97960 rails-background-jobs : {"severity":"WARN","time":"2020-06-11T09:41:54.979Z","error_class":"ZeroDivisionError","error_message":"divided by 0","context":"Job raised exception","class":"ErrorWorker","args":["1","0","[FILTERED]","vwxyz"],"retry":0,"queue":"merge","backtrace":true,"jid":"a4b23fb337180a9ccbab4616","created_at":"2020-06-11T09:41:53.138Z","correlation_id":"fc92b1900151b86ad7e5d06f34ada9a1","enqueued_at":"2020-06-11T09:41:53.160Z","db_count":0,"db_write_count":0,"db_cached_count":0,"error_backtrace":["app/workers/error_worker.rb:10:in `/'","app/workers/error_worker.rb:10:in `perform'","lib/gitlab/with_request_store.rb:17:in `enabling_request_store'","lib/gitlab/with_request_store.rb:10:in `with_request_store'","lib/gitlab/sidekiq_daemon/monitor.rb:49:in `within_job'"]}
```
In Sentry, you'll see two events for the same exception. The one with a correlation ID comes from `Gitlab::ErrorTracking`. Both of them should have the `abcde` argument filtered, like they do here: https://sentry.gitlab.net/gitlab/sean-test/issues/1644810/