Unwrap Sidekiq exceptions and jobs
What does this MR do?
Unwrap sidekiq exceptions in logs
Closes gitlab-com/gl-infra/scalability#640 (closed)
In the Sidekiq structured logs, this masks the underlying error class and message, making it difficult to determine what the underlying cause of the failure was. While digging into Sentry stuff, I found this interesting pieces of data. You can view the data here if they are not flushed away.
- In 97% of the cases, when jobs can be retried, the first line always logs
Sidekiq::JobRetry::Skip
. The real exception is logged in the second consecutive line. Sadly, that line doesn't contain thejob_status
information. So, the root cause is nearly invisible from the normal query. - In the rest 3%, the exception class is attached inside the first line. The second line becomes redundant.
This PR is to merge those two logs into one:
- Unwrap the exception from
Sidekiq::JobRetry::Skip
exceptions - Add backtrace to the Structured Logs
- Remove
Gitlab::SidekiqLogging::ExceptionHandler
Unwrap ActiveJob::QueueAdapters::SidekiqAdapter::JobWrapper in logs and metric middlewares
Closes gitlab-com/gl-infra/scalability#435 (closed)
Any job created by ActiveJob is wrapped into ActiveJob::QueueAdapters::SidekiqAdapter::JobWrapper
. It doesn't provide much information when looking at logs or metric dashboards. This MR also unwraps this wrapper to reveal the real underlying workers
Screenshots (strongly suggested)
Before - Sidekiq logs two lines of error logs.
The first line doesn't have the backtrace. The error class and error message is wrapped.
The second line includes the backtrace, but some information are missing, especially job_status
, error_class
, error_message
.
After - Sidekiq logs a single line of error log
Before - Sidekiq logs Active Job wrapper
After - Sidekiq logs wrapped class of Active Job
Does this MR meet the acceptance criteria?
Conformity
-
Changelog entry -
Documentation (if required) -
Code review guidelines -
Merge request performance guidelines -
Style guides -
Database guides -
Separation of EE specific content
Availability and Testing
-
Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process. -
Tested in all supported browsers -
Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
-
Label as security and @ mention @gitlab-com/gl-security/appsec
-
The MR includes necessary changes to maintain consistency between UI, API, email, or other methods -
Security reports checked/validated by a reviewer from the AppSec team