Enormous workloads for ReactiveCachingWorker for Projects::MergeRequestsController#sast_reports
GitLab.com has been receiving alerts for Thread Contention from Sidekiq processes: https://gitlab.slack.com/archives/CD6HFD1L0/p1619686007402300
Digging into these reports, the problem appears to be long-running ReactiveCachingWorker jobs invoked from Projects::MergeRequestsController#sast_reports:
https://log.gprd.gitlab.net/goto/9275350a8eaf33d8cdf85396bff17507
- Jobs run for up to an hour (Sidekiq jobs should not run for more than 10 minutes)
- These jobs are almost completely CPU-bound: they spend nearly all their time on-thread, with relatively few calls to external services, Redis, Postgres, etc.
- They use a staggering amount of memory: up to 50GB per invocation
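Ruby doesn't handle this type of multi-threading well: under MRI's global VM lock, only one thread per process can execute Ruby code at a time, so a CPU-bound job takes execution time away from every other thread in that process. These jobs are noisy neighbours and will slow down other jobs running in the same processes.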
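To illustrate the contention, here is a minimal sketch, assuming MRI (CRuby). The `cpu_bound` helper, the iteration count and the job count are made up for the demonstration and have nothing to do with the actual SAST report code. It runs the same CPU-bound work sequentially and then on several threads. On MRI the threaded run is typically no faster, because the threads take turns holding the lock. Exact timings depend on the machine.

```ruby
require 'benchmark'

# Purely CPU-bound work: no I/O, so the thread never blocks on anything
# that would let other threads make progress in the meantime.
def cpu_bound(n)
  sum = 0
  n.times { |i| sum += i * i }
  sum
end

ITERATIONS = 2_000_000 # arbitrary size, chosen only for the demo
JOBS = 4               # arbitrary number of concurrent "jobs"

sequential_results = nil
sequential = Benchmark.realtime do
  sequential_results = Array.new(JOBS) { cpu_bound(ITERATIONS) }
end

threaded_results = nil
threaded = Benchmark.realtime do
  threads = Array.new(JOBS) { Thread.new { cpu_bound(ITERATIONS) } }
  threaded_results = threads.map(&:value) # waits for each thread to finish
end

puts format('sequential: %.2fs, threaded: %.2fs', sequential, threaded)
puts "same results: #{sequential_results == threaded_results}"
```

In a Sidekiq process the other threads are unrelated jobs, so the time a long CPU-bound job spends holding the lock shows up as added latency for those jobs.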
Implementation plan
- Revert changes from !54608 (diffs)
- Put them back under a feature flag
- Enable feature flag for gitlab-com
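The plan above can be sketched as follows. This is an illustration only: `DemoFeature` is a stand-in written for this example (it is not GitLab's `Feature` class), and the flag name and the `report_strategy` method are hypothetical. The idea is that after the revert the old behaviour is the default, and the reverted changes run only for actors that have the flag enabled.

```ruby
require 'set'

# Stand-in flag store so the sketch is runnable on its own.
# It tracks, per flag name, the set of actors the flag is enabled for.
module DemoFeature
  @actors = Hash.new { |hash, name| hash[name] = Set.new }

  def self.enable(name, actor)
    @actors[name].add(actor)
  end

  def self.enabled?(name, actor)
    @actors[name].include?(actor)
  end
end

# Hypothetical flag name, not taken from the issue.
FLAG = :demo_reintroduced_sast_changes

# Decides which code path a given group gets.
def report_strategy(group)
  if DemoFeature.enabled?(FLAG, group)
    :reintroduced_changes # the reverted changes, restored behind the flag
  else
    :reverted_behaviour   # default after the revert
  end
end

puts report_strategy('gitlab-com')   # flag not enabled yet
DemoFeature.enable(FLAG, 'gitlab-com')
puts report_strategy('gitlab-com')   # flag enabled for this group only
puts report_strategy('another-group')
```

Scoping the flag to one group keeps the change reversible: if the long-running jobs come back, the flag can be switched off without another revert.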
Edited by Michał Zając