Create worker to store security reports by project
What does this MR do and why?
Technical context
`UPSERT` queries require acquiring locks on unique index tuples. This causes lock contention if multiple processes try to `UPSERT` records with the same unique attributes. The lock contention makes each process wait for the other to complete.
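The effect can be illustrated without a database. The following is a minimal sketch, assuming a hypothetical in-memory `FakeUniqueIndex` class in which each unique key has its own lock; it is not how PostgreSQL implements index locking, and the key names and hold time are made up for illustration.

```ruby
# Hypothetical in-memory stand-in for a table with a unique index:
# each unique key gets its own lock, loosely mimicking index tuple locks.
class FakeUniqueIndex
  def initialize
    @guard = Mutex.new
    @locks = {}
    @rows = {}
  end

  # Takes the per-key lock, holds it for `hold` seconds to simulate the
  # rest of the enclosing transaction, then writes the row.
  def upsert(key, value, hold:)
    lock = @guard.synchronize { @locks[key] ||= Mutex.new }
    lock.synchronize do
      sleep(hold)
      @guard.synchronize { @rows[key] = value }
    end
  end
end

# Runs one thread per key and returns the elapsed wall-clock seconds.
def time_parallel_upserts(keys, hold:)
  index = FakeUniqueIndex.new
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  threads = keys.map do |key|
    Thread.new { index.upsert(key, "value", hold: hold) }
  end
  threads.each(&:join)
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

hold = 0.05
same_key = time_parallel_upserts(["CVE-1"] * 4, hold: hold)
distinct_keys = time_parallel_upserts(%w[CVE-1 CVE-2 CVE-3 CVE-4], hold: hold)

puts format("same key: %.2fs, distinct keys: %.2fs", same_key, distinct_keys)
```

With four threads writing the same key, the writes are forced to run one after another, so the elapsed time is roughly four times the hold time. With four distinct keys, the threads do not wait on each other.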
Historical context
The `StoreSecurityReportsWorker` job has the lock-contention issue described above. It was discovered in this production incident.
Further details from the incident root cause analysis
We run `StoreSecurityReportsWorker` for each pipeline for the default branch if it has security reports. In that worker, within a transaction, we create records in several tables. One of those tables is `vulnerability_identifiers`, and the query that creates the records for that table is an `UPSERT` query.
There were many pipelines for a single project, which caused many `StoreSecurityReportsWorker` jobs to run in parallel. Each of them tried to `UPSERT` the same records, causing lock contention and long transaction times.[1]
We have implemented a temporary solution in !147816 (merged).
This MR
The short-term solution resolves the lock contention by, in effect, making the jobs run sequentially. However, going through these jobs sequentially can take a long time (on the order of 3.25 to 20 hours[2]).
In this change, we implement a medium-term solution that replaces the problematic job with a similar job that can make use of our existing Sidekiq deduplication tooling.
With this change, even if a single project suddenly has many pipelines created for it, only one job will be scheduled and the rest will be deduplicated.
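The idea can be sketched with a toy queue. This is a minimal illustration, assuming a hypothetical `DeduplicatingQueue` class that drops a job when an identical one (same worker name and arguments) is already waiting; it is not the actual Sidekiq deduplication tooling. The job arguments (a pipeline ID for the original job, a project ID for the new one) and the ID values are also assumptions made for the example.

```ruby
require "set"

# Hypothetical in-memory queue, not GitLab's or Sidekiq's implementation.
class DeduplicatingQueue
  def initialize
    @pending = []
    @pending_keys = Set.new
  end

  # Enqueues the job unless an identical one (same worker and arguments)
  # is already waiting. Returns true if scheduled, false if dropped.
  def enqueue(worker_name, *args)
    key = [worker_name, args]
    return false unless @pending_keys.add?(key)

    @pending << key
    true
  end

  # Removes and returns the next job. From this point on, an identical
  # job may be enqueued again.
  def dequeue
    key = @pending.shift
    @pending_keys.delete(key) if key
    key
  end

  def size
    @pending.size
  end
end

PROJECT_ID = 42 # made-up project ID

by_pipeline = DeduplicatingQueue.new
by_project = DeduplicatingQueue.new

# 100 pipelines are created for the same project in quick succession.
(1..100).each do |pipeline_id|
  # Assumed original shape: arguments differ per pipeline, nothing is dropped.
  by_pipeline.enqueue("StoreSecurityReportsWorker", pipeline_id)
  # Assumed new shape: arguments are identical per project, duplicates are dropped.
  by_project.enqueue("StoreSecurityReportsByProjectWorker", PROJECT_ID)
end

puts by_pipeline.size # prints 100
puts by_project.size  # prints 1
```

Because the per-project job carries the same arguments no matter which pipeline triggered it, the duplicate check has something to match on. Jobs keyed by pipeline never look identical, so none of them can be dropped.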
Follow-up work
- [Feature flag] Rollout of `deduplicate_security...` (#460476 - closed) • Michael Becker • 17.2
- remove the feature flag: !156374 (merged)
- delete original job: !156374 (merged)
- rename new job from `StoreSecurityReportsByProjectWorker` to `StoreSecurityReportsWorker`
Future Plans
We had originally intended to use the `pipeline_metadata` table to pass around the pipeline info for this MR. However, we could not use that table because `metadata` isn't guaranteed to exist and we cannot create it if it doesn't.
MR acceptance checklist
Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Related to: #452005 (closed)