Long-running transaction in Security::StoreScansWorker when deduplicating security_findings
Summary
The Security::StoreScansWorker
sidekiq worker can spend a lot of time in a single postgres transaction when there are many matching security_findings
rows to be deduplicated.
The statements in those transactions look like:
UPDATE "security_findings"
SET "deduplicated" = $1
WHERE "security_findings"."scan_id" = $2
AND "security_findings"."deduplicated" = $3
AND "security_findings"."id" >= $4
AND "security_findings"."id" < $5
In a recently observed case, the time spent in a single transaction was longer than 10 minutes.
Impact
Long-running transactions negatively impact the overall performance of the database, as they inhibit autovacuum from making progress.
Sample incident:
Recommendation
Break up the update queries into smaller batches with one transaction per batch. Also consider breaking up the sidekiq jobs into smaller batches to avoid them getting killed during deployments.
Implementation Plan
-
Remove transaction from here. -
Consider increasing the DEDUPLICATE_BATCH_SIZE to reduce the # of round trips to the database. -
Add test coverage to ensure that the body of this block is being executed.
Verification
No easy metric to check this unfortunately. A "longest transaction" log field would help here.