[Draft] Purge `security_findings` for stale security scans
What does this MR do and why?
This MR introduces a background job to delete security_findings
for the previously marked as purged security scans by Mark stale `security_scans` as `purged` (!82711 - merged).
Related to Implement retention period for Security::Findin... (#351524 - closed).
Database Review
Rake command outputs
rake db:migrate:up
== 20220523084003 ScheduleDeletingSecurityFindingsForPurgedSecurityScans: migrating
-- transaction_open?()
-> 0.0000s
-- Scheduled 0 PurgeStaleSecurityScans jobs with a maximum of 10000 records per batch and an interval of 120 seconds.
The migration is expected to take at least 0 seconds. Expect all jobs to have completed after 2022-05-23 14:00:10 UTC."
== 20220523084003 ScheduleDeletingSecurityFindingsForPurgedSecurityScans: migrated (0.0324s)
rake db:migrate:down
== 20220523084003 ScheduleDeletingSecurityFindingsForPurgedSecurityScans: reverting
== 20220523084003 ScheduleDeletingSecurityFindingsForPurgedSecurityScans: reverted (0.0000s)
Queries used
Scheduler migration
Finding the last executed scanner ID from the previous migration
SELECT
"background_migration_jobs".*
FROM
"background_migration_jobs"
WHERE
"background_migration_jobs"."class_name" = 'PurgeStaleSecurityScans'
ORDER BY
"background_migration_jobs"."id" DESC
LIMIT 1
Limit (cost=0.42..6.68 rows=1 width=121) (actual time=12.270..12.273 rows=1 loops=1)
Buffers: shared read=4 dirtied=1
I/O Timings: read=12.055 write=0.000
-> Index Scan using background_migration_jobs_pkey on public.background_migration_jobs (cost=0.42..11587.03 rows=1849 width=121) (actual time=12.267..12.268 rows=1 loops=1)
Filter: (background_migration_jobs.class_name = 'PurgeStaleSecurityScans'::text)
Rows Removed by Filter: 0
Buffers: shared read=4 dirtied=1
I/O Timings: read=12.055 write=0.000
https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/10218/commands/36272
Getting the lower bound of the first batch
SELECT
"security_findings"."id"
FROM
"security_findings"
WHERE (scan_id <= 17729347)
ORDER BY
"security_findings"."id" ASC
LIMIT 1
Limit (cost=0.57..0.70 rows=1 width=8) (actual time=18.397..18.400 rows=1 loops=1)
Buffers: shared read=5
I/O Timings: read=18.271 write=0.000
-> Index Scan using security_findings_pkey on public.security_findings (cost=0.57..56443171.48 rows=447522256 width=8) (actual time=18.394..18.395 rows=1 loops=1)
Filter: (security_findings.scan_id <= 17729347)
Rows Removed by Filter: 0
Buffers: shared read=5
I/O Timings: read=18.271 write=0.000
https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/10218/commands/36274
Getting the lower bound of the next batch
SELECT
"security_findings"."id"
FROM
"security_findings"
WHERE (scan_id <= 17729347)
AND "security_findings"."id" >= 1439732
ORDER BY
"security_findings"."id" ASC
LIMIT 1 OFFSET 10000
Limit (cost=1305.98..1306.11 rows=1 width=8) (actual time=1067.122..1067.125 rows=1 loops=1)
Buffers: shared hit=4561 read=704 dirtied=4
I/O Timings: read=1036.905 write=0.000
-> Index Scan using security_findings_pkey on public.security_findings (cost=0.57..58419589.72 rows=447522256 width=8) (actual time=0.057..1062.882 rows=10001 loops=1)
Index Cond: (security_findings.id >= 1439732)
Filter: (security_findings.scan_id <= 17729347)
Rows Removed by Filter: 0
Buffers: shared hit=4561 read=704 dirtied=4
I/O Timings: read=1036.905 write=0.000
https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/10218/commands/36275
Background migration
Getting the lower bound of the first batch
Getting the lower bound of the next batch
Deleting the records within the batch
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
-
I have evaluated the MR acceptance checklist for this MR.
Edited by Mehmet Emin INAC