Optimize purging stale security scan records
What does this MR do and why?
This change will two effects on the current logic;
- Fixes a query that is related to purging stale
security_scans
records and timing out on production. - Changes the way we purge stale
security_scans
records to and purge all the scans of a pipeline together by setting the samecreated_at
for all thesecurity_scans
related to the same pipeline.
Database review
Query details
Previously timing out query
SELECT
"security_scans"."id"
FROM
"security_scans"
WHERE ("security_scans"."status" = $1
OR "security_scans"."status" = $2)
AND (created_at < $3)
ORDER BY
"security_scans"."id" ASC
LIMIT $4
And its execution plan(https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/12892/commands/45284);
Limit (cost=0.44..1.24 rows=1 width=8) (actual time=341675.932..341675.936 rows=1 loops=1)
Buffers: shared hit=7719221 read=619015 dirtied=54076 written=657
I/O Timings: read=320717.885 write=49.814
-> Index Scan using security_scans_pkey on public.security_scans (cost=0.44..2227992.61 rows=2773130 width=8) (actual time=341675.924..341675.924 rows=1 loops=1)
Filter: ((security_scans.created_at < '2022-08-03 11:43:41.78491+00'::timestamp with time zone) AND ((security_scans.status = 1) OR (security_scans.status = 5)))
Rows Removed by Filter: 17156902
Buffers: shared hit=7719221 read=619015 dirtied=54076 written=657
I/O Timings: read=320717.885 write=49.814
New query which has less conditions
SELECT
"security_scans".*
FROM
"security_scans"
WHERE (created_at < '2022-08-05 13:28:28.404336')
AND "security_scans"."status" != 6
ORDER BY
"security_scans"."id" ASC
LIMIT 1
And its execution plan with the index(https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/12950/commands/45478);
Limit (cost=0.43..0.61 rows=1 width=71) (actual time=0.059..0.060 rows=1 loops=1)
Buffers: shared hit=4
I/O Timings: read=0.000 write=0.000
-> Index Scan using test_index_3 on public.security_scans (cost=0.43..535479.56 rows=2981803 width=71) (actual time=0.058..0.058 rows=1 loops=1)
Filter: (security_scans.created_at < '2022-08-05 13:28:28.404336+00'::timestamp with time zone)
Rows Removed by Filter: 0
Buffers: shared hit=4
I/O Timings: read=0.000 write=0.000
Migration task outputs
rake db:migrate:up
main: == 20221103131409 AddPartialIndexOnPrimaryKeyOfSecurityScans: migrating =======
main: -- transaction_open?()
main: -> 0.0000s
main: -- index_exists?(:security_scans, :id, {:where=>"status != 6", :name=>:index_security_scans_on_id_for_non_purged_records, :algorithm=>:concurrently})
main: -> 0.0046s
main: -- execute("SET statement_timeout TO 0")
main: -> 0.0002s
main: -- add_index(:security_scans, :id, {:where=>"status != 6", :name=>:index_security_scans_on_id_for_non_purged_records, :algorithm=>:concurrently})
main: -> 0.0018s
main: -- execute("RESET statement_timeout")
main: -> 0.0003s
main: == 20221103131409 AddPartialIndexOnPrimaryKeyOfSecurityScans: migrated (0.0112s)
rake db:migrate:down
main: == 20221103131409 AddPartialIndexOnPrimaryKeyOfSecurityScans: reverting =======
main: -- transaction_open?()
main: -> 0.0000s
main: -- indexes(:security_scans)
main: -> 0.0032s
main: -- execute("SET statement_timeout TO 0")
main: -> 0.0002s
main: -- remove_index(:security_scans, {:algorithm=>:concurrently, :name=>:index_security_scans_on_id_for_non_purged_records})
main: -> 0.0044s
main: -- execute("RESET statement_timeout")
main: -> 0.0002s
main: == 20221103131409 AddPartialIndexOnPrimaryKeyOfSecurityScans: reverted (0.0124s)
Related to Implement retention period for Security::Findin... (#351524 - closed).
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
-
I have evaluated the MR acceptance checklist for this MR.
Edited by Mehmet Emin INAC