Destroy unlocked and expired pipeline artifacts
What does this MR do and why?
Currently the SQL query for finding expired CI pipeline artifacts (not to be confused with CI job artifacts) can time out since it may have to perform a nested loop join between ci_pipeline_artifacts
with ci_pipelines
.
In !97194 (merged), we added a locked
column to speed up the finding of expirable artifacts. When a pipeline is unlocked, it will also unlock the associated pipeline artifacts (!97228 (merged)).
This commit takes advantage of this denormalization by modifying Ci::PipelineArtifacts::DestroyAllExpiredService
to destroy unlocked and expired artifacts with the help of this locked
column. Deleting these artifacts via this fast query can be enabled via the ci_destroy_unlocked_pipeline_artifacts
feature flag.
The legacy query runs afterwards to ensure that pipeline artifacts with the denormalized locked
column get a chance to be deleted.
Relates to #372500 (closed)
SQL queries
Legacy queries
SELECT "ci_pipeline_artifacts".* FROM "ci_pipeline_artifacts" INNER JOIN "ci_pipelines" ON "ci_pipelines"."id" = "ci_pipeline_artifacts"."pipeline_id" WHERE "ci_pipelines"."locked" = 0 AND "ci_pipeline_artifacts"."expire_at" < '2022-09-03 14:08:04.682420' LIMIT 100
gitlabhq_production=# explain (analyze, buffers) SELECT "ci_pipeline_artifacts".* FROM "ci_pipeline_artifacts" INNER JOIN "ci_pipelines" ON "ci_pipelines"."id" = "ci_pipeline_artifacts"."pipeline_id" WHERE "ci_pipelines"."locked" = 0 AND "ci_pipeline_artifacts"."expire_at" < '2022-09-03 14:08:04.682420' LIMIT 100;
Limit (cost=1.00..540.07 rows=100 width=169) (actual time=301019.448..304370.825 rows=1 loops=1)
Buffers: shared hit=3921598 read=1262175
I/O Timings: read=295778.275
-> Nested Loop (cost=1.00..2973129.73 rows=551532 width=169) (actual time=301019.447..304370.823 rows=1 loops=1)
Buffers: shared hit=3921598 read=1262175
I/O Timings: read=295778.275
-> Index Scan using index_ci_pipeline_artifacts_on_expire_at on ci_pipeline_artifacts (cost=0.43..148004.78 rows=861831 width=169) (actual time=2.472..7273.815 rows=861384 loops=1)
Index Cond: (expire_at < '2022-09-03 14:08:04.68242+00'::timestamp with time zone)
Buffers: shared hit=798784 read=77959
I/O Timings: read=5648.962
-> Index Scan using ci_pipelines_pkey on ci_pipelines (cost=0.57..3.28 rows=1 width=4) (actual time=0.344..0.344 rows=0 loops=861384)
Index Cond: (id = ci_pipeline_artifacts.pipeline_id)
Filter: (locked = 0)
Rows Removed by Filter: 1
Buffers: shared hit=3122814 read=1184216
I/O Timings: read=290129.313
Planning Time: 3.915 ms
Execution Time: 304370.874 ms
New queries
SELECT "ci_pipeline_artifacts".* FROM "ci_pipeline_artifacts" WHERE "ci_pipeline_artifacts"."expire_at" < '2022-09-22 17:40:55.175054' AND "ci_pipeline_artifacts"."locked" = 0 LIMIT 100
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.42..35.25 rows=100 width=171) (actual time=0.023..0.527 rows=100 loops=1)
Buffers: shared hit=452
-> Index Scan using ci_pipeline_artifacts_on_expire_at_for_removal on ci_pipeline_artifacts (cost=0.42..62604.77 rows=179750 width=171) (actual time=0.021..0.515 rows=100 loops=1)
Index Cond: (expire_at < '2022-09-22 17:40:55.175054+00'::timestamp with time zone)
Buffers: shared hit=452
Planning Time: 0.171 ms
Execution Time: 0.558 ms
(7 rows)
How to set up and validate locally
- Set up a project with test coverage: https://docs.gitlab.com/ee/ci/testing/test_coverage_visualization.html
- Run a few pipelines.
- In
gdk psql
, check thatSELECT COUNT(*) FROM ci_artifacts_pipelines
. - Manually run
Ci::PipelineArtifacts::ExpireArtifactsWorker.new.perform
. - Repeat step 3. See that the artifacts have been pruned.
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
-
I have evaluated the MR acceptance checklist for this MR.