Stop the increase of unlocked expired artifacts waiting for removal
This is step 1 of a 2-step solution; see #322817 (closed) for step 2.
Problem to solve
We have around 800TB
(80 millions records) of removable artifacts that can't be removed because of inefficient database queries.
Ci::DestroyExpiredJobArtifactsService
is triggered every 7 minutes to remove expired and unlocked artifacts. It looks at 100000 artifacts that have the expire_at
value less than the current timestamp and it selects the ones that are unlocked
. This is not efficient because it can't reach artifacts with an older expire_at
value. It's not possible to add another index for this query because we need to join three large tables to gather the data.
Proposal
We need to stop the increasing number of expired artifacts waiting for removal as a first iteration before we clean up old expired artifacts (#322817 (closed)) in a background migration.
Ci::UnlockArtifactsService
returns the pipeline ids that it unlocks. We should use this service and the pipeline ids to update the expire_at
timestamp for the artifacts to a more recent value. This will push the removable artifact at the top of the queue making the removal process more efficient.
diff --git a/app/services/ci/unlock_artifacts_service.rb b/app/services/ci/unlock_artifacts_service.rb
index 07faf90dd6d..9f8d9416c7d 100644
--- a/app/services/ci/unlock_artifacts_service.rb
+++ b/app/services/ci/unlock_artifacts_service.rb
@@ -17,7 +17,9 @@ def execute(ci_ref, before_pipeline = nil)
SQL
loop do
- break if ActiveRecord::Base.connection.exec_query(query).empty?
+ result = ActiveRecord::Base.connection.exec_query(query)
+ break if result.empty?
+ refresh_expire_at_timestamp(result)
end
end
@@ -29,5 +31,12 @@ def collect_pipelines(ci_ref, before_pipeline)
pipeline_scope.artifacts_locked
end
+
+ # This will be more complex in the actual implementation because a batch of 100 pipelines for the GitLab project has 32808 artifacts, so it needs some form of batching.
+ # https://console.postgres.ai/gitlab/gitlab-production-tunnel/sessions/2706/commands/8325
+ def refresh_expire_at_timestamp(result)
+ Ci::JobArtifact
+ .expired_before(1.hour.ago)
+ .where(job_id: Ci::Build.in_pipelines(result.rows))
+ .update_all(expire_at: 5.seconds.ago) # keep them expired.
+ end
end
end