Update dependency proxy API to use cleanup worker
🌵 Context
In !70029 (merged), we added cleanup policies to the Dependency Proxy to allow users to configure a regular deletion of the files in their cache.
There is also an API endpoint to fully purge the Dependency Proxy cache.
Currently, the API endpoint kicks off a background job that simply destroys all of the cached files (dependency_proxy_blobs and dependency_proxy_manifests). Since the cleanup policies added a background job that regularly deletes expired blobs and manifests, we can reuse that job to make this API endpoint more efficient: the endpoint only needs to mark the records as expired, which is a simple database update.
In addition to improving ~performance, this also addresses a bug.
🔎 What does this MR do and why?
- Update the Dependency Proxy purge cache worker to use the newer, more optimized file deletion.
- Remove the lease restrictions from the API, since the deletion will be much faster from the database standpoint.
I considered removing the PurgeDependencyProxyCacheWorker altogether and moving the UPDATE queries directly into the API. However, if a group has a large number of blob/manifest records to update, the update may take a few seconds to complete.
For example, if we have a group with 5000 blob records, we update them in batches of 100, which gives 50 batches. If each update takes 100ms (see database analysis below), the overall update will take 5 seconds to complete.
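The estimate above can be reproduced with a little arithmetic. The following is a minimal sketch: `estimated_update_ms` is a hypothetical helper, and the 100 ms per batch figure is the assumption stated above rather than a guarantee.

```ruby
# Hypothetical helper: rough duration of a batched update, in milliseconds.
# Assumes every batch takes the same time, which real queries will not.
def estimated_update_ms(record_count, batch_size:, ms_per_batch:)
  batches = (record_count + batch_size - 1) / batch_size # ceiling division
  batches * ms_per_batch
end

# 5000 records in batches of 100, at an assumed 100 ms per batch.
total_ms = estimated_update_ms(5000, batch_size: 100, ms_per_batch: 100)
puts "#{total_ms / 1000.0} seconds"
```

A few seconds is too long to hold an API request open, which is the argument for keeping the update in a background worker.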
🐘 Database
Note: the examples below all use dependency_proxy_blobs. The dependency_proxy_manifests table has the same structure as dependency_proxy_blobs and will perform similarly. A group will always have more dependency_proxy_blobs than dependency_proxy_manifests, so we can expect the blobs table to be the slower of the two.
Queries generated by @group.dependency_proxy_blobs.each_batch(of: UPDATE_BATCH_SIZE)
SELECT id FROM dependency_proxy_blobs WHERE group_id = 9970 ORDER BY id ASC LIMIT 1;
https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/7645/commands/27154
SELECT id FROM dependency_proxy_blobs WHERE group_id = 9970 AND id >= 4461 ORDER BY id ASC LIMIT 1 OFFSET 100;
https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/7645/commands/27157
Query generated by batch.update_all(status: :expired)
UPDATE "dependency_proxy_blobs"
SET "status" = 1
WHERE "dependency_proxy_blobs"."group_id" = 9970
AND "dependency_proxy_blobs"."id" >= 4461
AND "dependency_proxy_blobs"."id" < 8831;
https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/7645/commands/27156
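To make the batching pattern behind these queries concrete, here is a plain-Ruby sketch that runs without Rails or a database. It imitates the id-window approach visible in the SQL above: find the first id, look ahead by the batch size to get an exclusive upper bound, then update everything inside that window. `Blob`, `batch_windows`, and `expire_in_batches` are hypothetical names for illustration and are not the worker's actual code.

```ruby
# In-memory stand-in for a dependency_proxy_blobs row (hypothetical).
Blob = Struct.new(:id, :status)

# Returns [start_id, stop_id] windows over the sorted ids. stop_id is
# exclusive and nil for the final window, mirroring the
# `id >= start AND id < stop` shape of the UPDATE above.
def batch_windows(ids, batch_size)
  sorted = ids.sort
  windows = []
  index = 0
  while index < sorted.length
    windows << [sorted[index], sorted[index + batch_size]]
    index += batch_size
  end
  windows
end

# Marks every record as expired, one id window at a time.
# Returns the number of windows (one UPDATE statement each).
def expire_in_batches(blobs, batch_size)
  windows = batch_windows(blobs.map(&:id), batch_size)
  windows.each do |start_id, stop_id|
    blobs.each do |blob|
      in_window = blob.id >= start_id && (stop_id.nil? || blob.id < stop_id)
      blob.status = :expired if in_window
    end
  end
  windows.length
end

# Sample ids are made up; they only echo the bounds seen in the query above.
blobs = [4461, 4470, 4502, 8831, 8840].map { |id| Blob.new(id, :default) }
update_count = expire_in_batches(blobs, 3)
puts "#{update_count} updates, statuses: #{blobs.map(&:status).uniq.inspect}"
```

Because each window is bounded by ids rather than by an offset, every UPDATE touches a fixed-size slice no matter how many records the group has.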
📸 Screenshots or screen recordings
Before:
[16] pry(main)> Group.find(181).dependency_proxy_manifests.map(&:status)
=> ["default", "default"]
[27] pry(main)> Group.find(181).dependency_proxy_blobs.map(&:status)
=> ["default", "default", "default"]
API request:
→ curl --request DELETE -H "PRIVATE-TOKEN: <token>" "http://gdk.test:3001/api/v4/groups/181/dependency_proxy/cache"
202
After:
[19] pry(main)> Group.find(181).dependency_proxy_manifests.map(&:status)
=> ["expired", "expired"]
[20] pry(main)> Group.find(181).dependency_proxy_blobs.map(&:status)
=> ["expired", "expired", "expired"]
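For scripted checks, the same purge request could be built from Ruby with the standard library's Net::HTTP. This is a sketch under assumptions: `build_purge_request` is a hypothetical helper, and the host, group id, and token are the placeholders from the example above. It only constructs the request; sending it is left commented out so the snippet runs without a GDK instance.

```ruby
require "net/http"
require "uri"

# Hypothetical helper: builds (but does not send) the purge request.
def build_purge_request(base_url, group_id, token)
  uri = URI.join(base_url, "/api/v4/groups/#{group_id}/dependency_proxy/cache")
  request = Net::HTTP::Delete.new(uri)
  request["PRIVATE-TOKEN"] = token
  [uri, request]
end

uri, request = build_purge_request("http://gdk.test:3001", 181, "<token>")
puts "#{request.method} #{uri}"

# To actually send it against a running instance:
# response = Net::HTTP.start(uri.hostname, uri.port) { |http| http.request(request) }
# puts response.code # the example above shows 202
```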
How to set up and validate locally
- Create a group
- Log into the dependency proxy (you can use your username/password for credentials, or username/personal_access_token).
docker login gdk.test:3001
- Use the Dependency Proxy to pull a number of images through the group:
docker pull gdk.test:3001/<group_full_path>/dependency_proxy/containers/nginx:latest
docker pull gdk.test:3001/<group_full_path>/dependency_proxy/containers/node:latest
- Navigate to the group Dependency Proxy page (group -> Packages & Registries -> Dependency Proxy) to view the pulled images. You can also view the records in the rails console:
Group.last.dependency_proxy_manifests
Group.last.dependency_proxy_blobs
- Make a request to the purge API:
curl --request DELETE -H "Private-Token: <personal_access_token>" "http://gdk.test:3001/api/v4/groups/<group_id>/dependency_proxy/cache"
- Check in the Dependency Proxy UI to make sure the images are no longer visible. You can also check that all of the records are now expired in the rails console:
Group.last.dependency_proxy_blobs.map(&:status)
Group.last.dependency_proxy_manifests.map(&:status)
- (optional) If you'd like to check that they get deleted, you can run the background jobs that delete the blobs and manifests, then check the remaining record counts:
DependencyProxy::CleanupBlobWorker.perform_at(1.second)
DependencyProxy::CleanupManifestWorker.perform_at(1.second)
Group.last.dependency_proxy_blobs.size
Group.last.dependency_proxy_manifests.size
✏ MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.
Related to #348176