API to bulk delete artifacts
Release notes
Setting an aggressive expiration policy for artifacts is one strategy for managing storage consumption, but sometimes you need the ability to reduce storage right away. While previously you may have relied on a script to automate the tedious task of calling the API to delete artifacts one by one, now you can use a new endpoint to bulk delete job artifacts.
https://docs.gitlab.com/ee/api/job_artifacts.html#delete-artifacts
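As a rough illustration of what a single bulk call might look like, the sketch below builds (but does not send) the request in Python. It assumes the endpoint is `DELETE /projects/:id/artifacts` under the `/api/v4` prefix, as described in the linked documentation, and that authentication uses a `PRIVATE-TOKEN` header. The host `gitlab.example.com`, the project ID, and the helper name `build_bulk_delete_request` are placeholders for illustration only.

```python
from urllib.parse import quote
from urllib.request import Request


def build_bulk_delete_request(base_url: str, project_id, token: str) -> Request:
    """Build (without sending) a request to delete a project's job artifacts.

    Assumption: the bulk endpoint is DELETE /api/v4/projects/:id/artifacts.
    """
    # The project may be given as a numeric ID or as a path such as
    # "group/project"; the path form must be URL-encoded, including "/".
    encoded_id = quote(str(project_id), safe="")
    url = f"{base_url.rstrip('/')}/api/v4/projects/{encoded_id}/artifacts"
    return Request(url, method="DELETE", headers={"PRIVATE-TOKEN": token})


# Placeholder host, project ID, and token.
request = build_bulk_delete_request("https://gitlab.example.com", 42, "<your_access_token>")
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send the request. Building it separately keeps the example safe to run without touching a real project.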
Problem to solve
Today there is no method to clean up old artifacts in bulk, short of writing your own script using the API. This presents a challenge, as artifacts are a major driver of storage consumption and many users today do not have any expiration policy set (currently, artifacts never expire by default).
Even future work to set a default of one-year expiration may not clean up old artifacts quickly enough to reduce usage below limits defined for storage quotas.
Intended users
User experience goal
Give users an API for deleting artifacts in bulk, so they do not have to write their own scripts against the API or delete artifacts by hand one by one.
Proposal
Create an API to bulk delete artifacts per project. The scope of work for this issue is to delete artifacts for a given project via an API endpoint.
Out of scope (follow-ups)
Add parameters to bulk delete endpoints:
- delete_all: should allow for the deletion of non-erasable artifacts (trace files)
- delete_tags: also delete artifacts for tags
- keep_n: keep the last N artifacts for successful pipelines on each branch
- delete_orphaned: delete artifacts for all orphaned branches and tags (no longer present in the Git repository)
- older_than: delete older than
Delete all of the artifacts for a given group.
Technical approach
Extract the part which removes expired artifacts into a new service that accepts an array of job artifacts. This service could be reused for this API. Note we should remove only erasable artifacts by default.
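The shape of such a service can be sketched as follows. This is a minimal Python illustration, not the actual implementation: the `JobArtifact` fields, the `destroy_batch` name, and the `delete_fn` callback are hypothetical, and the sketch assumes that trace files are the only non-erasable artifact type.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List

# Assumption: only trace files are non-erasable.
NON_ERASABLE_FILE_TYPES = frozenset({"trace"})


@dataclass(frozen=True)
class JobArtifact:
    # Hypothetical minimal representation of a job artifact record.
    id: int
    job_id: int
    file_type: str


def select_erasable(artifacts: Iterable[JobArtifact]) -> List[JobArtifact]:
    """Return only the artifacts that may be removed by default."""
    return [a for a in artifacts if a.file_type not in NON_ERASABLE_FILE_TYPES]


def destroy_batch(
    artifacts: Iterable[JobArtifact],
    delete_fn: Callable[[JobArtifact], None],
) -> List[JobArtifact]:
    """Delete the erasable artifacts in the batch and return what was deleted.

    delete_fn stands in for whatever actually removes the record and its file.
    """
    erasable = select_erasable(artifacts)
    for artifact in erasable:
        delete_fn(artifact)
    return erasable


batch = [
    JobArtifact(id=1, job_id=10, file_type="archive"),
    JobArtifact(id=2, job_id=10, file_type="trace"),
    JobArtifact(id=3, job_id=11, file_type="junit"),
]
deleted: List[JobArtifact] = []
destroy_batch(batch, deleted.append)
print([a.id for a in deleted])
```

Because the service takes a plain collection of artifacts, both the expiry job and the new API endpoint could pass in their own selection and share the same filtering step.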
Further details
We are working to apply a default expiration, but to avoid deleting important customer data there is a long grace period during which users can retain their artifacts: https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/10177. Currently, artifacts will not expire sooner than 1 year.
The support team has a workaround process to do something similar, documented at: https://docs.gitlab.com/ee/administration/job_artifacts.html#delete-job-artifacts-from-jobs-completed-before-a-specific-date
Permissions and Security
Who can use this API for a project is critical. Similar to the existing CI/CD permissions related to deleting a project, use of this API should be limited to maintainers and owners.
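The rule above could be expressed as a simple access-level check. The sketch below is illustrative only: the numeric values mirror GitLab's standard access levels (Guest 10 through Owner 50), while the function name `can_bulk_delete_artifacts` is hypothetical and not part of any existing API.

```python
from enum import IntEnum


class AccessLevel(IntEnum):
    # Numeric values follow GitLab's standard project access levels.
    GUEST = 10
    REPORTER = 20
    DEVELOPER = 30
    MAINTAINER = 40
    OWNER = 50


def can_bulk_delete_artifacts(level: AccessLevel) -> bool:
    """Allow bulk deletion only for maintainers and owners."""
    return level >= AccessLevel.MAINTAINER


for level in AccessLevel:
    print(level.name, can_bulk_delete_artifacts(level))
```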
Documentation
Availability & Testing
- Unit and feature tests for the new API endpoint are required for each parameter. Ensure each delete action does not delete trace files.
- No End-to-End test required
What does success look like, and how can we measure that?
- A user is able to delete all erasable artifacts with one API request
- The API endpoint is documented