Add background jobs for cleanup policies for packages
⚙ Context
The Packages Registry works with these core models (simplified):
flowchart LR
Group -- 1:n --> Project
Project -- 1:n --> Package
Package -- 1:n --> PackageFile
PackageFile -- 1:1 --> os((Object Storage file))
For some package formats, we allow republishing a package version. When that happens, the new package files are appended to the existing package.
Over time, some packages can end up with many package files. All these package files take up space on Object Storage.
With #346153 (closed), we're kicking off the work for adding cleanup policies for packages. Basically, users will be allowed to define cleanup rules and the backend will regularly execute the policy to remove packages/package files that are not kept by the policy.
In true iteration spirit, the first iteration will have a single rule. Users will be able to define how many duplicated package files (by filename) need to be kept.
Example: for a Maven package, a pom.xml is uploaded on each publication. If you publish the same version 100 times, you end up with 100 pom.xml package files. Users will be able to state that they only want to keep the 10 most recent pom.xml files.
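The rule in the example above can be sketched in plain Ruby. This is only an illustration, not the service from !90395: the `PackageFile` struct, the `mark_duplicates_for_destruction` helper and the use of `created_at` to decide what "most recent" means are all assumptions made for the example.

```ruby
# Hypothetical stand-in for a package file row (not the real model).
PackageFile = Struct.new(:id, :file_name, :created_at, :status)

# For each file name, keep the `keep` most recent files and mark the
# older duplicates as pending destruction. Returns the same array.
def mark_duplicates_for_destruction(files, keep:)
  files.group_by(&:file_name).each_value do |group|
    # Newest first; the id breaks ties between identical timestamps.
    newest_first = group.sort_by { |f| [f.created_at, f.id] }.reverse
    newest_first.drop(keep).each { |f| f.status = 'pending_destruction' }
  end
  files
end

# Usage: five publications of the same pom.xml, keep the 2 most recent.
files = (1..5).map { |i| PackageFile.new(i, 'pom.xml', Time.at(i), 'default') }
mark_duplicates_for_destruction(files, keep: 2)
```

Files with a unique name are never touched, since their group has fewer entries than the number to keep.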
For this feature, there are several backend parts:
1. The policy model. That's !85918 (merged).
2. Expose the policy object through GraphQL. That's !87799 (merged).
3. The execute policy service. That's !90395 (merged).
4. The background job that executes the cleanup policies (through the service added in (3.)). 👈 This MR
This is issue #346153 (closed).
As stated above, this MR focuses on introducing all the background worker changes that will execute policies. The execution itself is handled by a service that was introduced with !90395 (merged).
Here, the goal is to collect the policies that need to be run and execute them one by one. For this, we're going to leverage the LimitedCapacity::Worker concern. As time passes, a backlog of policies that need to be executed builds up (each policy has a next_run_at column). This backlog is processed by a number of concurrent jobs that loop on themselves (re-enqueue) until the backlog is empty. The number of concurrent jobs is a new application setting, so that we can fine-tune the pressure on Sidekiq and self-managed admins can adjust this number to their setup.
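To make the backlog idea concrete, here is a small plain-Ruby simulation. It is not the LimitedCapacity::Worker concern itself: `Policy`, `BacklogRunner` and the notion of "rounds" are hypothetical, and real jobs run concurrently on Sidekiq rather than in a sequential loop.

```ruby
# Hypothetical stand-in for a cleanup policy row.
Policy = Struct.new(:project_id, :next_run_at)

# Simulates `capacity` job slots draining a backlog of due policies.
class BacklogRunner
  attr_reader :executed

  def initialize(policies, capacity:, now: Time.now)
    # The backlog is every policy whose next_run_at is in the past.
    @backlog = policies.select { |p| p.next_run_at <= now }.sort_by(&:next_run_at)
    @capacity = capacity
    @executed = []
  end

  # How many policies are still waiting to be executed.
  def remaining_work_count
    @backlog.size
  end

  # One job handles exactly one policy.
  def perform_work
    policy = @backlog.shift
    @executed << policy.project_id if policy
  end

  # Each round models the running jobs finishing and re-enqueueing
  # themselves while work remains. Returns the number of rounds.
  def run
    return 0 unless @capacity.positive?

    rounds = 0
    until @backlog.empty?
      [@capacity, remaining_work_count].min.times { perform_work }
      rounds += 1
    end
    rounds
  end
end
```

With a capacity of 2 and 5 due policies, the simulation needs 3 rounds; raising the capacity shortens the drain at the cost of more simultaneous jobs, which is the trade-off the application setting exposes.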
This is all great, but we need a way to kickstart the "loop" of self-enqueueing jobs. For this, we're going to use a cron job that regularly checks whether there are policies to execute. If there are some, it will enqueue the limited capacity job that will start the "loop".
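The kickstart check can be sketched as follows. This is a plain-Ruby illustration under assumptions: `Policy`, `CronCheck` and the `enqueue` callback are hypothetical names, and treating a policy as disabled when keep_n_duplicated_package_files is 'all' is inferred from the partial index condition shown in the database review of this MR.

```ruby
# Hypothetical stand-in for a cleanup policy row.
Policy = Struct.new(:project_id, :keep_n_duplicated_package_files, :next_run_at)

# Models the cron job: count runnable policies and, if there are any,
# enqueue one limited capacity job to start the loop.
class CronCheck
  def initialize(policies, enqueue:)
    @policies = policies
    @enqueue = enqueue # callable standing in for the real enqueue call
  end

  # Returns the number of runnable policies (useful as a metric).
  def perform(now: Time.now)
    runnable_count = @policies.count { |policy| runnable?(policy, now) }
    @enqueue.call if runnable_count.positive?
    runnable_count
  end

  private

  # Assumption: enabled means "not 'all'", due means next_run_at has passed.
  def runnable?(policy, now)
    policy.keep_n_duplicated_package_files != 'all' && policy.next_run_at <= now
  end
end
```

The cron job only needs to start the loop once; after that, the limited capacity jobs keep themselves going until the backlog is empty.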
🔬 What does this MR do and why?
- Introduce the Packages::Cleanup::ExecutePolicyWorker as a LimitedCapacity::Worker.
- The capacity for this worker is set by a newly introduced application setting: package_registry_cleanup_policies_worker_capacity.
  - Expose this application setting on the usual endpoint.
- That worker will use the existing Packages::Cleanup::ExecuteService.
- Update the Packages::CleanupPackageRegistryWorker worker to kick off Packages::Cleanup::ExecutePolicyWorker if necessary.
  - Also, this parent worker will dump metrics on policies (how many are runnable).
- Update the cleanup policy model to support the background worker.
- Add the relevant database index to get the runnable cleanup policies.
- Update the relevant specs.
🖼 Screenshots or screen recordings
n / a
📐 How to set up and validate locally
We're going to create a bunch of dummy packages with duplicated package files. We will then create a packages cleanup policy that keeps only 1 duplicated package file, meaning that only the most recent one will be kept.
We don't want to wait for the next_run_at of the policy to be reached, so we will modify it to make the policy runnable.
Finally, we will run the cron job that will kick off the limited capacity job. That job will execute our policy and mark the intended package files for destruction.
Let's get started. In a rails console:
- Follow this to define a fixture_file_upload function.
- Let's create 3 packages:

    project = Project.first
    pkg1 = FactoryBot.create(:generic_package, project: project)
    pkg2 = FactoryBot.create(:generic_package, project: project)
    pkg3 = FactoryBot.create(:generic_package, project: project)

- Let's add some dummy files:

    FactoryBot.create(:package_file, :generic, package: pkg1, file_name: 'file_for_pkg1.txt')
    2.times { FactoryBot.create(:package_file, :generic, package: pkg2, file_name: 'file_for_pkg2.txt') }
    3.times { FactoryBot.create(:package_file, :generic, package: pkg3, file_name: 'file_for_pkg3.txt') }

- Check the created files (check the status column):

    pkg1.reload.package_files
    pkg2.reload.package_files
    pkg3.reload.package_files

- Create the packages cleanup policy that will keep only 1 duplicated package file:

    project.packages_cleanup_policy.update!(keep_n_duplicated_package_files: '1')
    policy = project.packages_cleanup_policy

- Make the policy runnable. We need to use update_column as there is a callback on save that updates the next_run_at for the next execution. We need to avoid executing that callback.

    policy.update_column(:next_run_at, 2.minutes.ago)

- Run the cron job:

    Packages::CleanupPackageRegistryWorker.new.perform

- Let's re-inspect the files:

    pkg1.reload.package_files
    pkg2.reload.package_files
    pkg3.reload.package_files

- The most recent package file has status: 'default' and all the others have status: 'pending_destruction'. That's the expected behavior ✅
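Instead of checking the statuses by eye, the last step could be automated with a small helper. This is a sketch and not part of the MR: `cleanup_applied?` is a hypothetical name, and it assumes each object responds to file_name, created_at and status, with "most recent" decided by created_at.

```ruby
# Returns true when, for every file name, the `keep` most recent files
# have status 'default' and all older duplicates are 'pending_destruction'.
def cleanup_applied?(package_files, keep: 1)
  package_files.group_by(&:file_name).all? do |_name, group|
    newest_first = group.sort_by(&:created_at).reverse
    kept = newest_first.first(keep)
    removed = newest_first.drop(keep)

    kept.all? { |f| f.status == 'default' } &&
      removed.all? { |f| f.status == 'pending_destruction' }
  end
end

# In the rails console from the steps above, this could be called as:
#   cleanup_applied?(pkg3.reload.package_files, keep: 1)
```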
🚦 MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.
💾 Database review
⤴ Migration up
main: == 20220713175658 AddPackagesCleanupPoliciesWorkerCapacityToApplicationSettings: migrating
main: -- add_column(:application_settings, :package_registry_cleanup_policies_worker_capacity, :integer, {:default=>2, :null=>false})
main: -> 0.0042s
main: == 20220713175658 AddPackagesCleanupPoliciesWorkerCapacityToApplicationSettings: migrated (0.0050s)
main: == 20220713175737 AddApplicationSettingsPackagesCleanupPoliciesWorkerCapacityConstraint: migrating
main: -- transaction_open?()
main: -> 0.0000s
main: -- current_schema()
main: -> 0.0009s
main: -- transaction_open?()
main: -> 0.0000s
main: -- execute("ALTER TABLE application_settings\nADD CONSTRAINT app_settings_pkg_registry_cleanup_pol_worker_capacity_gte_zero\nCHECK ( package_registry_cleanup_policies_worker_capacity >= 0 )\nNOT VALID;\n")
main: -> 0.0022s
main: -- current_schema()
main: -> 0.0003s
main: -- execute("SET statement_timeout TO 0")
main: -> 0.0003s
main: -- execute("ALTER TABLE application_settings VALIDATE CONSTRAINT app_settings_pkg_registry_cleanup_pol_worker_capacity_gte_zero;")
main: -> 0.0013s
main: -- execute("RESET statement_timeout")
main: -> 0.0003s
main: == 20220713175737 AddApplicationSettingsPackagesCleanupPoliciesWorkerCapacityConstraint: migrated (0.0193s)
main: == 20220713175812 AddEnabledPoliciesIndexToPackagesCleanupPolicies: migrating =
main: -- transaction_open?()
main: -> 0.0000s
main: -- index_exists?(:packages_cleanup_policies, [:next_run_at, :project_id], {:where=>"keep_n_duplicated_package_files <> 'all'", :name=>"idx_enabled_pkgs_cleanup_policies_on_next_run_at_project_id", :algorithm=>:concurrently})
main: -> 0.0025s
main: -- add_index(:packages_cleanup_policies, [:next_run_at, :project_id], {:where=>"keep_n_duplicated_package_files <> 'all'", :name=>"idx_enabled_pkgs_cleanup_policies_on_next_run_at_project_id", :algorithm=>:concurrently})
main: -> 0.0029s
main: == 20220713175812 AddEnabledPoliciesIndexToPackagesCleanupPolicies: migrated (0.0123s)
⤵ Migration down
main: == 20220713175812 AddEnabledPoliciesIndexToPackagesCleanupPolicies: reverting =
main: -- transaction_open?()
main: -> 0.0000s
main: -- indexes(:packages_cleanup_policies)
main: -> 0.0052s
main: -- execute("SET statement_timeout TO 0")
main: -> 0.0003s
main: -- remove_index(:packages_cleanup_policies, {:algorithm=>:concurrently, :name=>"idx_enabled_pkgs_cleanup_policies_on_next_run_at_project_id"})
main: -> 0.0035s
main: -- execute("RESET statement_timeout")
main: -> 0.0003s
main: == 20220713175812 AddEnabledPoliciesIndexToPackagesCleanupPolicies: reverted (0.0167s)
main: == 20220713175737 AddApplicationSettingsPackagesCleanupPoliciesWorkerCapacityConstraint: reverting
main: -- transaction_open?()
main: -> 0.0000s
main: -- transaction_open?()
main: -> 0.0000s
main: -- execute("ALTER TABLE application_settings\nDROP CONSTRAINT IF EXISTS app_settings_pkg_registry_cleanup_pol_worker_capacity_gte_zero\n")
main: -> 0.0018s
main: == 20220713175737 AddApplicationSettingsPackagesCleanupPoliciesWorkerCapacityConstraint: reverted (0.0121s)
main: == 20220713175658 AddPackagesCleanupPoliciesWorkerCapacityToApplicationSettings: reverting
main: -- remove_column(:application_settings, :package_registry_cleanup_policies_worker_capacity, :integer, {:default=>2, :null=>false})
main: -> 0.0035s
main: == 20220713175658 AddPackagesCleanupPoliciesWorkerCapacityToApplicationSettings: reverted (0.0056s)