Consider shortening the interval of the registry cleanup job
🔥 Problem
Package file destruction is done in two steps:
1. Mark the package file as `pending_destruction`.
2. Actually destroy the package file.
(2.) is handled by a limited-capacity background job: it detects the package files marked as `pending_destruction` and destroys the first one. Then, it re-enqueues itself.
In other words, the background job doing (2.) loops until the backlog of package files marked as `pending_destruction` is processed.
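The loop described above can be sketched in plain Ruby. This is only an illustration, not the actual worker: `PackageFile`, `CleanupLoopSketch`, and the in-memory queue are hypothetical stand-ins for the real model, job, and job queue.

```ruby
# Hypothetical in-memory stand-in for a package file record.
PackageFile = Struct.new(:id, :status)

# Hypothetical sketch of a job that destroys one file per run and re-enqueues itself.
class CleanupLoopSketch
  attr_reader :runs

  def initialize(package_files)
    @package_files = package_files
    @queue = [] # stands in for the background job queue
    @runs = 0
  end

  # One job execution: destroy the first pending file, then re-enqueue.
  def perform
    @runs += 1
    file = @package_files.find { |f| f.status == :pending_destruction }
    return if file.nil? # backlog is empty, so the loop stops here

    @package_files.delete(file) # stands in for the actual destruction
    @queue << :cleanup          # the job re-enqueues itself
  end

  # What the cron job does: start the loop only if there is a backlog.
  def kickstart
    @queue << :cleanup if @package_files.any? { |f| f.status == :pending_destruction }
    perform while @queue.shift
  end
end
```

Note that the last run finds nothing to destroy and simply returns, which is what ends the loop until the next kickstart.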
That's great, but how do we kickstart the loop? Well, we have a cron job that regularly checks whether the background job loop for (2.) needs to be started.
The current interval is very long: `12.hours`. This means that we can have up to `12.hours` between steps (1.) and (2.).
Given the monitoring on the cron job (the max p95 duration over the last 7 days is 4.5 seconds), I think we can safely shorten that interval.
🚒 Solution
Update the schedule of the package registry cleanup cron job so that it runs every `1.hour`.
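To illustrate the effect on the worst-case delay, here is a rough sketch in plain Ruby. It assumes the cron job ticks at exact multiples of the interval and that no cleanup loop is already running when the file is marked. `wait_until_next_tick` is a hypothetical helper, not part of the codebase.

```ruby
HOUR = 3600 # seconds

# Seconds between marking a package file and the next cron tick.
# Assumption: ticks happen at multiples of `interval` seconds, and a file
# marked exactly on a tick misses that run and waits a full interval.
def wait_until_next_tick(marked_at, interval)
  interval - (marked_at % interval)
end

# A file marked one second after a tick waits almost the full interval.
current_wait  = wait_until_next_tick(1, 12 * HOUR) # with the 12-hour schedule
proposed_wait = wait_until_next_tick(1, 1 * HOUR)  # with the 1-hour schedule
```

Under these assumptions the wait is bounded by the interval itself, so the proposed schedule caps the gap between (1.) and the start of (2.) at one hour instead of twelve.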
🔮 Side effects
The less time we have between (1.) and (2.), the lower the chance of hitting situations like Unable to immediatelly transfer project to anot... (#370834 - closed).