Fix Debian background job
What does this MR do and why?
quoting @10io
for #340307 (closed):
Tests on staging showed that background jobs for the debian registry are not working as expected with object storage.
The main takeaway is that the background job code is not compatible with object storage. Next, the current code will not properly handle large files (see this bug)
Here are the details:
🐛 When object storage is enabled, the first background job (that processes the.changes
file) is not working. (blocker)
- This is coming on the actions in the package file object. We update the
package
reference but we can't simplypackage_file.update!(package: package)
. This will break the link between thepackage_file
and the physical file in object storage.
^ Fixed (using ::Packages::UpdatePackageFileService
).
- I see several uses of
#use_file
. This is known to leave thepackage_file
in an improper state that can't be saved. Basically, ifpackage_file.file.use_file
is called, thenpackage_file.save!
orpackage_file.update!
can't be called (triggers a validation error).🔧 Solution: package file updates where the package id or file_name are modified must be handled in a custom way. This is the result of the work on a similar nuget bug. In short, use this. Also useuse_open_file
which is better thanuse_file
.
^ Fixed (#use_file
replaced by #use_open_file
).
🐛 When an error occurs during the first background job, the files are left behind in theincoming
package. Those files takes physical space in object storage.
- The situation is the following, when uploading a debian package, multiple files are uploaded and the last upload is the
.changes
file.- All the package files are put in the same package: a versionless package named
incoming
.- The problem with this approach is that if the worker fails, it will delete the
.changes
file but not the others.
^ not fixed in this MR. Proposal: cleanup old incoming files with a cronjob (see Prune dangling incoming files
from #300917 (closed))
🐛 When a package is uploaded and it references acodename
not present in the project distributions, the second job is still enqueued.
- I'm not sure why the background job doesn't simply fail if the
codename
is not found.
^ Added to #300917 (closed).
🐛 The process changes worker moves the package files in a way that will not support large files. Similar to the nuget bug we got a few weeks ago.
🔧 Solution: same as above, switch to the new service that updates the package files in a way that is optimized for large files📈 Code cleanup/refactoring. Debian package files are using a custom object storage path generator.
- This is not necessary and actually can be risky. That change is not compatible with the service that updates the package files.
^ Fixed (using ::Packages::UpdatePackageFileService
, and custom object storage path removed).
Edited by Mathieu Parent