Use `#use_open_file` for NuGet metadata extraction [RUN ALL RSPEC] [RUN AS-IF-FOSS]
🐊 Context
When NuGet packages are uploaded, they can't be used right away. The uploaded archive (a zip one) has a generic filename package.nuget
and that's it. The GitLab package registry will need more information about that package to make it available for pulling (among other things, we need the package name and version).
That is why, NuGet packages are processed by background job that:
- Pull the archive from object storage (or the file system depending on the upload configuration)
- Open the archive and read the
.nuspec
file - Extract the needed metadata from that file
- Update the package model with the proper name and version
The above has been working well since the release of the NuGet package registry in 12.8
In #321361 (closed), it has been reported that big packages (500MB+) fail silently.
That ~bug can't be reproduced locally. We get a broken pipe error. The ~bug is reproducible on staging but the logs are quite surprising:
For some reason, the first job is killed and we don't get any log.
The second will end quickly but we think that this is because we're using #use_file
, which in turn uses an exclusive lock. That lock is still taken from the first job = the second job will simply end its execution without doing anything.
This is unexpected because the lock release is in an ensure
block. So, that block is not executed. The only way this could happen is that the job gets killed in a pretty raw way and it doesn't have time to execute those ensure
blocks. That "raw" shutdown could also explain why we don't have any "end" or "error" log in the first job.
Not being able to pinpoint the root cause of the ~bug, we are trying to use #use_open_file
which doesn't use an exclusive lock. That is okay for our needs because the worker doesn't update the file but merely read it to get data.
This change could potentially impact all the NuGet uploads, we choose to use a feature flag. We're not 100% confident that this change will fix the ~bug, the feature flag allows us to have some trials on staging first before enabling the change on production.
🤔 What does this MR do?
- Use
#use_open_file
instead of#use_file
when pulling the NuGet archive out of object storage (or the file system) - Introduce a feature flag for that change. Rollout issue: #331799 (closed)
- Update the relevant specs
- While at this, I update some of them to
- remove redundant
RSpec.
prefix - use a
let_it_be
var where possible
- remove redundant
- While at this, I update some of them to
💻 Screenshots (strongly suggested)
Given a project with the feature flag enabled:
Feature.enabled?(:packages_nuget_archive_new_file_reader, Project.find(397))
Project Load (0.9ms) SELECT "projects".* FROM "projects" WHERE "projects"."id" = 397 LIMIT 1 /*application:console,line:(pry):2:in `__pry__'*/
=> true
Uploading a nuget package still works: (using gl_pru)
$ bundle exec thor package:push --package-type=nuget --user=root --token="XXXX" --url=http://gdk.test:8000/api/v4/projects/397/packages/nuget/index.json --name=bananas --version=1.3.25
Creating credentials for root
Creating package name: bananas, version: 1.3.25
The template "Console Application" was created successfully.
Processing post-creation actions...
Running 'dotnet restore' on bananas/bananas.csproj...
Determining projects to restore...
Restored /Users/david/projects/gl_pru/pkg/bananas/bananas.csproj (in 66 ms).
Restore succeeded.
Package bananas, version: 1.3.25 created.
Uploading package bananas, version: 1.3.25 to http://gdk.test:8000/api/v4/projects/397/packages/nuget/index.json.
Microsoft (R) Build Engine version 16.8.0+126527ff1 for .NET
Copyright (C) Microsoft Corporation. All rights reserved.
Determining projects to restore...
Restored /Users/david/projects/gl_pru/pkg/bananas/bananas.csproj (in 70 ms).
bananas -> /Users/david/projects/gl_pru/pkg/bananas/bin/Release/netcoreapp3.0/bananas.dll
Successfully created package '/Users/david/projects/gl_pru/pkg/bananas/bin/Release/bananas.1.3.25.nupkg'.
warn : No API Key was provided and no API Key could be found for 'http://gdk.test:8000/api/v4/projects/397/packages/nuget'. To save an API Key for a source use the 'setApiKey' command.
Pushing bananas.1.3.25.nupkg to 'http://gdk.test:8000/api/v4/projects/397/packages/nuget'...
PUT http://gdk.test:8000/api/v4/projects/397/packages/nuget/
Created http://gdk.test:8000/api/v4/projects/397/packages/nuget/ 1687ms
Your package was pushed.
Upload done.
Removing credentials for root
All done.
The package is properly processed by the background worker:
2021-05-25_13:12:14.14830 rails-background-jobs : {"severity":"INFO","time":"2021-05-25T13:12:14.147Z","class":"Packages::Nuget::ExtractionWorker","args":["1608"],"retry":3,"queue":"package_repositories:packages_nuget_extraction","backtrace":true,"version":0,"queue_namespace":"package_repositories","jid":"5d5e762c3741f020b3fd3e29","created_at":"2021-05-25T13:12:14.146Z","meta.user":"root","meta.caller_id":"PUT /api/:version/projects/:id/packages/nuget","meta.remote_ip":"127.0.0.1","meta.feature_category":"package_registry","meta.client_id":"user/1","correlation_id":"01F6HTEYYBXBWFK09B9G1EQ637","enqueued_at":"2021-05-25T13:12:14.147Z","job_size_bytes":6,"pid":91931,"message":"Packages::Nuget::ExtractionWorker JID-5d5e762c3741f020b3fd3e29: start","job_status":"start","scheduling_latency_s":0.000897}
2021-05-25_13:12:14.72273 rails-background-jobs : {"severity":"INFO","time":"2021-05-25T13:12:14.722Z","class":"Packages::Nuget::ExtractionWorker","args":["1608"],"retry":3,"queue":"package_repositories:packages_nuget_extraction","backtrace":true,"version":0,"queue_namespace":"package_repositories","jid":"5d5e762c3741f020b3fd3e29","created_at":"2021-05-25T13:12:14.146Z","meta.user":"root","meta.caller_id":"PUT /api/:version/projects/:id/packages/nuget","meta.remote_ip":"127.0.0.1","meta.feature_category":"package_registry","meta.client_id":"user/1","correlation_id":"01F6HTEYYBXBWFK09B9G1EQ637","enqueued_at":"2021-05-25T13:12:14.147Z","job_size_bytes":6,"pid":91931,"message":"Packages::Nuget::ExtractionWorker JID-5d5e762c3741f020b3fd3e29: done: 0.574444 sec","job_status":"done","scheduling_latency_s":0.000897,"db_count":14,"db_write_count":5,"db_cached_count":3,"external_http_count":6,"external_http_duration_s":0.015799999935552478,"cpu_s":0.044034,"duration_s":0.574444,"completed_at":"2021-05-25T13:12:14.722Z","db_duration_s":0.010136}
The package is properly available on the UI:
Everything is still working properly
Does this MR meet the acceptance criteria?
📐 Conformity
-
I have included a changelog entry. (Does this MR need a changelog?) - [-] I have added/updated documentation, or it's not needed. (Is documentation required?)
-
I have properly separated EE content from FOSS, or this MR is FOSS only. (Where should EE code go?) - [-] I have added information for database reviewers in the MR description, or it's not needed. (Does this MR have database related changes?)
-
I have self-reviewed this MR per code review guidelines. -
This MR does not harm performance, or I have asked a reviewer to help assess the performance impact. (Merge request performance guidelines) -
I have followed the style guides.
Availability and Testing
- [-] I have added/updated tests following the Testing Guide, or it's not needed. (Consider all test levels. See the Test Planning Process.)
- [-] I have tested this MR in all supported browsers, or it's not needed.
- [-] I have informed the Infrastructure department of a default or new setting change per definition of done, or it's not needed.
Security
Does this MR contain changes to processing or storing of credentials or tokens, authorization and authentication methods or other items described in the security review guidelines? If not, then delete this Security section.
- [-] Label as security and @ mention
@gitlab-com/gl-security/appsec
- [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
- [-] Security reports checked/validated by a reviewer from the AppSec team