CreatePackageService: remove `safe_find_or_create_by!` call
🌵 Context
In #338346 (closed), it has been discovered that create PG sub transactions can have a non trivial ~performance hit.
One of the source of those sub transactions is safe_find_or_create_by!
.
As such, #338947 (closed) has been opened to remove such calls from the Packages::Package
model used in the Package code area.
According to this comment, the only places where we use safe_find_or_create_by!
are:
app/services/packages/create_package_service.rb:11
app/services/packages/rubygems/create_dependencies_service.rb:24
app/services/packages/rubygems/metadata_extraction_service.rb:52
We will put app/services/packages/rubygems/*
to a side as this part is behind a feature flag and is not currently enabled on gitlab.com.
This MR will focuse on app/services/packages/create_package_service.rb:11
. The related function find_or_create_package!
is used in :
git grep -n -E 'find_or_create_package!'
app/services/packages/composer/create_package_service.rb:30: find_or_create_package!(:composer, name: package_name, version: package_version)
app/services/packages/create_package_service.rb:7: def find_or_create_package!(package_type, name: params[:name], version: params[:version])
app/services/packages/debian/find_or_create_incoming_service.rb:7: find_or_create_package!(:debian, name: 'incoming', version: nil)
app/services/packages/generic/find_or_create_package_service.rb:7: find_or_create_package!(::Packages::Package.package_types['generic'])
app/services/packages/pypi/create_package_service.rb:31: find_or_create_package!(:pypi)
So basically, packages composer
, generic
and pypi
. We will leave debian
to a side as it is also behind a feature flag and currently not enabled on gitlab.com.
💡 Solution
We're going to replace safe_find_or_create_by!
with find_or_create_by!
.
Why we can do so?
First, understand that find_or_create_by!
is not atomic which means that we could end up in two processes trying to insert the exact same package row.
To have this race condition issue, it would mean that two processes are uploading the same package with the same name, version and package to the same project. That situation is pretty rare.
Let's say that we have such conditions and two identical uploads hit the GitLab package registry. We will still run the rails model validation and as you can see, we have a model uniqueness constraint enforced using the name
, version
, package_type
and project_id
for composer
, generic
and pypi
packages. As such, one of those two uploads will get rejected (probably with a 400 Bad Request
response) and that is a reasonable outcome because the registry state is left in a coherent state.
🔍 What does this MR do?
- Replace
safe_find_or_create_by!
call withfind_or_create_by!
inapp/services/packages/create_package_service.rb
We did not update any rspec as safe_find_or_create_by!
and find_or_create_by!
has the same behavior from a caller point of view. That is also why, we don't use any feature flag here even though this change has an impact on 3 different registries.
🖼 Screenshots or Screencasts (strongly suggested)
See the next section
⚙ How to setup and validate locally (strongly suggested)
We're going to use https://gitlab.com/10io/gl_pru to quickly upload packages to a GitLab installation.
Have:
- a project
- personal access token
curl
-
python
,pip
andtwine
ready
With generic package
$ echo "bananas" > dummy.txt
$ curl --header "PRIVATE-TOKEN: <pat_token>" \
--upload-file ./dummy.txt \
"<gitlab_base_url>/api/v4/projects/<project_id>/packages/generic/bananas/1.2.3/file1.txt"
{"message":"201 Created"}
$ curl --header "PRIVATE-TOKEN: <pat_token>" \
--upload-file ./dummy.txt \
"<gitlab_base_url>/api/v4/projects/<project_id>/packages/generic/bananas/1.2.3/file2.txt"
{"message":"201 Created"}
Check the package registry at: <gitlab_base_url>/<project_path>/-/packages
. We have a single generic package named bananas
1.2.3
and it contains two files: file1.txt
and file2.txt
.
With composer package
- Checkout the project locally
- Add
composer.json
file to the root of the project with this content:{ "name": "<namespace_path>/bananas", "description": "Library XY", "type": "library", "license": "GPL-3.0-only", "authors": [ { "name": "John Doe", "email": "john@example.com" } ], "require": {} }
- Commit the changes
- Create a git tag:
$ git tag v1.2.4
- Push it:
$ git push origin v1.2.4
- Create the composer package:
$ curl --data tag=v1.2.4 "http://__token__:<pat_token>@<gitlab_base_url>/api/v4/projects/<project_id>/packages/composer" {"message":"201 Created"}
- Delete the git tag:
$ git tag -d v1.2.4
- Push the removal:
$ git push --delete origin v1.2.4
- Do some changes and commit them
- Create the tag again:
$ git tag v1.2.4
- Push it:
$ git push origin v1.2.4
- Recreate the same composer package:
$ curl --data tag=v1.2.4 "http://__token__:<pat_token>@<gitlab_base_url>/api/v4/projects/<project_id>/packages/composer" {"message":"201 Created"}
- Recreate again the same composer package
$ curl --data tag=v1.2.4 "http://__token__:<pat_token>@<gitlab_base_url>/api/v4/projects/<project_id>/packages/composer" {"message":"201 Created"}
Check the package registry at: <gitlab_base_url>/<project_path>/-/packages
. We have a single composer package named <namespace_path/root>/bananas
.
With pypi package
$ bundle exec thor package:push --package-type=pypi --user=root --token=XXX --url=<gitlab_base_url>/api/v4/projects/<project_id>/packages/pypi --name=bananas --version=2.3.7
This uploads succeeds
$ bundle exec thor package:push --package-type=pypi --user=root --token=XXX --url=<gitlab_base_url>/api/v4/projects/<project_id>/packages/pypi --name=bananas --version=2.3.7
This upload is rejected as the PyPI registry doesn't allow the same package to be uploaded twice.
Check the package registry at: <gitlab_base_url>/<project_path>/-/packages
. We have a single pypi package named bananas
2.3.7
.
📏 Does this MR meet the acceptance criteria?
Conformity
-
I have included changelog trailers, or none are needed. (Does this MR need a changelog?) - [-] I have added/updated documentation, or it's not needed. (Is documentation required?)
- [-] I have properly separated EE content from FOSS, or this MR is FOSS only. (Where should EE code go?)
- [-] I have added information for database reviewers in the MR description, or it's not needed. (Does this MR have database related changes?)
-
I have self-reviewed this MR per code review guidelines. -
This MR does not harm performance, or I have asked a reviewer to help assess the performance impact. (Merge request performance guidelines) -
I have followed the style guides. -
This change is backwards compatible across updates, or this does not apply.
Availability and Testing
-
I have added/updated tests following the Testing Guide, or it's not needed. (Consider all test levels. See the Test Planning Process.) - [-] I have tested this MR in all supported browsers, or it's not needed.
- [-] I have informed the Infrastructure department of a default or new setting change per definition of done, or it's not needed.
Security
Does this MR contain changes to processing or storing of credentials or tokens, authorization and authentication methods or other items described in the security review guidelines? If not, then delete this Security section.
- [-] Label as security and @ mention
@gitlab-com/gl-security/appsec
- [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
- [-] Security reports checked/validated by a reviewer from the AppSec team