Disable sticky writes in the PAT last used service
🔭 Context
In #462379 (closed), we noticed that a PyPI Repository API endpoint was triggering a very high number of queries on the primary database.
It turned out to be an N+1 situation. This problem is being handled in !153444 (merged).
During further discussion, we questioned why the primary was used at all in a GET API endpoint.
On closer inspection, we found that when personal access tokens are used, read queries would sometimes hit the primary. According to the backend logs, authenticating with a personal access token triggers an UPDATE statement on the last_used_at column, and that write makes the session stick to the primary.
This doesn't seem right. For a GET API request (a read-only request), we should strive to read from replicas unless a specific situation requires reading from the primary.
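To make the sticking behaviour concrete, here is a minimal toy model in Ruby. It is only a sketch: `ToySession` and its methods are hypothetical names invented for illustration and do not reproduce the real load balancing code. It shows how a single write can flip all later reads in a session to the primary, and how wrapping the write in a "no sticky writes" block avoids that.

```ruby
# Toy model of a load-balancing session. All names are hypothetical and
# exist only to illustrate the idea of "sticky writes".
class ToySession
  def initialize
    @wrote = false
    @ignore_writes = false
  end

  # Once a write has been recorded, the session sticks to the primary.
  def use_primary?
    @wrote
  end

  # Record a write, unless writes are currently being ignored.
  def write!
    @wrote = true unless @ignore_writes
  end

  # Run the block so that writes inside it do not make the session stick.
  def without_sticky_writes
    previous = @ignore_writes
    @ignore_writes = true
    yield
  ensure
    @ignore_writes = previous
  end

  # Where a read would be routed right now.
  def read_target
    use_primary? ? :primary : :replica
  end
end

sticky = ToySession.new
sticky.write!             # e.g. the UPDATE on last_used_at
puts sticky.read_target   # prints "primary"

relaxed = ToySession.new
relaxed.without_sticky_writes { relaxed.write! }
puts relaxed.read_target  # prints "replica"
```

In this model the write itself still happens in both cases; only its effect on where later reads are routed changes.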
This MR implements this change with a feature flag. Rollout issue: [Feature flag] Rollout of `disable_sticky_write... (#462823 - closed).
🤔 What does this MR do and why?
- Update app/services/personal_access_tokens/last_used_service.rb to not have sticky writes.
- Update the related specs.
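The shape of the change can be sketched as follows. This is a hedged illustration, not the actual service: `FakeToken`, `FakeSession`, `LastUsedSketch`, and the `flag_enabled:` argument are hypothetical stand-ins for the token model, the load balancing session, the service, and the feature flag check.

```ruby
# Hypothetical stand-ins; none of these are GitLab classes.
FakeToken = Struct.new(:last_used_at)

class FakeSession
  attr_reader :stuck

  def initialize
    @stuck = false
    @ignoring = false
  end

  # A write makes the session stick, unless writes are being ignored.
  def write!
    @stuck = true unless @ignoring
  end

  def without_sticky_writes
    @ignoring = true
    yield
  ensure
    @ignoring = false
  end
end

class LastUsedSketch
  def initialize(token, session, flag_enabled:)
    @token = token
    @session = session
    @flag_enabled = flag_enabled
  end

  # With the flag on, the update runs without making the session sticky.
  # With the flag off, the previous (sticky) behaviour is kept.
  def execute
    if @flag_enabled
      @session.without_sticky_writes { touch }
    else
      touch
    end
  end

  private

  def touch
    @token.last_used_at = Time.now
    @session.write!
  end
end

session = FakeSession.new
LastUsedSketch.new(FakeToken.new, session, flag_enabled: true).execute
puts session.stuck # prints "false"
```

Keeping the old branch behind the flag means the behaviour can be rolled back without a deploy if the rollout surfaces a problem.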
🤔 MR acceptance checklist
Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
🌈 Screenshots or screen recordings
⚙ How to set up and validate locally
We're going to use the generic package registry to illustrate the situation.
- Create a project.
- Have a dummy.txt file and upload it:
  curl --header "PRIVATE-TOKEN: <PAT>" --upload-file ./dummy.txt "http://gdk.test:8000/api/v4/projects/<project_id>/packages/generic/my_package/0.0.1/file.txt"
- In generic_packages.rb, before this line, put this statement:
  Rails.logger.debug("Use primary?: #{Gitlab::Database::LoadBalancing::Session.current.use_primary?}")
Let's pull the file and see the log.
1️⃣ On master
$ curl --header "PRIVATE-TOKEN: <PAT>" "http://gdk.test:8000/api/v4/projects/<project_id>/packages/generic/my_package/0.0.1/file.txt"
Check log/development.log and you will see:
[...]
PersonalAccessToken Update (0.2ms) UPDATE "personal_access_tokens" SET "last_used_at" = '2024-05-21 13:50:40.013376' WHERE "personal_access_tokens"."id" = 74 /*application:web,correlation_id:01HYDPP3JSF4XGTZ4QVWS5025S,endpoint_id:GET /api/:version/projects/:id/packages/generic/:package_namepackage_version/:file_name,db_config_name:main,line:/app/services/personal_access_tokens/last_used_service.rb:15:in `block in execute'*/
[...]
Use primary?: true
2️⃣ With this MR
Let's reset the last_used_at of the PAT, since it is only updated once every 10 minutes. In a Rails console:
PersonalAccessToken.find(<PAT id>).update(last_used_at: 3.weeks.ago)
Enable the feature flag: Feature.enable(:disable_sticky_writes_for_pat_last_used)
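The reset is needed because the last_used_at update is throttled. A minimal sketch of such a check is shown below. The 10-minute interval comes from the text above, but the method name and the exact comparison are assumptions made for illustration, not the real implementation.

```ruby
# Hypothetical sketch of a throttled last_used_at update check.
# Only the 10-minute interval is taken from the description above.
UPDATE_INTERVAL_SECONDS = 10 * 60

def last_used_update_due?(last_used_at, now: Time.now)
  # A token that has never been used always needs an update.
  return true if last_used_at.nil?

  (now - last_used_at) >= UPDATE_INTERVAL_SECONDS
end

now = Time.now
puts last_used_update_due?(now - 60, now: now)                # prints "false"
puts last_used_update_due?(now - 3 * 7 * 24 * 3600, now: now) # prints "true"
```

Setting last_used_at to three weeks ago puts the token well past the interval, so the next request is guaranteed to issue the UPDATE.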
Let's pull the file again:
$ curl --header "PRIVATE-TOKEN: <PAT>" "http://gdk.test:8000/api/v4/projects/<project_id>/packages/generic/my_package/0.0.1/file.txt"
and check the log:
[...]
PersonalAccessToken Update (0.2ms) UPDATE "personal_access_tokens" SET "last_used_at" = '2024-05-21 13:50:40.013376' WHERE "personal_access_tokens"."id" = 74 /*application:web,correlation_id:01HYDPP3JSF4XGTZ4QVWS5025S,endpoint_id:GET /api/:version/projects/:id/packages/generic/:package_namepackage_version/:file_name,db_config_name:main,line:/app/services/personal_access_tokens/last_used_service.rb:15:in `block in execute'*/
[...]
Use primary?: false
Success: the update of the last_used_at column didn't trigger the sticky write.