Manage packages_size statistic with a counter attribute
⏳ Context
In the usage quota page, one of the metrics taken into account is the size of the Package Registry, that is, how much storage each package is using.
This statistic is linked to the `size` attribute of `::Packages::PackageFile`.
On `master`, the statistic is updated in the following way:
1. Rails callbacks are registered so that, on `after_save` or `after_destroy`, we take action.
2. Those callbacks call `ProjectStatistics.increment_statistic` with the proper `amount` (which can be negative).
3. After chaining several functions, we end up using `.update_counters` from Rails.
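To make the "amount (which can be negative)" part concrete, here is a minimal plain-Ruby sketch. It is not the GitLab code: `size_delta_on_save` and `size_delta_on_destroy` are hypothetical helper names, and the sketch assumes the amount is simply the change in the file's `size`.

```ruby
# Hypothetical helpers, for illustration only (not GitLab's implementation).
# Assumption: the amount sent to the statistic is the change in file size.

# On save: the difference between the new and the previous size.
def size_delta_on_save(previous_size, new_size)
  new_size - (previous_size || 0)
end

# On destroy: the whole size is removed, so the amount is negative.
def size_delta_on_destroy(size)
  -size
end

packages_size = 0
packages_size += size_delta_on_save(nil, 8)  # a new 8-byte file is created
packages_size += size_delta_on_save(8, 20)   # the file grows to 20 bytes
packages_size += size_delta_on_destroy(20)   # the file is destroyed
```

With every change reported as a signed delta, the statistic returns to its starting value once the file is gone.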
Step 3 is done outside of a database transaction. @fabiopitino explained why:
The reason why we moved the increment in a separate transaction is because project_statistics is a highly contended table and before it was causing a lot of statement timeout errors on many concurrent updates. By moving it to a different transaction (after_commit) we separated the main transaction (add/remove model) from the side-effect (update statistics) and made the main transaction more resilient. See !20852 (merged) for context.
The problem is that, by moving the update outside of the transaction, we introduced a risk of race conditions.
Add lease to update project statistics row and ... (!97912 - merged) improved the monitoring around those statistics updates. Among other things, we now detect concurrent updates. Guess who is the main culprit here? Yes, `packages_size`.
Because of those concurrent updates, we are noticing a loss of accuracy in the usage quota page, where the `packages_size` metric is no longer the sum of all package file sizes. This is issue #363010 (closed).
🌬 The solution
The solution used in this MR is to avoid, or at least reduce, those concurrent updates. We already have a tool in place for this: `CounterAttribute`.
In short, a counter attribute will "stack" the counter updates in Redis and enqueue a job that will run in `10.minutes`. That job will "simply" flush the counter updates from Redis to the database.
This works because Redis gives us the means to guarantee that we never have concurrent updates.
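The "stack, then flush" idea can be sketched in plain Ruby. This is only an illustration under stated assumptions, not the actual `CounterAttribute` code: a Hash stands in for Redis, an Integer stands in for the database column, and `BufferedCounter` with all of its methods is a hypothetical name.

```ruby
# Plain-Ruby sketch of the "stack, then flush" idea. A Hash stands in for
# Redis and an Integer for the database column; all names are hypothetical.
class BufferedCounter
  attr_reader :db_value, :db_writes

  def initialize(db_value)
    @db_value = db_value   # stands in for project_statistics.packages_size
    @pending = Hash.new(0) # stands in for the Redis counter keys
    @db_writes = 0         # number of simulated UPDATE statements
  end

  # Stack an update: only the in-memory (Redis-like) counter changes.
  def increment(key, amount)
    @pending[key] += amount
  end

  # Read the delta that is still waiting to be flushed.
  def pending(key)
    @pending.fetch(key, 0)
  end

  # Flush: apply the accumulated delta with a single write, then clear it.
  def flush(key)
    delta = @pending.delete(key)
    return if delta.nil? || delta.zero?

    @db_value += delta
    @db_writes += 1
  end
end

counter = BufferedCounter.new(40)
5.times { counter.increment(:packages_size, -8) }
pending_before_flush = counter.pending(:packages_size) # => -40
counter.flush(:packages_size)
counter.db_value  # => 0
counter.db_writes # => 1
```

Five increments end up as a single simulated write, which is the property the MR relies on to lower the number of concurrent updates on the row.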
This MR is thus as simple as moving `packages_size` updates to a `CounterAttribute`.
🤔 What does this MR do and why?
- Declare `packages_size` in `ProjectStatistics` as a `counter_attribute`.
- Add feature flag support when updating `packages_size` so we can still decide if the update is sync (old approach) or async/delayed (new approach).
  - Rollout issue: #381287 (closed)
- Update/Create related specs.
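The feature flag switch can be pictured as a simple dispatch between the two paths. The sketch below is a hypothetical illustration, not the code from this MR: a boolean stands in for the feature flag check and two lambdas stand in for the sync and delayed update paths.

```ruby
# Hypothetical sketch: a boolean stands in for the feature flag check, and
# two lambdas stand in for the sync (direct UPDATE) and delayed (Redis) paths.
def update_packages_size(amount, flag_enabled:, sync:, delayed:)
  if flag_enabled
    delayed.call(amount) # new approach: buffer the amount, flush later
  else
    sync.call(amount)    # old approach: update the database right away
  end
end

calls = []
sync = ->(amount) { calls << [:sync, amount] }
delayed = ->(amount) { calls << [:delayed, amount] }

update_packages_size(-8, flag_enabled: false, sync: sync, delayed: delayed)
update_packages_size(-8, flag_enabled: true, sync: sync, delayed: delayed)
calls # => [[:sync, -8], [:delayed, -8]]
```

Keeping both paths behind one entry point is what makes it possible to fall back to the old behavior by disabling the flag.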
📺 Screenshots or screen recordings
None
⚗ How to set up and validate locally
- Have GDK ready with one project and a Personal Access Token.
- To keep things simple, we're going to use the generic package registry. With a terminal, let's create 5 packages:
$ curl --upload-file <dummy text file path> "http://<username>:<PAT>@gdk.test:8000/api/v4/projects/310/packages/generic/generic1/1.1.2/file.txt"
$ curl --upload-file <dummy text file path> "http://<username>:<PAT>@gdk.test:8000/api/v4/projects/310/packages/generic/generic2/1.1.2/file.txt"
$ curl --upload-file <dummy text file path> "http://<username>:<PAT>@gdk.test:8000/api/v4/projects/310/packages/generic/generic3/1.1.2/file.txt"
$ curl --upload-file <dummy text file path> "http://<username>:<PAT>@gdk.test:8000/api/v4/projects/310/packages/generic/generic4/1.1.2/file.txt"
$ curl --upload-file <dummy text file path> "http://<username>:<PAT>@gdk.test:8000/api/v4/projects/310/packages/generic/generic5/1.1.2/file.txt"
- Now check the project usage quota page: http://gdk.test:8000/<project full path>/-/usage_quotas
Everything is set up properly.
Let's have a run without the feature flag enabled.
- Delete all packages in a rails console:
Project.last.packages.destroy_all
- In the rails console, you should see these SQL queries:
ProjectStatistics Update All (1.1ms) UPDATE "project_statistics" SET "packages_size" = COALESCE("packages_size", 0) - 8, "storage_size" = COALESCE("storage_size", 0) - 8 WHERE "project_statistics"."id" = 307 /*application:console,db_config_name:main,console_hostname:worky.local,console_username:david,line:/app/models/concerns/counter_attribute.rb:135:in `block in update_counters_with_lease'*/
ProjectStatistics Update All (0.8ms) UPDATE "project_statistics" SET "packages_size" = COALESCE("packages_size", 0) - 8, "storage_size" = COALESCE("storage_size", 0) - 8 WHERE "project_statistics"."id" = 307 /*application:console,db_config_name:main,console_hostname:worky.local,console_username:david,line:/app/models/concerns/counter_attribute.rb:135:in `block in update_counters_with_lease'*/
ProjectStatistics Update All (0.5ms) UPDATE "project_statistics" SET "packages_size" = COALESCE("packages_size", 0) - 8, "storage_size" = COALESCE("storage_size", 0) - 8 WHERE "project_statistics"."id" = 307 /*application:console,db_config_name:main,console_hostname:worky.local,console_username:david,line:/app/models/concerns/counter_attribute.rb:135:in `block in update_counters_with_lease'*/
ProjectStatistics Update All (0.5ms) UPDATE "project_statistics" SET "packages_size" = COALESCE("packages_size", 0) - 8, "storage_size" = COALESCE("storage_size", 0) - 8 WHERE "project_statistics"."id" = 307 /*application:console,db_config_name:main,console_hostname:worky.local,console_username:david,line:/app/models/concerns/counter_attribute.rb:135:in `block in update_counters_with_lease'*/
ProjectStatistics Update All (0.5ms) UPDATE "project_statistics" SET "packages_size" = COALESCE("packages_size", 0) - 8, "storage_size" = COALESCE("storage_size", 0) - 8 WHERE "project_statistics"."id" = 307 /*application:console,db_config_name:main,console_hostname:worky.local,console_username:david,line:/app/models/concerns/counter_attribute.rb:135:in `block in update_counters_with_lease'*/
- Check the usage quota page again: it's down to `0 bytes`.
Ok, those are the "synchronous" `packages_size` updates.
Let's re-upload 5 packages (step 2 from our setup above). Let's enable the feature flag now:
Feature.enable(:packages_size_counter_attribute)
- (Make sure that you have background jobs running!)
- Delete all packages in a rails console:
Project.last.packages.destroy_all
- This time around, no SQL updates on `project_statistics`.
- While we wait for the flush job to kick in (10 minutes), we can check that we have the updates in Redis:
ps = Project.last.statistics
key = ps.counter_key(:packages_size)
Gitlab::Redis::SharedState.with { |r| r.get(key) }
=> "-40"
  - This means that our `-40` update to `packages_size` is waiting for the job. Note that while in this waiting state, any update on `packages_size` will affect this Redis key (e.g. if we upload a file of `100` bytes, that key will get updated to `60` (`-40 + 100`)).
- Also while waiting, check the usage quota page. It still shows `40 bytes`.
- (After waiting 10 minutes) The job runs and `packages_size` is updated accordingly. The Redis key is gone (`nil`) and the usage quota page is updated accordingly.
  - Even if this is a very small example, we just combined `5` `UPDATE` statements into a single one = fewer chances of concurrent updates.
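The numbers in this walkthrough can be double-checked with a few lines of plain Ruby. This is only worked arithmetic using the values from the steps above (five 8-byte files, a hypothetical 100-byte upload); it does not talk to Redis or the database, and the variable names are illustrative.

```ruby
# Worked arithmetic for the walkthrough above (plain Ruby, no Redis).
deletions = Array.new(5, -8)   # five package files of 8 bytes each destroyed
pending = deletions.sum        # => -40, the value read from the Redis key

upload = 100                   # a 100-byte file uploaded before the flush
pending_after_upload = pending + upload # => 60

shown_before_flush = 40        # the page still shows the database value
shown_after_flush = shown_before_flush + pending # => 0 (no extra upload)
```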
Async `packages_size` updates are working properly!
🚦 MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.