Calculate total excess storage for a namespace
What does this MR do?
In order to enforce the new repository size rules allowing customers to purchase additional storage (as defined in the &4237), we need to be able to calculate the total excess repository size of all the projects in the root namespace. This MR introduces a method, total_repository_size_excess
, accomplishes through one query on the namespace.
As an example, here's the method used in a rails console locally, along with the raw query:
[19] pry(main)> root_namespace.total_repository_size_excess
ApplicationSetting Load (2.0ms) SELECT "application_settings".* FROM "application_settings" ORDER BY "application_settings"."id" DESC LIMIT $1 [["LIMIT", 1]]
(6.4ms) SELECT SUM(("project_statistics"."repository_size" - "projects"."repository_size_limit")) FROM "projects" INNER JOIN routes rs ON rs.source_id = projects.id AND rs.source_type = 'Project' INNER JOIN "project_statistics" ON "project_statistics"."project_id" = "projects"."id" WHERE (rs.path LIKE 'test-group-01/%') AND (projects.repository_size_limit != 0 AND project_statistics.repository_size > projects.repository_size_limit)
(1.2ms) SELECT SUM(("project_statistics"."repository_size" - 10485760000)) FROM "projects" INNER JOIN routes rs ON rs.source_id = projects.id AND rs.source_type = 'Project' INNER JOIN "project_statistics" ON "project_statistics"."project_id" = "projects"."id" WHERE (rs.path LIKE 'test-group-01/%') AND (projects.repository_size_limit IS NULL AND project_statistics.repository_size > 10485760000)
Database Review Notes
First query for projects with a set repository limit:
Formatted Query:
SELECT
SUM(project_statistics.repository_size - projects.repository_size_limit) AS total_excess
FROM
"projects"
INNER JOIN routes rs ON rs.source_id = projects.id
AND rs.source_type = 'Project'
INNER JOIN "project_statistics" ON "project_statistics"."project_id" = "projects"."id"
WHERE (rs.path LIKE 'gbobject/%')
AND projects.repository_size_limit != 0 AND project_statistics.repository_size > projects.repository_size_limit
https://explain.depesz.com/s/7ns
Explain Output:Aggregate (cost=5335.62..5335.63 rows=1 width=32)
-> Nested Loop (cost=1.43..5335.61 rows=1 width=16)
Join Filter: (project_statistics.repository_size > projects.repository_size_limit)
-> Nested Loop (cost=1.00..5332.16 rows=1 width=16)
-> Index Scan using index_routes_on_path_text_pattern_ops on routes rs (cost=0.56..3.59 rows=1554 width=4)
Index Cond: (((path)::text ~>=~ 'gbobject/'::text) AND ((path)::text ~<~ 'gbobject0'::text))
Filter: (((path)::text ~~ 'gbobject/% '::text) AND ((source_type)::text = 'Project'::text))
-> Index Scan using projects_pkey on projects (cost=0.43..3.43 rows=1 width=12)
Index Cond: (id = rs.source_id)
Filter: (repository_size_limit <> 0)
-> Index Scan using index_project_statistics_on_project_id on project_statistics (cost=0.43..3.44 rows=1 width=12)
Index Cond: (project_id = rs.source_id)
Summary:
First attempt:
Time: 10.580 s
- planning: 1.480 ms
- execution: 10.579 s
- I/O read: 10.438 s
- I/O write: 0.000 ms
Shared buffers:
- hits: 30270 (~236.50 MiB) from the buffer pool
- reads: 9127 (~71.30 MiB) from the OS file cache, including disk I/O
- dirtied: 186 (~1.50 MiB)
- writes: 0
Second attempt:
Time: 37.127 ms
- planning: 1.646 ms
- execution: 35.481 ms
- I/O read: 0.000 ms
- I/O write: 0.000 ms
Shared buffers:
- hits: 39397 (~307.80 MiB) from the buffer pool
- reads: 0 from the OS file cache, including disk I/O
- dirtied: 0
- writes: 0
Second query for projects that fall back to the namespace repository limit (example with 10 GB):
Formatted Query:
EXPLAIN SELECT
SUM(project_statistics.repository_size - 10485760000) AS total_excess
FROM
"projects"
INNER JOIN routes rs ON rs.source_id = projects.id
AND rs.source_type = ‘Project’
INNER JOIN "project_statistics" ON "project_statistics"."project_id" = "projects"."id"
WHERE (rs.path LIKE 'gbobject/% ')
AND projects.repository_size_limit IS NULL AND project_statistics.repository_size > 10485760000
https://explain.depesz.com/s/4rl3
Explain Output:Aggregate (cost=150.07..150.08 rows=1 width=32)
-> Nested Loop (cost=24.01..150.06 rows=1 width=8)
Join Filter: (rs.source_id = projects.id)
-> Hash Join (cost=23.57..146.65 rows=1 width=16)
Hash Cond: (project_statistics.project_id = rs.source_id)
-> Index Only Scan using index_project_statistics_on_repository_size_and_project_id on project_statistics (cost=0.56..115.02 rows=3286 width=12)
Index Cond: (repository_size > '10485760000'::bigint)
-> Hash (cost=3.59..3.59 rows=1554 width=4)
-> Index Scan using index_routes_on_path_text_pattern_ops on routes rs (cost=0.56..3.59 rows=1554 width=4)
Index Cond: (((path)::text ~>=~ 'gbobject/'::text) AND ((path)::text ~<~ 'gbobject0'::text))
Filter: (((path)::text ~~ 'gbobject/% '::text) AND ((source_type)::text = 'Project'::text))
-> Index Scan using projects_pkey on projects (cost=0.43..3.40 rows=1 width=4)
Index Cond: (id = project_statistics.project_id)
Filter: (repository_size_limit IS NULL)
Summary:
First attempt:
Time: 497.272 ms
- planning: 1.719 ms
- execution: 495.553 ms
- I/O read: 443.846 ms
- I/O write: 0.000 ms
Shared buffers:
- hits: 30275 (~236.50 MiB) from the buffer pool
- reads: 9130 (~71.30 MiB) from the OS file cache, including disk I/O
- dirtied: 186 (~1.50 MiB)
- writes: 0
Second attempt:
Time: 42.985 ms
- planning: 1.625 ms
- execution: 41.360 ms
- I/O read: 0.000 ms
- I/O write: 0.000 ms
Shared buffers:
- hits: 39402 (~307.80 MiB) from the buffer pool
- reads: 0 from the OS file cache, including disk I/O
- dirtied: 0
- writes: 0
Does this MR meet the acceptance criteria?
Conformity
- [-] Changelog entry
- [-] Documentation (if required)
-
Code review guidelines -
Merge request performance guidelines -
Style guides -
Database guides -
Separation of EE specific content
Availability and Testing
-
Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process. - [-] Tested in all supported browsers
- [-] Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
- [-] Label as security and @ mention
@gitlab-com/gl-security/appsec
- [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
- [-] Security reports checked/validated by a reviewer from the AppSec team
Relates to #255346