Optimize the package finder helper when dealing with deploy tokens [RUN ALL RSPEC] [RUN AS-IF-FOSS]
🍌 Context
The maven package finder is heavily used by the maven package registry.
We noticed that, at the group level, some requests had very slow response times.
Our analysis found an issue when the user authenticates with the maven package registry using a deploy token.
In short, Active Record seems to struggle to merge two scopes with identical conditions, and we end up with a SQL query similar to this one:
WITH "maven_metadata_by_path" AS (
SELECT "packages_maven_metadata"."id",
"packages_maven_metadata"."package_id"
FROM "packages_maven_metadata"
WHERE "packages_maven_metadata"."path" = 'gl/pru/maven_pkg_01_02_03_04_05_06/0.7.4'
) SELECT "packages_packages".*
FROM "packages_packages"
INNER JOIN maven_metadata_by_path
ON maven_metadata_by_path.package_id=packages_packages.id
WHERE "packages_packages"."project_id" IN (
SELECT "projects"."id"
FROM "projects"
WHERE "projects"."namespace_id" IN (
SELECT "id"
FROM (
SELECT "namespaces".*
FROM "namespaces"
INNER JOIN (
SELECT "id",
"depth"
FROM (
WITH RECURSIVE "base_and_descendants" AS ((SELECT "namespaces".* FROM "namespaces" WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" = 252) UNION (SELECT "namespaces".* FROM "namespaces", "base_and_descendants" WHERE "namespaces"."type" = 'Group' AND "namespaces"."parent_id" = "base_and_descendants"."id")) SELECT DISTINCT "namespaces".*,
ROW_NUMBER() OVER () AS depth
FROM "base_and_descendants" AS "namespaces"
) AS "namespaces"
WHERE "namespaces"."type" = 'Group'
) namespaces_join_table
ON namespaces_join_table.id = namespaces.id
WHERE "namespaces"."type" = 'Group'
ORDER BY "namespaces_join_table"."depth" ASC
) AS "namespaces"
WHERE "namespaces"."type" = 'Group'
)
AND "projects"."namespace_id" IN (
SELECT id
FROM (
SELECT "namespaces".*
FROM "namespaces"
INNER JOIN (
SELECT "id",
"depth"
FROM (
WITH RECURSIVE "base_and_descendants" AS ((SELECT "namespaces".* FROM "namespaces" WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" = 252) UNION (SELECT "namespaces".* FROM "namespaces", "base_and_descendants" WHERE "namespaces"."type" = 'Group' AND "namespaces"."parent_id" = "base_and_descendants"."id")) SELECT DISTINCT "namespaces".*,
ROW_NUMBER() OVER () AS depth
FROM "base_and_descendants" AS "namespaces"
) AS "namespaces"
WHERE "namespaces"."type" = 'Group'
) namespaces_join_table
ON namespaces_join_table.id = namespaces.id
WHERE "namespaces"."type" = 'Group'
ORDER BY "namespaces_join_table"."depth" ASC
) AS "namespaces"
WHERE "namespaces"."type" = 'Group'
)
)
ORDER BY "packages_packages"."id" DESC
LIMIT 1
If you look closely, the condition on "projects"."namespace_id" is duplicated: the same subquery over the descendants of group 252 appears twice.
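To make the duplication concrete, here is a minimal toy sketch in plain Ruby. It is not how Active Record represents or merges relations: `ToyScope`, `merge`, `merge_dedup` and `to_where` are hypothetical names introduced only for illustration. It only shows how the same condition, contributed by two scopes, can end up twice in the final WHERE clause when nothing deduplicates it.

```ruby
# Toy model only: a "scope" is just a list of SQL condition strings.
# This is NOT how Active Record represents or merges relations.
ToyScope = Struct.new(:conditions) do
  # Naive merge: concatenate both condition lists, keeping repeats.
  def merge(other)
    ToyScope.new(conditions + other.conditions)
  end

  # Merge that drops conditions that are textually identical.
  def merge_dedup(other)
    ToyScope.new((conditions + other.conditions).uniq)
  end

  # Render the conditions as the body of a WHERE clause.
  def to_where
    conditions.join(" AND ")
  end
end

# Placeholder for the large namespaces subquery shown above.
in_group = '"projects"."namespace_id" IN (/* descendants of group 252 */)'

from_group   = ToyScope.new([in_group]) # projects that belong to the group
from_visible = ToyScope.new([in_group]) # projects visible to the deploy token

naive   = from_group.merge(from_visible)
deduped = from_group.merge_dedup(from_visible)

puts naive.to_where   # the condition is repeated
puts deduped.to_where # the condition appears once
```

In the query above, each repeated condition carries its own recursive CTE, so the database may evaluate the same group hierarchy twice.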
This MR is part of the improvements described in issue #325869 (closed).
🔬 What does this MR do?
- Simplify some scopes by using Namespace#all_projects.
  - This does not change the generated SQL query. It merely makes the code more readable.
- When a deploy token is used, update the function that returns the projects available to the user by using DeployToken#accessible_projects.
  - The current code checks the projects using a minimum role, but this logic can't be applied to deploy tokens as they are linked to groups directly. There is no notion of minimum role.
  - Actually, we're using the exact same function that Project.public_or_visible_to_user uses.
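The distinction between the two kinds of actors can be sketched in plain Ruby. This is only an illustration under stated assumptions, not the finder's actual code: `ToyProject`, `ToyDeployToken`, `ToyUser`, `toy_available_projects` and `TOY_REPORTER` are hypothetical stand-ins, and arrays replace the real Active Record scopes.

```ruby
# Hypothetical stand-ins; the real classes live in the GitLab codebase
# and return Active Record relations, not arrays.
ToyProject     = Struct.new(:id, :name)
ToyDeployToken = Struct.new(:accessible_projects) # no roles, only linked projects
ToyUser        = Struct.new(:access_levels)       # project id => numeric access level

TOY_REPORTER = 20 # assumed numeric access level, for illustration only

# Returns the projects of the group that the actor may read packages from.
def toy_available_projects(actor, group_projects, min_access: TOY_REPORTER)
  case actor
  when ToyDeployToken
    # A deploy token has no role: keep only the projects it is linked to.
    group_projects & actor.accessible_projects
  when ToyUser
    # A user is filtered by a minimum role on each project.
    group_projects.select do |project|
      actor.access_levels.fetch(project.id, 0) >= min_access
    end
  else
    []
  end
end

app  = ToyProject.new(1, "app")
lib  = ToyProject.new(2, "lib")
docs = ToyProject.new(3, "docs")
group_projects = [app, lib, docs]

token  = ToyDeployToken.new([lib, docs])
levels = { 1 => 30, 2 => 10 }
user   = ToyUser.new(levels)

token_names = toy_available_projects(token, group_projects).map(&:name)
user_names  = toy_available_projects(user, group_projects).map(&:name)

puts token_names.inspect
puts user_names.inspect
```

The point of the sketch is that the deploy token branch never consults a role, which mirrors the reasoning in the list above.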
As the maven package registry is one of the most used registries, this change is behind a feature flag as an additional safety net. Here is the tracking issue: #326808 (closed)
🖼 Screenshots (strongly suggested)
n / a
📏 Does this MR meet the acceptance criteria?
Conformity
- 📋 Does this MR need a changelog?
  - I have included a changelog entry.
  - I have not included a changelog entry because it is behind a feature flag.
- [-] Documentation (if required)
- Code review guidelines
- Merge request performance guidelines
- Style guides
- Database guides
- [-] Separation of EE specific content
Availability and Testing
- [-] Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process.
- [-] Tested in all supported browsers
- [-] Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
- [-] Label as security and @ mention @gitlab-com/gl-security/appsec
- [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
- [-] Security reports checked/validated by a reviewer from the AppSec team
💽 Database review
For the explain plans of this MR, we're going to assume that these feature flags are enabled:
use_distinct_for_all_object_hierarchy
maven_metadata_by_path_with_optimization_fence
maven_packages_group_level_improvements
Those are past improvements that affect the SQL queries generated by the maven package finder. They are all currently enabled on gitlab.com.
See the notes for the database review: