Skip to content

Improve GQL query for fetching groups and projects in token allowlist

Mireya Andres requested to merge job-token-search-limitation-v2 into master

What does this MR do and why?

Implements #473747 (closed)

Problem

When adding a group or project to the allowlist for CI_JOB_TOKEN, the search is limited to showing up to 4 items in the results.

This means that some groups/projects will not show up if there are more than 4 groups/projects with the same name or path. In some cases, it becomes impossible to add a parent subgroup, even when typing the full path of the parent group.

Proposed Solution

The previous implementation of this issue was reverted since the query resulted in 503 errors (see #473747 (comment 2076629688)). This is likely due to the inclusion of searchNamespaces: true, making the query expensive to run. See the query plan below, where the query takes around 78s to execute.

In this MR, we removed searchNamespaces: true and instead sorted by id and increased the limit of the search results from 4 to 10. sort: id_asc will help prioritize groups that were created first. It is more likely that groups with shorter full paths will show up in the results this way, though this will not be guaranteed.

Generated SQL

Current SQL

For groups: https://console.postgres.ai/gitlab/gitlab-production-main/sessions/31176/commands/96847

Time: 5.681 s
  - planning: 6.066 ms
  - execution: 5.675 s
    - I/O read: 5.230 s
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 8689 (~67.90 MiB) from the buffer pool
  - reads: 8100 (~63.30 MiB) from the OS file cache, including disk I/O
  - dirtied: 0
  - writes: 0
SELECT
    "namespaces"."id"
FROM
    "namespaces"
WHERE
    "namespaces"."type" = 'Group'
    AND "namespaces"."type" != 'Project'
    AND ("namespaces"."path" ILIKE '%$1%'
        OR "namespaces"."name" ILIKE '%$1%')
LIMIT 4;

For projects: https://console.postgres.ai/gitlab/gitlab-production-main/sessions/31176/commands/96848

Time: 15.077 ms
  - planning: 8.889 ms
  - execution: 6.188 ms
    - I/O read: 5.555 ms
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 0 from the buffer pool
  - reads: 8 (~64.00 KiB) from the OS file cache, including disk I/O
  - dirtied: 1 (~8.00 KiB)
  - writes: 0
SELECT
    "projects"."id"
FROM
    "projects"
WHERE (("projects"."path" ILIKE '%$1%'
        OR "projects"."name" ILIKE '%$1%')
    OR "projects"."description" ILIKE '%$1%')
LIMIT 4;
Proposed Changes

(Limit increased to 10, sorted by id_asc)

For groups: https://console.postgres.ai/gitlab/gitlab-production-main/sessions/31176/commands/96849

Time: 1.101 s
  - planning: 7.996 ms
  - execution: 1.093 s
    - I/O read: 11.767 ms
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 16793 (~131.20 MiB) from the buffer pool
  - reads: 589 (~4.60 MiB) from the OS file cache, including disk I/O
  - dirtied: 5 (~40.00 KiB)
  - writes: 0
SELECT
    "namespaces"."id"
FROM
    "namespaces"
WHERE
    "namespaces"."type" = 'Group'
    AND "namespaces"."type" != 'Project'
    AND ("namespaces"."path" ILIKE '%$1%'
        OR "namespaces"."name" ILIKE '%$1%')
ORDER BY
    "namespaces"."id" ASC
LIMIT 10;

For projects: https://console.postgres.ai/gitlab/gitlab-production-main/sessions/31176/commands/96850

Time: 10.706 ms
  - planning: 9.157 ms
  - execution: 1.549 ms
    - I/O read: 0.000 ms
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 492 (~3.80 MiB) from the buffer pool
  - reads: 0 from the OS file cache, including disk I/O
  - dirtied: 0
  - writes: 0
SELECT
    "projects"."id"
FROM
    "projects"
WHERE (("projects"."path" ILIKE '%$1%'
        OR "projects"."name" ILIKE '%$1%')
    OR "projects"."description" ILIKE '%$1%')
ORDER BY
    "projects"."id" ASC
LIMIT 10;
Reverted Changes

(Sorted by id_asc; projects have searchNamespaces: true)

For groups: https://console.postgres.ai/gitlab/gitlab-production-main/sessions/31176/commands/96853

Time: 1.086 min
  - planning: 6.214 ms
  - execution: 1.086 min
    - I/O read: 1.068 min
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 17260 (~134.80 MiB) from the buffer pool
  - reads: 99321 (~775.90 MiB) from the OS file cache, including disk I/O
  - dirtied: 1080 (~8.40 MiB)
  - writes: 0
SELECT
    "namespaces"."id"
FROM
    "namespaces"
WHERE
    "namespaces"."type" = 'Group'
    AND "namespaces"."type" != 'Project'
    AND ("namespaces"."path" ILIKE '%$1%'
        OR "namespaces"."name" ILIKE '%$1%')
ORDER BY
    "namespaces"."id" ASC
LIMIT 4;

For projects: https://console.postgres.ai/gitlab/gitlab-production-main/sessions/31176/commands/96860

Project.select('id').search("gitlab", include_namespace: true).order_by("id_asc").limit(4).to_sql
Time: 78.163 ms
  - planning: 14.152 ms
  - execution: 64.011 ms
    - I/O read: 62.720 ms
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 107 (~856.00 KiB) from the buffer pool
  - reads: 45 (~360.00 KiB) from the OS file cache, including disk I/O
  - dirtied: 0
  - writes: 0
SELECT
    "projects"."id"
FROM
    "projects"
    INNER JOIN "routes" ON "routes"."source_type" = 'Project'
        AND "routes"."source_id" = "projects"."id"
WHERE (("routes"."path" ILIKE '%$1%'
        OR "routes"."name" ILIKE '%$1%')
    OR "projects"."description" ILIKE '%$1%')
ORDER BY
    "projects"."id" ASC
LIMIT 4;

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Screenshots or screen recordings

Search Term Before After
cookie before_cookie after_cookie
dessert before_dessert after_dessert
dessert/dessert before_dessert_dessert after_dessert_dessert

|

How to set up and validate locally

  1. Create more than 4 subgroups, each one nested by the previous. Create one group for each subgroup, then a subgroup for each subgroup. Make sure the subgroups have the same name, and the projects also have the same name. You should have a setup like the following:
    • group/subgroup1/project
    • group/subgroup1/subgroup/project
    • group/subgroup1/subgroup/subgroup/project
    • group/subgroup1/subgroup/subgroup/subgroup/project
    • group/subgroup1/subgroup/subgroup/subgroup/subgroup/project
  2. Go to the CI/CD Settings and expand Job token permissions
  3. Click Add group or project and try to search for the project name, project path, subgroup name, and subgroup path. You should have better search results than before the GQL was updated.
Edited by Mireya Andres

Merge request reports

Loading