Skip to content

Put exact matches first when searching for groups to transfer

Stan Hu requested to merge sh-group-transfer-exact-matches into master

What does this MR do and why?

When transferring a group to another group, previously Groups::AcceptingGroupTransfersFinder#execute would return a list of accessible groups ordered by the group's path. Suppose you had a number of groups with the paths:

  1. gitlab-org
  2. gitlab-org/a/1
  3. gitlab-org/1

Groups 2 and 3 would always be prioritized ahead of 1 since the ORDER BY path would put them ahead. When there are thousands of groups, that makes it almost impossible to locate the exact match, gitlab-org.

This commit introduces the ability for fuzzy_search to return items that match exactly first when the exact_matches_first flag is enabled. Groups::AcceptingGroupTransfersFinder#execute uses this parameter selectively for now. We might consider turning this on for all searches.

The exact_matches_first_group_transfer feature flag controls the rollout of this change.

The query plan does not seem to be affected because when a search parameter is available, the fuzzy_search ORDER BY is used instead of the default ORDER BY path ASC, id ASC query.

Relates to #423958 (closed)

SQL statements

Before

When entering gitlab-org for https://gitlab.com/groups/fulfillment-group/-/edit:

https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/21897/commands/70929

Time: 520.435 ms  
  - planning: 6.344 ms  
  - execution: 514.091 ms  
    - I/O read: 0.000 ms  
    - I/O write: 0.000 ms  
  
Shared buffers:  
  - hits: 24208 (~189.10 MiB) from the buffer pool  
  - reads: 0 from the OS file cache, including disk I/O  
  - dirtied: 0  
  - writes: 0  
SELECT "namespaces".*
FROM (
        (WITH "descendants_base_cte" AS MATERIALIZED
           (SELECT "namespaces"."id", "namespaces"."traversal_ids"
            FROM "namespaces"
            INNER JOIN "members" ON "namespaces"."id" = "members"."source_id"
            WHERE "members"."type" = 'GroupMember'
              AND "members"."source_type" = 'Namespace'
              AND "namespaces"."type" = 'Group'
              AND "members"."user_id" = 64248
              AND "members"."requested_at" IS NULL
              AND (access_level >= 10)
              AND (members.access_level >= 50)),
              "superset" AS
           (SELECT d1.traversal_ids
            FROM descendants_base_cte d1
            WHERE NOT EXISTS
                (SELECT 1
                 FROM descendants_base_cte d2
                 WHERE d2.id = ANY(d1.traversal_ids)
                   AND d2.id <> d1.id ) ) SELECT DISTINCT "namespaces".*
         FROM "superset",
              "namespaces"
         WHERE "namespaces"."type" = 'Group'
           AND next_traversal_ids_sibling("superset"."traversal_ids") > "namespaces"."traversal_ids"
           AND "superset"."traversal_ids" <= "namespaces"."traversal_ids"
           AND "namespaces"."id" NOT IN (59805804,
                                         52306274,
                                         58082419,
                                         57662085,
                                         13703656,
                                         13703648,
                                         57704811,
                                         57705888,
                                         57705942,
                                         57704865,
                                         58073239,
                                         57696357,
                                         57706361,
                                         10604555,
                                         57706471,
                                         14978359,
                                         16319153,
                                         5595069,
                                         58082414,
                                         61003997,
                                         57696361))
      UNION
        (WITH "descendants_base_cte" AS MATERIALIZED
           (SELECT "namespaces"."id", "namespaces"."traversal_ids"
            FROM "namespaces"
            WHERE "namespaces"."type" = 'Group'
              AND "namespaces"."id" IN
                (SELECT "group_group_links"."shared_group_id"
                 FROM "group_group_links"
                 WHERE "group_group_links"."group_access" = 50
                   AND "group_group_links"."shared_with_group_id" IN
                     (SELECT "namespaces"."id"
                      FROM "namespaces"
                      INNER JOIN "members" ON "namespaces"."id" = "members"."source_id"
                      WHERE "members"."type" = 'GroupMember'
                        AND "members"."source_type" = 'Namespace'
                        AND "namespaces"."type" = 'Group'
                        AND "members"."user_id" = 64248
                        AND "members"."requested_at" IS NULL
                        AND (access_level >= 10)
                        AND "members"."access_level" = 50))),
              "superset" AS
           (SELECT d1.traversal_ids
            FROM descendants_base_cte d1
            WHERE NOT EXISTS
                (SELECT 1
                 FROM descendants_base_cte d2
                 WHERE d2.id = ANY(d1.traversal_ids)
                   AND d2.id <> d1.id ) ) SELECT DISTINCT "namespaces".*
         FROM "superset",
              "namespaces"
         WHERE "namespaces"."type" = 'Group'
           AND next_traversal_ids_sibling("superset"."traversal_ids") > "namespaces"."traversal_ids"
           AND "superset"."traversal_ids" <= "namespaces"."traversal_ids"
           AND "namespaces"."id" NOT IN (59805804,
                                         52306274,
                                         58082419,
                                         57662085,
                                         13703656,
                                         13703648,
                                         57704811,
                                         57705888,
                                         57705942,
                                         57704865,
                                         58073239,
                                         57696357,
                                         57706361,
                                         10604555,
                                         57706471,
                                         14978359,
                                         16319153,
                                         5595069,
                                         58082414,
                                         61003997,
                                         57696361))) namespaces
WHERE "namespaces"."type" != 'Project'
  AND "namespaces"."id" IN
    (SELECT "routes"."source_id"
     FROM "routes"
     WHERE "routes"."source_type" = 'Namespace'
       AND ("routes"."path" ILIKE '%gitlab-org%'
            OR "routes"."name" ILIKE '%gitlab-org%'))
ORDER BY "namespaces"."path" ASC,
         "namespaces"."id" ASC
LIMIT 20
OFFSET 0

After

https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/21897/commands/70928:

Time: 511.500 ms  
  - planning: 6.930 ms  
  - execution: 504.570 ms  
    - I/O read: 0.000 ms  
    - I/O write: 0.000 ms  
  
Shared buffers:  
  - hits: 24208 (~189.10 MiB) from the buffer pool  
  - reads: 0 from the OS file cache, including disk I/O  
  - dirtied: 0  
  - writes: 0  
SELECT "namespaces".*
FROM (
        (WITH "descendants_base_cte" AS MATERIALIZED
           (SELECT "namespaces"."id", "namespaces"."traversal_ids"
            FROM "namespaces"
            INNER JOIN "members" ON "namespaces"."id" = "members"."source_id"
            WHERE "members"."type" = 'GroupMember'
              AND "members"."source_type" = 'Namespace'
              AND "namespaces"."type" = 'Group'
              AND "members"."user_id" = 64248
              AND "members"."requested_at" IS NULL
              AND (access_level >= 10)
              AND (members.access_level >= 50)),
              "superset" AS
           (SELECT d1.traversal_ids
            FROM descendants_base_cte d1
            WHERE NOT EXISTS
                (SELECT 1
                 FROM descendants_base_cte d2
                 WHERE d2.id = ANY(d1.traversal_ids)
                   AND d2.id <> d1.id ) ) SELECT DISTINCT "namespaces".*
         FROM "superset",
              "namespaces"
         WHERE "namespaces"."type" = 'Group'
           AND next_traversal_ids_sibling("superset"."traversal_ids") > "namespaces"."traversal_ids"
           AND "superset"."traversal_ids" <= "namespaces"."traversal_ids"
           AND "namespaces"."id" NOT IN (59805804,
                                         52306274,
                                         58082419,
                                         57662085,
                                         13703656,
                                         13703648,
                                         57704811,
                                         57705888,
                                         57705942,
                                         57704865,
                                         58073239,
                                         57696357,
                                         57706361,
                                         10604555,
                                         57706471,
                                         14978359,
                                         16319153,
                                         5595069,
                                         58082414,
                                         61003997,
                                         57696361))
      UNION
        (WITH "descendants_base_cte" AS MATERIALIZED
           (SELECT "namespaces"."id", "namespaces"."traversal_ids"
            FROM "namespaces"
            WHERE "namespaces"."type" = 'Group'
              AND "namespaces"."id" IN
                (SELECT "group_group_links"."shared_group_id"
                 FROM "group_group_links"
                 WHERE "group_group_links"."group_access" = 50
                   AND "group_group_links"."shared_with_group_id" IN
                     (SELECT "namespaces"."id"
                      FROM "namespaces"
                      INNER JOIN "members" ON "namespaces"."id" = "members"."source_id"
                      WHERE "members"."type" = 'GroupMember'
                        AND "members"."source_type" = 'Namespace'
                        AND "namespaces"."type" = 'Group'
                        AND "members"."user_id" = 64248
                        AND "members"."requested_at" IS NULL
                        AND (access_level >= 10)
                        AND "members"."access_level" = 50))),
              "superset" AS
           (SELECT d1.traversal_ids
            FROM descendants_base_cte d1
            WHERE NOT EXISTS
                (SELECT 1
                 FROM descendants_base_cte d2
                 WHERE d2.id = ANY(d1.traversal_ids)
                   AND d2.id <> d1.id ) ) SELECT DISTINCT "namespaces".*
         FROM "superset",
              "namespaces"
         WHERE "namespaces"."type" = 'Group'
           AND next_traversal_ids_sibling("superset"."traversal_ids") > "namespaces"."traversal_ids"
           AND "superset"."traversal_ids" <= "namespaces"."traversal_ids"
           AND "namespaces"."id" NOT IN (59805804,
                                         52306274,
                                         58082419,
                                         57662085,
                                         13703656,
                                         13703648,
                                         57704811,
                                         57705888,
                                         57705942,
                                         57704865,
                                         58073239,
                                         57696357,
                                         57706361,
                                         10604555,
                                         57706471,
                                         14978359,
                                         16319153,
                                         5595069,
                                         58082414,
                                         61003997,
                                         57696361))) namespaces
WHERE "namespaces"."type" != 'Project'
  AND "namespaces"."id" IN
    (SELECT "routes"."source_id"
     FROM "routes"
     WHERE "routes"."source_type" = 'Namespace'
       AND ("routes"."path" ILIKE '%gitlab-org%'
            OR "routes"."name" ILIKE '%gitlab-org%')
     ORDER BY CASE
                  WHEN PATH ILIKE 'gitlab-org' THEN 1
                  WHEN name ILIKE 'gitlab-org' THEN 1
                  ELSE 2
              END)
LIMIT 20
OFFSET 0

How to set up and validate locally

  1. In some group X (such as sandbox1), create subgroup aa.
  2. Now create subgroups X/aa/11, X/aa/11/11, and X/aa/11/11/11.
  3. Go to group X/aa/11/11/11 and attempt to transfer the group inside Settings -> General -> Advanced -> Select a new namespace. Type X/aa in the Search dropdown.
  4. Notice the list is sort so that all the subgroups with 11 will be listed first:

image

  1. Now enable :exact_matches_first_group_transfer via bin/rails console: Feature.enable(:exact_matches_first_group_transfer).
  2. Reload the transfer page, enter in X/aa in the dropdown. Now the subgroup X/aa should be listed first:

image

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Stan Hu

Merge request reports

Loading