Skip to content

Fix for Runner#belonging_to_parent_group_of_project cross dabatase query

Ref: #339937 (closed)

What does this MR do and why?

This MR fixes a cross database query between our 2 new databases when fetching group runners of a given project.

This change is behind a feature flag : ci_decompose_belonging_to_parent_group_of_project_query.

Why are we doing this?

As explained on our multi databases documentation it will not be possible to make a join query accessing our 2 databases.

This is why need to modify our existing SQL query which fetch group runners for a project.

To solve this problem we are leveraging ci_(project|namespace)_mirrors to avoid querying 2 different databases. We are now making 2 SQL queries instead of one, opened to suggestion if there is a better way of doing this 🙇‍♂️

Database review

Note: As we are still backfilling those new tables, here the SQL plan which is not relevant real traffic yet, will update the MR description once the data is available.

Used @furkanayhan's suggestion to benchmark with real data:

INSERT INTO ci_project_mirrors (project_id, namespace_id)
SELECT "projects"."id", "projects"."namespace_id" FROM "projects"
ON CONFLICT (project_id) DO NOTHING;

INSERT INTO ci_namespace_mirrors (namespace_id, traversal_ids)
SELECT "namespaces"."id", "namespaces"."traversal_ids" FROM "namespaces"
ON CONFLICT (namespace_id) DO NOTHING;

ANALYZE ci_project_mirrors;
ANALYZE ci_namespace_mirrors;

When FF use_traversal_ids... ENABLED

Old SQL Query:
SELECT "ci_runners".*
FROM "ci_runners"
INNER JOIN "ci_runner_namespaces" ON "ci_runner_namespaces"."runner_id" = "ci_runners"."id"
INNER JOIN "namespaces" ON "namespaces"."id" = "ci_runner_namespaces"."namespace_id"
AND "namespaces"."type" = 'Group'
WHERE "namespaces"."id" IN (
  WITH "base_ancestors_cte" AS MATERIALIZED (
    SELECT "namespaces"."id", "namespaces"."traversal_ids"
    FROM "namespaces"
    INNER JOIN "projects" ON "projects"."namespace_id" = "namespaces"."id"
    WHERE "namespaces"."type" = 'Group' AND "projects"."id" = 278964
  ), "ancestors_cte" AS MATERIALIZED
  (
    SELECT id AS base_id, unnest(traversal_ids) AS ancestor_id FROM "base_ancestors_cte"
  )
  SELECT DISTINCT namespaces.traversal_ids[array_length(namespaces.traversal_ids, 1)] AS id
  FROM "ancestors_cte", "namespaces"
  WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" = "ancestors_cte"."ancestor_id"
);

https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/7848/commands/28036

New SQL Query:
WITH "base_ancestors_cte" AS MATERIALIZED (
  SELECT "namespaces"."id", "namespaces"."traversal_ids"
  FROM "namespaces"
  INNER JOIN "projects" ON "projects"."namespace_id" = "namespaces"."id"
  WHERE "namespaces"."type" = 'Group' AND "projects"."id" = 278964
), "ancestors_cte" AS MATERIALIZED
(
  SELECT id AS base_id, unnest(traversal_ids) AS ancestor_id FROM "base_ancestors_cte"
)
SELECT DISTINCT "id"
FROM "ancestors_cte", "namespaces"
WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" = "ancestors_cte"."ancestor_id";

--

SELECT "ci_runners".*
FROM "ci_runners"
INNER JOIN "ci_runner_namespaces" ON "ci_runner_namespaces"."runner_id" = "ci_runners"."id"
WHERE "ci_runner_namespaces"."namespace_id" = 9970;

https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/7848/commands/28037 https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/7848/commands/28038

When FF use_traversal_ids... DISABLED

Old SQL Query:
SELECT "ci_runners".*
FROM "ci_runners"
INNER JOIN "ci_runner_namespaces" ON "ci_runner_namespaces"."runner_id" = "ci_runners"."id"
INNER JOIN "namespaces" ON "namespaces"."id" = "ci_runner_namespaces"."namespace_id" AND "namespaces"."type" = 'Group'
WHERE "namespaces"."id" IN (
  WITH RECURSIVE "base_and_ancestors" AS (
    (
      SELECT "namespaces"."id", "namespaces"."name", "namespaces"."path", "namespaces"."owner_id", "namespaces"."created_at", "namespaces"."updated_at", "namespaces"."type", "namespaces"."description", "namespaces"."avatar", "namespaces"."membership_lock", "namespaces"."share_with_group_lock", "namespaces"."visibility_level", "namespaces"."request_access_enabled", "namespaces"."ldap_sync_status", "namespaces"."ldap_sync_error", "namespaces"."ldap_sync_last_update_at", "namespaces"."ldap_sync_last_successful_update_at", "namespaces"."ldap_sync_last_sync_at", "namespaces"."description_html", "namespaces"."lfs_enabled", "namespaces"."parent_id", "namespaces"."shared_runners_minutes_limit", "namespaces"."repository_size_limit", "namespaces"."require_two_factor_authentication", "namespaces"."two_factor_grace_period", "namespaces"."cached_markdown_version", "namespaces"."project_creation_level", "namespaces"."runners_token", "namespaces"."file_template_project_id", "namespaces"."saml_discovery_token", "namespaces"."runners_token_encrypted", "namespaces"."custom_project_templates_group_id", "namespaces"."auto_devops_enabled", "namespaces"."extra_shared_runners_minutes_limit", "namespaces"."last_ci_minutes_notification_at", "namespaces"."last_ci_minutes_usage_notification_level", "namespaces"."subgroup_creation_level", "namespaces"."emails_disabled", "namespaces"."max_pages_size", "namespaces"."max_artifacts_size", "namespaces"."mentions_disabled", "namespaces"."default_branch_protection", "namespaces"."unlock_membership_to_ldap", "namespaces"."max_personal_access_token_lifetime", "namespaces"."push_rule_id", "namespaces"."shared_runners_enabled", "namespaces"."allow_descendants_override_disabled_shared_runners", "namespaces"."traversal_ids"
      FROM "namespaces"
      INNER JOIN "projects" ON "projects"."namespace_id" = "namespaces"."id"
      WHERE "namespaces"."type" = 'Group' AND "projects"."id" = 278964
    )
    UNION
    (
      SELECT "namespaces"."id", "namespaces"."name", "namespaces"."path", "namespaces"."owner_id", "namespaces"."created_at", "namespaces"."updated_at", "namespaces"."type", "namespaces"."description", "namespaces"."avatar", "namespaces"."membership_lock", "namespaces"."share_with_group_lock", "namespaces"."visibility_level", "namespaces"."request_access_enabled", "namespaces"."ldap_sync_status", "namespaces"."ldap_sync_error", "namespaces"."ldap_sync_last_update_at", "namespaces"."ldap_sync_last_successful_update_at", "namespaces"."ldap_sync_last_sync_at", "namespaces"."description_html", "namespaces"."lfs_enabled", "namespaces"."parent_id", "namespaces"."shared_runners_minutes_limit", "namespaces"."repository_size_limit", "namespaces"."require_two_factor_authentication", "namespaces"."two_factor_grace_period", "namespaces"."cached_markdown_version", "namespaces"."project_creation_level", "namespaces"."runners_token", "namespaces"."file_template_project_id", "namespaces"."saml_discovery_token", "namespaces"."runners_token_encrypted", "namespaces"."custom_project_templates_group_id", "namespaces"."auto_devops_enabled", "namespaces"."extra_shared_runners_minutes_limit", "namespaces"."last_ci_minutes_notification_at", "namespaces"."last_ci_minutes_usage_notification_level", "namespaces"."subgroup_creation_level", "namespaces"."emails_disabled", "namespaces"."max_pages_size", "namespaces"."max_artifacts_size", "namespaces"."mentions_disabled", "namespaces"."default_branch_protection", "namespaces"."unlock_membership_to_ldap", "namespaces"."max_personal_access_token_lifetime", "namespaces"."push_rule_id", "namespaces"."shared_runners_enabled", "namespaces"."allow_descendants_override_disabled_shared_runners", "namespaces"."traversal_ids"
      FROM "namespaces", "base_and_ancestors"
      WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" = "base_and_ancestors"."parent_id"
    )
  ) SELECT id FROM "base_and_ancestors" AS "namespaces"
);

https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/7848/commands/28040

New SQL Query:
WITH RECURSIVE "base_and_ancestors" AS (
  (
    SELECT "namespaces"."id", "namespaces"."name", "namespaces"."path", "namespaces"."owner_id", "namespaces"."created_at", "namespaces"."updated_at", "namespaces"."type", "namespaces"."description", "namespaces"."avatar", "namespaces"."membership_lock", "namespaces"."share_with_group_lock", "namespaces"."visibility_level", "namespaces"."request_access_enabled", "namespaces"."ldap_sync_status", "namespaces"."ldap_sync_error", "namespaces"."ldap_sync_last_update_at", "namespaces"."ldap_sync_last_successful_update_at", "namespaces"."ldap_sync_last_sync_at", "namespaces"."description_html", "namespaces"."lfs_enabled", "namespaces"."parent_id", "namespaces"."shared_runners_minutes_limit", "namespaces"."repository_size_limit", "namespaces"."require_two_factor_authentication", "namespaces"."two_factor_grace_period", "namespaces"."cached_markdown_version", "namespaces"."project_creation_level", "namespaces"."runners_token", "namespaces"."file_template_project_id", "namespaces"."saml_discovery_token", "namespaces"."runners_token_encrypted", "namespaces"."custom_project_templates_group_id", "namespaces"."auto_devops_enabled", "namespaces"."extra_shared_runners_minutes_limit", "namespaces"."last_ci_minutes_notification_at", "namespaces"."last_ci_minutes_usage_notification_level", "namespaces"."subgroup_creation_level", "namespaces"."emails_disabled", "namespaces"."max_pages_size", "namespaces"."max_artifacts_size", "namespaces"."mentions_disabled", "namespaces"."default_branch_protection", "namespaces"."unlock_membership_to_ldap", "namespaces"."max_personal_access_token_lifetime", "namespaces"."push_rule_id", "namespaces"."shared_runners_enabled", "namespaces"."allow_descendants_override_disabled_shared_runners", "namespaces"."traversal_ids"
    FROM "namespaces"
    INNER JOIN "projects" ON "projects"."namespace_id" = "namespaces"."id"
    WHERE "namespaces"."type" = 'Group' AND "projects"."id" = 278964
  )
  UNION
  (
    SELECT "namespaces"."id", "namespaces"."name", "namespaces"."path", "namespaces"."owner_id", "namespaces"."created_at", "namespaces"."updated_at", "namespaces"."type", "namespaces"."description", "namespaces"."avatar", "namespaces"."membership_lock", "namespaces"."share_with_group_lock", "namespaces"."visibility_level", "namespaces"."request_access_enabled", "namespaces"."ldap_sync_status", "namespaces"."ldap_sync_error", "namespaces"."ldap_sync_last_update_at", "namespaces"."ldap_sync_last_successful_update_at", "namespaces"."ldap_sync_last_sync_at", "namespaces"."description_html", "namespaces"."lfs_enabled", "namespaces"."parent_id", "namespaces"."shared_runners_minutes_limit", "namespaces"."repository_size_limit", "namespaces"."require_two_factor_authentication", "namespaces"."two_factor_grace_period", "namespaces"."cached_markdown_version", "namespaces"."project_creation_level", "namespaces"."runners_token", "namespaces"."file_template_project_id", "namespaces"."saml_discovery_token", "namespaces"."runners_token_encrypted", "namespaces"."custom_project_templates_group_id", "namespaces"."auto_devops_enabled", "namespaces"."extra_shared_runners_minutes_limit", "namespaces"."last_ci_minutes_notification_at", "namespaces"."last_ci_minutes_usage_notification_level", "namespaces"."subgroup_creation_level", "namespaces"."emails_disabled", "namespaces"."max_pages_size", "namespaces"."max_artifacts_size", "namespaces"."mentions_disabled", "namespaces"."default_branch_protection", "namespaces"."unlock_membership_to_ldap", "namespaces"."max_personal_access_token_lifetime", "namespaces"."push_rule_id", "namespaces"."shared_runners_enabled", "namespaces"."allow_descendants_override_disabled_shared_runners", "namespaces"."traversal_ids"
    FROM "namespaces", "base_and_ancestors"
    WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" = "base_and_ancestors"."parent_id"
  )
) SELECT "id" FROM "base_and_ancestors" AS "namespaces";

--

SELECT "ci_runners".*
FROM "ci_runners"
INNER JOIN "ci_runner_namespaces" ON "ci_runner_namespaces"."runner_id" = "ci_runners"."id"
WHERE "ci_runner_namespaces"."namespace_id" = 9970;

https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/7848/commands/28042 https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/7848/commands/28044

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Furkan Ayhan

Merge request reports

Loading