Introduce linear root method in UpdateAllMirrorsWorker
What does this MR do and why?
In this MR we're introducing the linear version of the roots
scope for the UpdateAllMirrosWorker
. This changes is behind the feature flag linear_mirrors_worker_roots
.
How to setup and validate locally (strongly suggested)
-
Enable the new method behavior feature flag
Feature.enable(:linear_mirrors_worker_roots)
-
In rails console enable the traversal id feature flag and the linear root version feature flag:
Feature.enable(:use_traversal_ids_for_root_ancestor) Feature.enable(:use_traversal_ids)
-
The setting
should_check_namespace_plan
has to be set to true so::Gitlab::CurrentSettings.should_check_namespace_plan?
will return true. -
Run the following command in the Rails console to trigger the query:
UpdateAllMirrorsWorker.new.perform
SQL queries
The former SQL was:
SELECT "projects".*
FROM "projects"
INNER JOIN project_mirror_data import_state ON import_state.project_id = projects.id
INNER JOIN namespaces AS root_namespaces ON root_namespaces.id =
(WITH RECURSIVE "base_and_ancestors" AS (
(SELECT "namespaces".*
FROM "namespaces"
WHERE (id = projects.namespace_id))
UNION
(SELECT "namespaces".*
FROM "namespaces",
"base_and_ancestors"
WHERE "namespaces"."id" = "base_and_ancestors"."parent_id")) SELECT "id"
FROM "base_and_ancestors" AS "namespaces"
WHERE "namespaces"."parent_id" IS NULL)
LEFT JOIN gitlab_subscriptions ON gitlab_subscriptions.namespace_id = root_namespaces.id
LEFT JOIN plans ON plans.id = gitlab_subscriptions.hosted_plan_id
WHERE "projects"."archived" = FALSE
AND "projects"."pending_delete" = FALSE
AND "projects"."mirror" = TRUE
AND "import_state"."status" NOT IN ('scheduled',
'started')
AND (import_state.next_execution_timestamp <= '2021-12-14 10:14:38.687048')
AND (import_state.retry_count <= 14)
AND (plans.name IN ('bronze',
'silver',
'premium',
'gold',
'ultimate',
'ultimate_trial',
'premium_trial')
OR projects.visibility_level = 20)
ORDER BY import_state.next_execution_timestamp
LIMIT 60
This is the query plan for the query and the times with warm caches are:
Time: 245.320 ms
- planning: 15.323 ms
- execution: 229.997 ms
- I/O read: 0.000 ms
- I/O write: 0.000 ms
Shared buffers:
- hits: 169248 (~1.30 GiB) from the buffer pool
- reads: 0 from the OS file cache, including disk I/O
- dirtied: 252 (~2.00 MiB)
- writes: 0
The new SQL is:
SELECT "projects".*
FROM "projects"
INNER JOIN project_mirror_data import_state ON import_state.project_id = projects.id
INNER JOIN namespaces AS root_namespaces ON root_namespaces.id =
(SELECT namespaces.traversal_ids[array_length(namespaces.traversal_ids, 1)] AS id
FROM "namespaces"
WHERE "namespaces"."id" IN
(SELECT DISTINCT "namespaces".traversal_ids[1]
FROM "namespaces"
WHERE (id = projects.namespace_id)))
LEFT JOIN gitlab_subscriptions ON gitlab_subscriptions.namespace_id = root_namespaces.id
LEFT JOIN plans ON plans.id = gitlab_subscriptions.hosted_plan_id
WHERE "projects"."archived" = FALSE
AND "projects"."pending_delete" = FALSE
AND "projects"."mirror" = TRUE
AND "import_state"."status" NOT IN ('scheduled',
'started')
AND (import_state.next_execution_timestamp <= '2021-12-14 10:22:18.636116')
AND (import_state.retry_count <= 14)
AND (plans.name IN ('bronze',
'silver',
'premium',
'gold',
'ultimate',
'ultimate_trial',
'premium_trial')
OR projects.visibility_level = 20)
ORDER BY import_state.next_execution_timestamp
LIMIT 60
This is the query plan and the times with warm caches are:
Time: 230.102 ms
- planning: 13.781 ms
- execution: 216.321 ms
- I/O read: 0.000 ms
- I/O write: 0.000 ms
Shared buffers:
- hits: 191935 (~1.50 GiB) from the buffer pool
- reads: 0 from the OS file cache, including disk I/O
- dirtied: 0
- writes: 0
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
-
I have evaluated the MR acceptance checklist for this MR.
Related to #339457 (closed)