Speed up UpdateAllMirrorsWorker query on GitLab.com
What does this MR do?
Speed up UpdateAllMirrorsWorker query on GitLab.com
On GitLab.com, we check for the license on a project-by-project basis, as different namespaces can have different licenses. Pull mirroring is a Bronze feature, meaning it should only be available to:
- Paid projects.
- Public projects.
Previously, we offered pull mirroring for free even on private projects. After we stopped offering that, we started checking the plans in the query to fetch mirrors, to avoid fetching lots of projects we were going to do nothing with.
However, this query got slower over time, because we order by
next_execution_timestamp
, and next_execution_timestamp
is stuck at
some point before 2020-03-28 for these free private mirrors until the
project is made public, or paid for. That means that the database has to
do a large amount of work discarding those projects that we know we will
never process.
For GitLab.com ONLY, we work around this by explicitly filtering on
next_execution_timestamp > '2020-03-28'
. This makes the query go from
taking several seconds, to around 150 milliseconds.
In the long term, we should look to remove this workaround and simplify the query: #216783 (closed)
For #216252 (closed).
Query timings
Note that because we're constantly processing mirrors on GitLab.com, you'll never get the same results from two executions of the same query. First, let's check in a Ruby console:
# Current
UpdateAllMirrorsWorker.new.send(:pull_mirrors_batch, freeze_at: Time.now, batch_size: 500, offset_at: nil).count
# D, [2020-05-06T13:06:40.714017 #30186] DEBUG -- : (2453.9ms) SELECT COUNT(*) FROM (SELECT 1 AS one FROM "projects" INNER JOIN project_mirror_data import_state ON import_state.project_id = projects.id INNER JOIN namespaces AS root_namespaces ON root_namespaces.id = (WITH RECURSIVE "base_and_ancestors" AS ((SELECT "namespaces".* FROM "namespaces" WHERE (id = projects.namespace_id))
# UNION
# (SELECT "namespaces".* FROM "namespaces", "base_and_ancestors" WHERE "namespaces"."id" = "base_and_ancestors"."parent_id")) SELECT "id" FROM "base_and_ancestors" AS "namespaces" WHERE "namespaces"."parent_id" IS NULL) LEFT JOIN gitlab_subscriptions ON gitlab_subscriptions.namespace_id = root_namespaces.id LEFT JOIN plans ON plans.id = gitlab_subscriptions.hosted_plan_id WHERE "projects"."archived" = FALSE AND "projects"."mirror" = TRUE AND "import_state"."status" NOT IN ('scheduled', 'started') AND (import_state.next_execution_timestamp <= '2020-05-06 13:06:38.238744') AND (import_state.retry_count <= 14) AND (plans.name IN ('bronze','silver','gold','early_adopter') OR projects.visibility_level = 20) ORDER BY import_state.next_execution_timestamp LIMIT 500) subquery_for_count /*application:console*/
# => 282
# New
UpdateAllMirrorsWorker.new.send(:pull_mirrors_batch, freeze_at: Time.now, batch_size: 500, offset_at: Time.utc(2020, 3, 28)).count
# D, [2020-05-06T13:07:29.339924 #30186] DEBUG -- : (135.4ms) SELECT COUNT(*) FROM (SELECT 1 AS one FROM "projects" INNER JOIN project_mirror_data import_state ON import_state.project_id = projects.id INNER JOIN namespaces AS root_namespaces ON root_namespaces.id = (WITH RECURSIVE "base_and_ancestors" AS ((SELECT "namespaces".* FROM "namespaces" WHERE (id = projects.namespace_id))
# UNION
# (SELECT "namespaces".* FROM "namespaces", "base_and_ancestors" WHERE "namespaces"."id" = "base_and_ancestors"."parent_id")) SELECT "id" FROM "base_and_ancestors" AS "namespaces" WHERE "namespaces"."parent_id" IS NULL) LEFT JOIN gitlab_subscriptions ON gitlab_subscriptions.namespace_id = root_namespaces.id LEFT JOIN plans ON plans.id = gitlab_subscriptions.hosted_plan_id WHERE "projects"."archived" = FALSE AND "projects"."mirror" = TRUE AND "import_state"."status" NOT IN ('scheduled', 'started') AND (import_state.next_execution_timestamp <= '2020-05-06 13:07:29.198020') AND (import_state.retry_count <= 14) AND (import_state.next_execution_timestamp > '2020-03-28 00:00:00') AND (plans.name IN ('bronze','silver','gold','early_adopter') OR projects.visibility_level = 20) ORDER BY import_state.next_execution_timestamp LIMIT 500) subquery_for_count /*application:console*/
# => 276
And query plans:
Before: https://explain.depesz.com/s/18cr
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=357.35..445148.74 rows=500 width=758) (actual time=543.520..2497.460 rows=498 loops=1)
-> Nested Loop Left Join (cost=357.35..1197735.79 rows=1346 width=758) (actual time=543.518..2497.387 rows=498 loops=1)
Join Filter: (plans.id = gitlab_subscriptions.hosted_plan_id)
Rows Removed by Join Filter: 275420
Filter: (((plans.name)::text = ANY ('{bronze,silver,gold,early_adopter}'::text[])) OR (projects.visibility_level = 20))
Rows Removed by Filter: 54582
-> Nested Loop Left Join (cost=357.35..1197475.62 rows=1897 width=762) (actual time=231.271..2416.670 rows=55080 loops=1)
-> Nested Loop (cost=356.91..1196582.11 rows=1897 width=762) (actual time=231.258..2000.251 rows=55080 loops=1)
-> Nested Loop (cost=0.99..520604.17 rows=1897 width=758) (actual time=231.121..949.211 rows=55080 loops=1)
-> Index Scan using index_mirror_data_on_next_execution_and_retry_count on project_mirror_data import_state (cost=0.56..152838.83 rows=148151 width=12) (actual time=0.587..263.207 rows=75641 loops=1)
Index Cond: ((next_execution_timestamp <= now()) AND (retry_count <= 14))
Filter: ((status)::text <> ALL ('{scheduled,started}'::text[]))
Rows Removed by Filter: 98
-> Index Scan using projects_pkey on projects (cost=0.43..2.47 rows=1 width=750) (actual time=0.008..0.009 rows=1 loops=75641)
Index Cond: (id = import_state.project_id)
Filter: ((NOT archived) AND mirror)
Rows Removed by Filter: 0
-> Index Only Scan using namespaces_pkey on namespaces root_namespaces (cost=355.92..356.33 rows=1 width=4) (actual time=0.002..0.002 rows=1 loops=55080)
Index Cond: (id = (SubPlan 2))
Heap Fetches: 4028
SubPlan 2
-> CTE Scan on base_and_ancestors namespaces_2 (cost=353.47..355.49 rows=1 width=4) (actual time=0.015..0.016 rows=1 loops=55080)
Filter: (parent_id IS NULL)
Rows Removed by Filter: 0
CTE base_and_ancestors
-> Recursive Union (cost=0.43..353.47 rows=101 width=334) (actual time=0.010..0.014 rows=1 loops=55080)
-> Index Scan using namespaces_pkey on namespaces (cost=0.43..3.45 rows=1 width=334) (actual time=0.008..0.008 rows=1 loops=55080)
Index Cond: (id = projects.namespace_id)
-> Nested Loop (cost=0.43..34.80 rows=10 width=334) (actual time=0.002..0.002 rows=0 loops=70469)
-> WorkTable Scan on base_and_ancestors (cost=0.00..0.20 rows=10 width=4) (actual time=0.000..0.000 rows=1 loops=70469)
-> Index Scan using namespaces_pkey on namespaces namespaces_1 (cost=0.43..3.45 rows=1 width=334) (actual time=0.002..0.002 rows=0 loops=70469)
Index Cond: (id = base_and_ancestors.parent_id)
-> Index Scan using index_gitlab_subscriptions_on_namespace_id on gitlab_subscriptions (cost=0.43..0.46 rows=1 width=8) (actual time=0.007..0.007 rows=1 loops=55080)
Index Cond: (namespace_id = root_namespaces.id)
-> Materialize (cost=0.00..4.09 rows=6 width=11) (actual time=0.000..0.000 rows=6 loops=55080)
-> Seq Scan on plans (cost=0.00..4.06 rows=6 width=11) (actual time=0.007..0.008 rows=6 loops=1)
Planning time: 9.597 ms
Execution time: 2498.059 ms
(38 rows)
After: https://explain.depesz.com/s/D69X
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=357.35..76740.62 rows=76 width=758) (actual time=39.528..79.398 rows=229 loops=1)
-> Nested Loop Left Join (cost=357.35..76740.62 rows=76 width=758) (actual time=39.528..79.364 rows=229 loops=1)
Join Filter: (plans.id = gitlab_subscriptions.hosted_plan_id)
Rows Removed by Join Filter: 3428
Filter: (((plans.name)::text = ANY ('{bronze,silver,gold,early_adopter}'::text[])) OR (projects.visibility_level = 20))
Rows Removed by Filter: 455
-> Nested Loop Left Join (cost=357.35..76722.10 rows=107 width=762) (actual time=0.283..78.337 rows=684 loops=1)
-> Nested Loop (cost=356.91..76671.70 rows=107 width=762) (actual time=0.255..63.073 rows=684 loops=1)
-> Nested Loop (cost=0.99..38543.27 rows=107 width=758) (actual time=0.198..48.306 rows=684 loops=1)
-> Index Scan using index_mirror_data_on_next_execution_and_retry_count on project_mirror_data import_state (cost=0.56..10472.44 rows=8338 width=12) (actual time=0.042..30.453 rows=2203 loops=1)
Index Cond: ((next_execution_timestamp <= now()) AND (next_execution_timestamp > '2020-03-28 00:00:00+00'::timestamp with time zone) AND (retry_count <= 14))
Filter: ((status)::text <> ALL ('{scheduled,started}'::text[]))
Rows Removed by Filter: 29
-> Index Scan using projects_pkey on projects (cost=0.43..3.36 rows=1 width=750) (actual time=0.008..0.008 rows=0 loops=2203)
Index Cond: (id = import_state.project_id)
Filter: ((NOT archived) AND mirror)
Rows Removed by Filter: 1
-> Index Only Scan using namespaces_pkey on namespaces root_namespaces (cost=355.92..356.33 rows=1 width=4) (actual time=0.002..0.002 rows=1 loops=684)
Index Cond: (id = (SubPlan 2))
Heap Fetches: 146
SubPlan 2
-> CTE Scan on base_and_ancestors namespaces_2 (cost=353.47..355.49 rows=1 width=4) (actual time=0.017..0.018 rows=1 loops=684)
Filter: (parent_id IS NULL)
Rows Removed by Filter: 1
CTE base_and_ancestors
-> Recursive Union (cost=0.43..353.47 rows=101 width=334) (actual time=0.010..0.016 rows=2 loops=684)
-> Index Scan using namespaces_pkey on namespaces (cost=0.43..3.45 rows=1 width=334) (actual time=0.008..0.008 rows=1 loops=684)
Index Cond: (id = projects.namespace_id)
-> Nested Loop (cost=0.43..34.80 rows=10 width=334) (actual time=0.002..0.003 rows=0 loops=1089)
-> WorkTable Scan on base_and_ancestors (cost=0.00..0.20 rows=10 width=4) (actual time=0.000..0.000 rows=1 loops=1089)
-> Index Scan using namespaces_pkey on namespaces namespaces_1 (cost=0.43..3.45 rows=1 width=334) (actual time=0.002..0.002 rows=0 loops=1089)
Index Cond: (id = base_and_ancestors.parent_id)
-> Index Scan using index_gitlab_subscriptions_on_namespace_id on gitlab_subscriptions (cost=0.43..0.46 rows=1 width=8) (actual time=0.021..0.022 rows=1 loops=684)
Index Cond: (namespace_id = root_namespaces.id)
-> Materialize (cost=0.00..4.09 rows=6 width=11) (actual time=0.000..0.000 rows=6 loops=684)
-> Seq Scan on plans (cost=0.00..4.06 rows=6 width=11) (actual time=0.006..0.008 rows=6 loops=1)
Planning time: 9.975 ms
Execution time: 79.867 ms
(38 rows)
The key note from the query plans is the number of loops in 'CTE base_and_ancestors' - #216252 (comment 334514544) demonstrates why this is dropping so much.