Linear traversal version of EE::Namespace#any_project_with_shared_runners_enabled? is slow
The original query using recursive traversal query:
SELECT
1 AS one
FROM
"projects"
WHERE
"projects"."namespace_id" IN ( WITH RECURSIVE "base_and_descendants" AS (
(
SELECT
"namespaces".*
FROM
"namespaces"
WHERE
"namespaces"."type" = 'Group'
AND "namespaces"."id" = 4249178)
UNION (
SELECT
"namespaces".*
FROM
"namespaces",
"base_and_descendants"
WHERE
"namespaces"."type" = 'Group'
AND "namespaces"."parent_id" = "base_and_descendants"."id"))
SELECT
id
FROM
"base_and_descendants" AS "namespaces")
AND "projects"."shared_runners_enabled" = TRUE
LIMIT 1
Timings and plan from database-lab
: https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/4104/commands/14337
The query produced using linear traversal queries
SELECT
1 AS one
FROM
"projects"
WHERE
"projects"."namespace_id" IN (
SELECT
"namespaces"."id"
FROM
"namespaces"
WHERE (traversal_ids @> ('{4249178}')))
AND "projects"."shared_runners_enabled" = TRUE
LIMIT 1;
Timings and plan from database-lab
: https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/4104/commands/14336
The linear version seems to be performing better than the recursive one. However, we can see this query running for over 10 seconds in production (eg. check correclation_id
: 01F61XDEZHKWNPYEFVV7892RHH
).
When I run EXPLAIN ANALYZE
on the read-only replica, it gives me a different query plan:
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.87..102.50 rows=1 width=4) (actual time=11363.502..11363.504 rows=0 loops=1)
-> Nested Loop (cost=0.87..975117.81 rows=9595 width=4) (actual time=11363.501..11363.502 rows=0 loops=1)
-> Index Scan using namespaces_pkey on namespaces (cost=0.43..808139.83 rows=5755 width=4) (actual time=2248.267..11292.270 rows=5524 loops=1)
Filter: (traversal_ids @> '{4249178}'::integer[])
Rows Removed by Filter: 10921772
-> Index Scan using index_projects_on_namespace_id_and_id on projects (cost=0.44..28.82 rows=19 width=4) (actual time=0.012..0.012 rows=0 loops=5524)
Index Cond: (namespace_id = namespaces.id)
Filter: shared_runners_enabled
Rows Removed by Filter: 6
Planning Time: 1.478 ms
Execution Time: 11363.576 ms
The difference is that it uses namespaces_pkey
index instead of index_namespaces_on_traversal_ids
.
Edited by Imre Farkas