Add indexes on deployments to improve environments search
To load diffs and other files, we make two SELECT calls in
MergeRequest#environments_for
: one for the source branch, and one for
the target branch.
The source branch query makes a subquery:
SELECT environment_id FROM deployments WHERE project_id = 123 AND ref = 'source_branch'
The target branch makes this subquery:
SELECT environment_id FROM deployments WHERE project_id = 123 AND (ref = 'master' OR tag IS TRUE)
Both subqueries were taking over 250 ms to execute. When we add an index
for the deployments table on (projects_id, ref)
, this execution drops
down to under 5 ms since the table usually has a small number of rows
for the source branch.
Adding a partial index on projects_id WHERE tag IS TRUE
allows PostgreSQL to perform two
separate indexed scans: one for the ref, and one for the tag
condition. However, this query still takes 60 ms to run for the
gitlab-org/gitlab project because master
has 10,000+ rows. We should
consider adding a LIMIT in a future iteration.
Closes #55353 (closed)
Details
Source query
SELECT
"environments".*
FROM
"environments"
WHERE
"environments"."project_id" = 278964
AND ("environments"."state" IN ('available'))
AND "environments"."id" IN (
SELECT
"deployments"."environment_id"
FROM
"deployments"
WHERE
"deployments"."project_id" = 278964
AND (ref = 'sh-optimize-commit-is-ancestor-env')
GROUP BY
"deployments"."environment_id"
)
ORDER BY
(
SELECT
MAX("deployments"."id")
FROM
"deployments"
WHERE
"deployments"."environment_id" = "environments"."id"
) ASC NULLS FIRST
Query plan:
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=139519.61..139519.61 rows=1 width=148) (actual time=127.854..127.854 rows=0 loops=1)
Sort Key: ((SubPlan 2)) NULLS FIRST
Sort Method: quicksort Memory: 25kB
-> Nested Loop (cost=139515.94..139519.60 rows=1 width=148) (actual time=127.811..127.811 rows=0 loops=1)
-> Group (cost=139515.51..139515.52 rows=1 width=4) (actual time=127.811..127.811 rows=0 loops=1)
Group Key: deployments.environment_id
-> Sort (cost=139515.51..139515.52 rows=1 width=4) (actual time=127.810..127.810 rows=0 loops=1)
Sort Key: deployments.environment_id
Sort Method: quicksort Memory: 25kB
-> Index Scan using index_deployments_on_project_id_and_id on deployments (cost=0.56..139515.50 rows=1 width=4) (actual time=127.806..127.806 rows=0 loops=1)
Index Cond: (project_id = 278964)
Filter: ((ref)::text = 'sh-optimize-commit-is-ancestor-env'::text)
Rows Removed by Filter: 104862
-> Index Scan using environments_pkey on environments (cost=0.42..3.45 rows=1 width=144) (never executed)
Index Cond: (id = deployments.environment_id)
Filter: ((project_id = 278964) AND ((state)::text = 'available'::text))
SubPlan 2
-> Result (cost=0.60..0.61 rows=1 width=4) (never executed)
InitPlan 1 (returns $1)
-> Limit (cost=0.56..0.60 rows=1 width=4) (never executed)
-> Index Only Scan Backward using index_deployments_on_environment_id_and_id on deployments deployments_1 (cost=0.56..12.15 rows=354 width=4) (never executed)
Index Cond: ((environment_id = environments.id) AND (id IS NOT NULL))
Heap Fetches: 0
Planning time: 0.496 ms
Execution time: 127.912 ms
(25 rows)
After
Sort (cost=9.75..9.75 rows=1 width=140) (actual time=0.403..0.403 rows=0 loops=1)
Sort Key: ((SubPlan 2)) NULLS FIRST
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=3 read=4
I/O Timings: read=0.231
-> Nested Loop (cost=5.02..9.74 rows=1 width=140) (actual time=0.329..0.329 rows=0 loops=1)
Buffers: shared read=4
I/O Timings: read=0.231
-> Group (cost=4.59..4.60 rows=1 width=4) (actual time=0.328..0.328 rows=0 loops=1)
Group Key: deployments.environment_id
Buffers: shared read=4
I/O Timings: read=0.231
-> Sort (cost=4.59..4.60 rows=1 width=4) (actual time=0.327..0.327 rows=0 loops=1)
Sort Key: deployments.environment_id
Sort Method: quicksort Memory: 25kB
Buffers: shared read=4
I/O Timings: read=0.231
-> Index Scan using deployments_test_index on public.deployments (cost=0.56..4.58 rows=1 width=4) (actual time=0.320..0.320 rows=0 loops=1)
Index Cond: ((deployments.project_id = 278964) AND ((deployments.ref)::text = 'sh-optimize-commit-is-ancestor-env'::text))
Buffers: shared read=4
I/O Timings: read=0.231
-> Index Scan using environments_pkey on public.environments (cost=0.42..4.45 rows=1 width=136) (actual time=0.000..0.000 rows=0 loops=0)
Index Cond: (environments.id = deployments.environment_id)
Filter: ((environments.project_id = 278964) AND ((environments.state)::text = 'available'::text))
Rows Removed by Filter: 0
SubPlan 2
-> Result (cost=0.66..0.67 rows=1 width=4) (actual time=0.000..0.000 rows=0 loops=0)
InitPlan 1 (returns $1)
-> Limit (cost=0.56..0.66 rows=1 width=4) (actual time=0.000..0.000 rows=0 loops=0)
-> Index Only Scan using index_deployments_on_environment_id_and_id on public.deployments deployments_1 (cost=0.56..33.13 rows=349 width=4) (actual time=0.000..0.000 rows=0 loops=0)
Index Cond: ((deployments_1.environment_id = environments.id) AND (deployments_1.id IS NOT NULL))
Heap Fetches: 0
Summary:
Time: 3.378 ms
- planning: 2.879 ms
- execution: 0.499 ms
- I/O read: 0.231 ms
- I/O write: 0.000 ms
Shared buffers:
- hits: 3 (~24.00 KiB) from the buffer pool
- reads: 4 (~32.00 KiB) from the OS file cache, including disk I/O
- dirtied: 0
- writes: 0
Target query
SELECT
"environments".*
FROM
"environments"
WHERE
"environments"."project_id" = 278964
AND ("environments"."state" IN ('available'))
AND "environments"."id" IN (
SELECT
"deployments"."environment_id"
FROM
"deployments"
WHERE
"deployments"."project_id" = 278964
AND (
ref = 'master'
OR tag IS TRUE
)
GROUP BY
"deployments"."environment_id"
)
ORDER BY
(
SELECT
MAX("deployments"."id")
FROM
"deployments"
WHERE
"deployments"."environment_id" = "environments"."id"
) ASC NULLS FIRST
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=150679.28..150688.41 rows=3653 width=148) (actual time=165.995..165.995 rows=5 loops=1)
Sort Key: ((SubPlan 2)) NULLS FIRST
Sort Method: quicksort Memory: 25kB
-> Hash Join (cost=140565.26..150463.12 rows=3653 width=148) (actual time=142.417..165.980 rows=5 loops=1)
Hash Cond: (environments.id = deployments.environment_id)
-> Index Scan using index_environments_on_project_id_state_environment_type on environments (cost=0.42..7648.78 rows=7306 width=144) (actual time=0.070..26.446 rows=10295 loops=1)
Index Cond: ((project_id = 278964) AND ((state)::text = 'available'::text))
-> Hash (cost=140194.15..140194.15 rows=29655 width=4) (actual time=137.117..137.117 rows=106 loops=1)
Buckets: 32768 Batches: 1 Memory Usage: 260kB
-> HashAggregate (cost=139601.05..139897.60 rows=29655 width=4) (actual time=136.995..137.071 rows=106 loops=1)
Group Key: deployments.environment_id
-> Index Scan using index_deployments_on_project_id_and_id on deployments (cost=0.56..139515.50 rows=34219 width=4) (actual time=0.033..135.166 rows=10448 loops=1)
Index Cond: (project_id = 278964)
Filter: (((ref)::text = 'master'::text) OR (tag IS TRUE))
Rows Removed by Filter: 94414
SubPlan 2
-> Result (cost=0.60..0.61 rows=1 width=4) (actual time=0.032..0.032 rows=1 loops=5)
InitPlan 1 (returns $1)
-> Limit (cost=0.56..0.60 rows=1 width=4) (actual time=0.030..0.030 rows=1 loops=5)
-> Index Only Scan Backward using index_deployments_on_environment_id_and_id on deployments deployments_1 (cost=0.56..12.15 rows=354 width=4) (actual time=0.028..0.028 rows=1 loops=5)
Index Cond: ((environment_id = environments.id) AND (id IS NOT NULL))
Heap Fetches: 1
Planning time: 0.553 ms
Execution time: 166.281 ms
(24 rows)
After
Sort (cost=97786.44..97795.38 rows=3573 width=148) (actual time=60.560..60.561 rows=5 loops=1)
Sort Key: ((SubPlan 2)) NULLS FIRST
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=17484
-> Hash Join (cost=85261.68..97575.59 rows=3573 width=148) (actual time=35.476..60.514 rows=5 loops=1)
Hash Cond: (environments.id = deployments.environment_id)
Buffers: shared hit=17481
-> Index Scan using index_environments_on_project_id_state_environment_type on public.environments (cost=0.42..9935.40 rows=7146 width=144) (actual time=0.033..24.525 rows=10257 loops=1)
Index Cond: ((environments.project_id = 278964) AND ((environments.state)::text = 'available'::text))
Buffers: shared hit=7474
-> Hash (cost=84889.06..84889.06 rows=29776 width=4) (actual time=32.444..32.444 rows=106 loops=1)
Buckets: 32768 Batches: 1 Memory Usage: 260kB
Buffers: shared hit=9983
-> HashAggregate (cost=84293.54..84591.30 rows=29776 width=4) (actual time=32.297..32.392 rows=106 loops=1)
Group Key: deployments.environment_id
Buffers: shared hit=9983
-> Bitmap Heap Scan on public.deployments (cost=687.51..84207.58 rows=34382 width=4) (actual time=3.896..28.439 rows=10457 loops=1)
Buffers: shared hit=9983
-> BitmapOr (cost=687.51..687.51 rows=35284 width=0) (actual time=2.168..2.168 rows=0 loops=1)
Buffers: shared hit=47
-> Bitmap Index Scan using i1 (cost=0.00..629.34 rows=32277 width=0) (actual time=2.161..2.161 rows=10457 loops=1)
Index Cond: ((deployments.project_id = 278964) AND ((deployments.ref)::text = 'master'::text))
Buffers: shared hit=44
-> Bitmap Index Scan using i2 (cost=0.00..40.98 rows=3007 width=0) (actual time=0.007..0.007 rows=0 loops=1)
Index Cond: (deployments.project_id = 278964)
Buffers: shared hit=3
SubPlan 2
-> Result (cost=0.65..0.66 rows=1 width=4) (actual time=0.040..0.040 rows=1 loops=5)
Buffers: shared hit=24
InitPlan 1 (returns $1)
-> Limit (cost=0.56..0.65 rows=1 width=4) (actual time=0.037..0.038 rows=1 loops=5)
Buffers: shared hit=24
-> Index Only Scan using index_deployments_on_environment_id_and_id on public.deployments deployments_1 (cost=0.56..30.01 rows=356 width=4) (actual time=0.033..0.033 rows=1 loops=5)
Index Cond: ((deployments_1.environment_id = environments.id) AND (deployments_1.id IS NOT NULL))
Heap Fetches: 1
Buffers: shared hit=24
Time: 64.050 ms
- planning: 3.077 ms
- execution: 60.973 ms
- I/O read: 0.000 ms
- I/O write: 0.000 ms
Shared buffers:
- hits: 17484 (~136.60 MiB) from the buffer pool
- reads: 0 from the OS file cache, including disk I/O
- dirtied: 0
- writes: 0