Show only personal snippets on explore page
What does this MR do?
This improves the performance of the "Explore snippets" page by excluding project snippets from the query.
Ref: #30877 (closed), follow-up to gitlab-foss!32576 (merged)
Query plans
Plans for the query change itself, taken from staging:
Before
gitlabhq_production=> EXPLAIN ANALYZE
SELECT "snippets".*
FROM
(SELECT "snippets".*
FROM "snippets"
WHERE (snippets.visibility_level IN (0,
10,
20)
OR snippets.author_id = 1)
AND "snippets"."project_id" IS NULL
UNION SELECT "snippets".*
FROM "snippets"
INNER JOIN "projects" ON "projects"."id" = "snippets"."project_id"
INNER JOIN "project_features" ON "project_features"."project_id" = "projects"."id"
WHERE (snippets.visibility_level IN (0,
10,
20)
OR snippets.author_id = 1)
AND (projects.visibility_level IN (0,
10,
20))
AND "project_features"."snippets_access_level" IN (20,
30)
UNION SELECT "snippets".*
FROM "snippets"
INNER JOIN "projects" ON "projects"."id" = "snippets"."project_id"
INNER JOIN "project_features" ON "project_features"."project_id" = "projects"."id"
WHERE "project_features". "snippets_access_level" IN (20,
30,
10)
AND (EXISTS
(SELECT 1
FROM "project_authorizations"
WHERE (project_id = snippets.project_id)
AND "project_authorizations"."user_id" = 1))) snippets
ORDER BY created_at DESC
LIMIT 10
OFFSET 0;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------
--------------------------
Limit (cost=103221.02..103221.04 rows=10 width=1744) (actual time=4570.708..4570.727 rows=10 loops=1)
-> Sort (cost=103221.02..103357.13 rows=54443 width=1744) (actual time=4570.706..4570.709 rows=10 loops=1)
Sort Key: snippets.created_at DESC
Sort Method: top-N heapsort Memory: 30kB
-> HashAggregate (cost=100955.66..101500.09 rows=54443 width=1744) (actual time=4496.468..4542.177 rows=46267 loops=1)
Group Key: snippets.id, snippets.title, snippets.content, snippets.author_id, snippets.project_id, snippets.created_at, snippets.updated_at, sni
ppets.file_name, snippets.type, snippets.visibility_level, snippets.title_html, snippets.content_html, snippets.cached_markdown_version, snippets.description,
snippets.description_html
-> Append (cost=0.00..98914.05 rows=54443 width=1744) (actual time=0.013..2410.551 rows=46521 loops=1)
-> Seq Scan on snippets (cost=0.00..7082.38 rows=37556 width=1126) (actual time=0.012..31.345 rows=37610 loops=1)
Filter: ((project_id IS NULL) AND ((visibility_level = ANY ('{0,10,20}'::integer[])) OR (author_id = 1)))
Rows Removed by Filter: 13600
-> Nested Loop (cost=0.86..90297.65 rows=16884 width=1126) (actual time=0.109..2294.916 rows=8653 loops=1)
Join Filter: (snippets_1.project_id = projects.id)
-> Nested Loop (cost=0.43..86908.90 rows=5015 width=1130) (actual time=0.086..1509.970 rows=8653 loops=1)
-> Seq Scan on snippets snippets_1 (cost=0.00..7082.38 rows=51162 width=1126) (actual time=0.029..33.603 rows=51210 loops=1)
Filter: ((visibility_level = ANY ('{0,10,20}'::integer[])) OR (author_id = 1))
-> Index Scan using index_project_features_on_project_id on project_features (cost=0.43..1.55 rows=1 width=4) (actual time=0
.028..0.029 rows=0 loops=51210)
Index Cond: (project_id = snippets_1.project_id)
Filter: (snippets_access_level = ANY ('{20,30}'::integer[]))
Rows Removed by Filter: 0
-> Index Scan using projects_pkey on projects (cost=0.43..0.66 rows=1 width=4) (actual time=0.089..0.090 rows=1 loops=8653)
Index Cond: (id = project_features.project_id)
Filter: (visibility_level = ANY ('{0,10,20}'::integer[]))
-> Nested Loop (cost=14.38..989.59 rows=3 width=1126) (actual time=27.655..79.280 rows=258 loops=1)
-> Nested Loop (cost=13.95..986.22 rows=7 width=1134) (actual time=27.626..78.134 rows=258 loops=1)
-> Nested Loop (cost=13.66..841.75 rows=415 width=8) (actual time=0.294..45.374 rows=353 loops=1)
-> HashAggregate (cost=13.23..17.38 rows=415 width=4) (actual time=0.268..0.511 rows=353 loops=1)
Group Key: project_authorizations.project_id
-> Index Only Scan using index_project_authorizations_on_user_id_project_id_access_level on project_authorization
s (cost=0.43..12.19 rows=415 width=4) (actual time=0.093..0.192 rows=353 loops=1)
Index Cond: (user_id = 1)
Heap Fetches: 2
-> Index Only Scan using projects_pkey on projects projects_1 (cost=0.43..1.98 rows=1 width=4) (actual time=0.125..0.1
26 rows=1 loops=353)
Index Cond: (id = project_authorizations.project_id)
Heap Fetches: 246
-> Index Scan using index_snippets_on_project_id on snippets snippets_2 (cost=0.29..0.33 rows=2 width=1126) (actual time=0.0
89..0.092 rows=1 loops=353)
Index Cond: (project_id = projects_1.id)
-> Index Scan using index_project_features_on_project_id on project_features project_features_1 (cost=0.43..0.47 rows=1 width=4) (
actual time=0.003..0.004 rows=1 loops=258)
Index Cond: (project_id = projects_1.id)
Filter: (snippets_access_level = ANY ('{20,30,10}'::integer[]))
Planning time: 13.568 ms
Execution time: 4571.079 ms
After
gitlabhq_production=> EXPLAIN ANALYZE
gitlabhq_production-> SELECT "snippets".*
gitlabhq_production-> FROM "snippets"
gitlabhq_production-> WHERE snippets.visibility_level IN (0,
gitlabhq_production(> 10,
gitlabhq_production(> 20)
gitlabhq_production-> AND "snippets"."project_id" IS NULL
gitlabhq_production-> ORDER BY created_at DESC
gitlabhq_production-> LIMIT 10
gitlabhq_production-> OFFSET 0;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
Limit (cost=7768.30..7768.33 rows=10 width=1123) (actual time=258.436..258.442 rows=10 loops=1)
-> Sort (cost=7768.30..7862.36 rows=37622 width=1123) (actual time=258.434..258.435 rows=10 loops=1)
Sort Key: created_at DESC
Sort Method: top-N heapsort Memory: 30kB
-> Seq Scan on snippets (cost=0.00..6955.30 rows=37622 width=1123) (actual time=1.254..251.282 rows=37719 loops=1)
Filter: ((project_id IS NULL) AND (visibility_level = ANY ('{0,10,20}'::integer[])))
Rows Removed by Filter: 13600
Planning time: 3.704 ms
Execution time: 258.512 ms
(9 rows)
Plans for the index changes, taken from #database-lab:
Before
explain SELECT "snippets".* FROM "snippets" WHERE snippets.visibility_level IN (0, 10, 20) AND "snippets"."project_id" IS NULL ORDER BY created_at DESC LIMIT 10 OFFSET 0;
Limit (cost=72914.70..72914.72 rows=10 width=1083) (actual time=1256.131..1256.141 rows=10 loops=1)
Buffers: shared dirtied=1409 hit=64120 read=29583
-> Sort (cost=72914.70..73276.58 rows=144754 width=1083) (actual time=1256.128..1256.135 rows=10 loops=1)
Sort Key: snippets.created_at DESC
Sort Method: top-N heapsort Memory: 39kB
Buffers: shared dirtied=1409 hit=64120 read=29583
-> Index Scan using index_snippets_on_project_id on public.snippets (cost=0.42..69786.62 rows=144754 width=1083) (actual time=0.458..1201.158 rows=144755 loops=1)
Index Cond: (snippets.project_id IS NULL)
Filter: (snippets.visibility_level = ANY ('{0,10,20}'::integer[]))
Rows Removed by Filter: 0
Buffers: shared dirtied=1409 hit=64117 read=29583
Cost: 73276.58
Time: 1.259 s
- planning: 2.848 ms
- execution: 1.256 s
- I/O read: 968.999 ms
Shared buffers:
- hits: 64120 (~500.90 MiB) from the buffer pool
- reads: 29583 (~231.10 MiB) from the OS file cache, including disk I/O
- dirtied: 1409 (~11.00 MiB)
- writes: 0
After (filtering on both project_id
and visibility_level
)
explain SELECT "snippets".* FROM "snippets" WHERE snippets.visibility_level IN (0, 10, 20) AND "snippets"."project_id" IS NULL ORDER BY created_at DESC LIMIT 10 OFFSET 0;
Limit (cost=68674.70..68674.72 rows=10 width=1083) (actual time=179.753..179.762 rows=10 loops=1)
Buffers: shared hit=56452 read=558
-> Sort (cost=68674.70..69036.58 rows=144755 width=1083) (actual time=179.750..179.756 rows=10 loops=1)
Sort Key: snippets.created_at DESC
Sort Method: top-N heapsort Memory: 39kB
Buffers: shared hit=56452 read=558
-> Index Scan using index_snippets_on_project_id_and_visibility_level on public.snippets (cost=0.42..65546.59 rows=144755 width=1083) (actual time=0.167..123.456 rows=144755 loops=1)
Index Cond: ((snippets.project_id IS NULL) AND (snippets.visibility_level = ANY ('{0,10,20}'::integer[])))
Buffers: shared hit=56449 read=558
Cost: 69036.58
Time: 186.205 ms
- planning: 6.386 ms
- execution: 179.819 ms
- I/O read: 9.709 ms
Shared buffers:
- hits: 56452 (~441.00 MiB) from the buffer pool
- reads: 558 (~4.40 MiB) from the OS file cache, including disk I/O
- dirtied: 0
- writes: 0
After (filtering only on project_id
)
explain SELECT "snippets".* FROM "snippets" WHERE "snippets"."project_id" IS NULL ORDER BY created_at DESC LIMIT 10 OFFSET 0;
Limit (cost=68573.96..68573.98 rows=10 width=1083) (actual time=160.831..160.840 rows=10 loops=1)
Buffers: shared hit=57004
-> Sort (cost=68573.96..68935.85 rows=144755 width=1083) (actual time=160.829..160.834 rows=10 loops=1)
Sort Key: snippets.created_at DESC
Sort Method: top-N heapsort Memory: 39kB
Buffers: shared hit=57004
-> Index Scan using index_snippets_on_project_id_and_visibility_level on public.snippets (cost=0.42..65445.86 rows=144755 width=1083) (actual time=0.039..103.798 rows=144755 loops=1)
Index Cond: (snippets.project_id IS NULL)
Buffers: shared hit=57001
Cost: 68935.85
Time: 163.925 ms
- planning: 3.042 ms
- execution: 160.883 ms
Shared buffers:
- hits: 57004 (~445.30 MiB) from the buffer pool
- reads: 0 from the OS file cache, including disk I/O
- dirtied: 0
- writes: 0
Does this MR meet the acceptance criteria?
Conformity
-
Changelog entry - [-] Documentation created/updated or follow-up review issue created
-
Code review guidelines -
Merge request performance guidelines -
Style guides -
Database guides - [-] Separation of EE specific content
Performance and Testing
Edited by 🤖 GitLab Bot 🤖