Geo: Improve performance of LFS objects queries
What does this MR do?
Removes a CTE which was written to improve performance of cross-database queries using FDW. Since we rearchitected the backend to no longer perform those particular cross-database queries, we should remove this CTE to:
- give PostgreSQL the opportunity to optimize the whole query
- improve readability and maintainability
Resolves #250230 (closed)
Queries
Query timing changes in database-lab, each query run twice:
- Selective sync by namespace, local files only: 14s, 14s => 15s, 0.2s
- Selective sync by shard, local files only: 21s, 20s => 0.1s, 0.1s
- Selective sync by namespace: 96s, 19s => 26s, 0.2s
- Selective sync by shard: 55s, 23s => 4s, 0.1s
Before - Queries and explains
Before - Selective sync by namespace, local files only
EXPLAIN WITH "restricted_lfs_objects" AS (
SELECT
DISTINCT "lfs_objects_projects"."lfs_object_id"
FROM
"lfs_objects_projects"
WHERE
"lfs_objects_projects"."project_id" IN (
SELECT
"projects"."id"
FROM
"projects"
WHERE
"projects"."namespace_id" IN (
WITH RECURSIVE "base_and_descendants" AS (
(
SELECT
9970 AS id
)
UNION
(
SELECT
"namespaces"."id"
FROM
"namespaces",
"base_and_descendants"
WHERE
"namespaces"."parent_id" = "base_and_descendants"."id"
)
)
SELECT
"id"
FROM
"base_and_descendants" AS "namespaces"
)
)
)
SELECT
"lfs_objects"."id"
FROM
"restricted_lfs_objects"
INNER JOIN "lfs_objects" ON "restricted_lfs_objects"."lfs_object_id" = "lfs_objects"."id"
WHERE
"lfs_objects"."id" BETWEEN 1
AND 1000
AND "lfs_objects"."file_store" = 1;
Hash Join (cost=2905395.04..2928003.77 rows=1 width=4) (actual time=14058.037..14058.037 rows=0 loops=1)
Hash Cond: (restricted_lfs_objects.lfs_object_id = lfs_objects.id)
Buffers: shared hit=1687381 read=10
I/O Timings: read=1.477
CTE restricted_lfs_objects
-> HashAggregate (cost=2895388.18..2905380.99 rows=999281 width=4) (actual time=14057.119..14057.119 rows=1 loops=1)
Group Key: lfs_objects_projects.lfs_object_id
Buffers: shared hit=1687183 read=10
I/O Timings: read=1.477
-> Merge Semi Join (cost=1105422.69..2752844.83 rows=57017340 width=4) (actual time=7.703..14048.269 rows=15710 loops=1)
Merge Cond: (lfs_objects_projects.project_id = projects.id)
Buffers: shared hit=1687183 read=10
I/O Timings: read=1.477
-> Index Only Scan using index_lfs_objects_projects_on_project_id_and_lfs_object_id on public.lfs_objects_projects (cost=0.56..1180621.80 rows=57017340 width=8) (actual time=0.044..8848.007 rows=57097896 loops=1)
Heap Fetches: 914856
Buffers: shared hit=1684705
-> Sort (cost=1105411.92..1125012.38 rows=7840183 width=4) (actual time=6.613..7.691 rows=1109 loops=1)
Sort Key: projects.id
Sort Method: quicksort Memory: 100kB
Buffers: shared hit=2478 read=10
I/O Timings: read=1.477
-> Nested Loop (cost=288.55..889.20 rows=7840183 width=4) (actual time=1.719..6.106 rows=1109 loops=1)
Buffers: shared hit=2478 read=10
I/O Timings: read=1.477
-> HashAggregate (cost=288.11..290.11 rows=200 width=4) (actual time=1.698..1.780 rows=187 loops=1)
Group Key: namespaces_1.id
Buffers: shared hit=884
-> CTE Scan on base_and_descendants namespaces_1 (cost=261.43..277.85 rows=821 width=4) (actual time=0.009..1.583 rows=187 loops=1)
Buffers: shared hit=884
CTE base_and_descendants
-> Recursive Union (cost=0.00..261.43 rows=821 width=4) (actual time=0.007..1.478 rows=187 loops=1)
Buffers: shared hit=884
-> Result (cost=0.00..0.01 rows=1 width=4) (actual time=0.002..0.003 rows=1 loops=1)
-> Nested Loop (cost=0.56..24.50 rows=82 width=4) (actual time=0.041..0.265 rows=37 loops=5)
Buffers: shared hit=884
-> WorkTable Scan on base_and_descendants (cost=0.00..0.20 rows=10 width=4) (actual time=0.000..0.007 rows=37 loops=5)
-> Index Only Scan using index_namespaces_on_parent_id_and_id on public.namespaces (cost=0.56..2.35 rows=8 width=8) (actual time=0.005..0.006 rows=1 loops=187)
Index Cond: (namespaces.parent_id = base_and_descendants.id)
Heap Fetches: 4
Buffers: shared hit=884
-> Index Only Scan using index_projects_on_namespace_id_and_id on public.projects (cost=0.43..2.81 rows=19 width=8) (actual time=0.010..0.020 rows=6 loops=187)
Index Cond: (projects.namespace_id = namespaces_1.id)
Heap Fetches: 64
Buffers: shared hit=1594 read=10
I/O Timings: read=1.477
-> CTE Scan on restricted_lfs_objects (cost=0.00..19985.62 rows=999281 width=4) (actual time=14057.126..14057.126 rows=1 loops=1)
Buffers: shared hit=1687183 read=10
I/O Timings: read=1.477
-> Hash (cost=14.04..14.04 rows=1 width=4) (actual time=0.887..0.887 rows=0 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 8kB
Buffers: shared hit=198
-> Index Scan using index_lfs_objects_on_file_store on public.lfs_objects (cost=0.56..14.04 rows=1 width=4) (actual time=0.886..0.886 rows=0 loops=1)
Index Cond: (lfs_objects.file_store = 1)
Filter: ((lfs_objects.id >= 1) AND (lfs_objects.id <= 1000))
Rows Removed by Filter: 34
Buffers: shared hit=198
Before - Selective sync by shard, local files only
EXPLAIN WITH "restricted_lfs_objects" AS (
SELECT
DISTINCT "lfs_objects_projects"."lfs_object_id"
FROM
"lfs_objects_projects"
WHERE
"lfs_objects_projects"."project_id" IN (
SELECT
"projects"."id"
FROM
"projects"
WHERE
"projects"."repository_storage" = 'nfs-file22'
)
)
SELECT
"lfs_objects"."id"
FROM
"restricted_lfs_objects"
INNER JOIN "lfs_objects" ON "restricted_lfs_objects"."lfs_object_id" = "lfs_objects"."id"
WHERE
"lfs_objects"."id" BETWEEN 1
AND 1000
AND "lfs_objects"."file_store" = 1;
Hash Join (cost=1633761.09..1656369.82 rows=1 width=4) (actual time=20379.508..20379.508 rows=0 loops=1)
Hash Cond: (restricted_lfs_objects.lfs_object_id = lfs_objects.id)
Buffers: shared hit=1860783 read=227212
I/O Timings: read=4252.929
CTE restricted_lfs_objects
-> HashAggregate (cost=1623754.23..1633747.04 rows=999281 width=4) (actual time=20373.287..20373.287 rows=1 loops=1)
Group Key: lfs_objects_projects.lfs_object_id
Buffers: shared hit=1860783 read=227014
I/O Timings: read=4247.364
-> Merge Join (cost=74.95..1620596.42 rows=1263123 width=4) (actual time=4702.154..19596.663 rows=1789938 loops=1)
Merge Cond: (lfs_objects_projects.project_id = projects.id)
Buffers: shared hit=1860783 read=227014
I/O Timings: read=4247.364
-> Index Only Scan using index_lfs_objects_projects_on_project_id_and_lfs_object_id on public.lfs_objects_projects (cost=0.56..1180621.80 rows=57017340 width=8) (actual time=0.033..11127.301 rows=57103277 loops=1)
Heap Fetches: 920237
Buffers: shared hit=1571402 read=118904
I/O Timings: read=2157.667
-> Index Only Scan using idx_projects_on_repository_storage_last_repository_updated_at on public.projects (cost=0.56..284075.98 rows=347372 width=4) (actual time=754.540..3046.179 rows=345189 loops=1)
Index Cond: (projects.repository_storage = 'nfs-file22'::text)
Heap Fetches: 7709
Buffers: shared hit=289381 read=108110
I/O Timings: read=2089.697
-> CTE Scan on restricted_lfs_objects (cost=0.00..19985.62 rows=999281 width=4) (actual time=20373.292..20373.292 rows=1 loops=1)
Buffers: shared hit=1860783 read=227014
I/O Timings: read=4247.364
-> Hash (cost=14.04..14.04 rows=1 width=4) (actual time=6.170..6.170 rows=0 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 8kB
Buffers: shared read=198
I/O Timings: read=5.564
-> Index Scan using index_lfs_objects_on_file_store on public.lfs_objects (cost=0.56..14.04 rows=1 width=4) (actual time=6.169..6.169 rows=0 loops=1)
Index Cond: (lfs_objects.file_store = 1)
Filter: ((lfs_objects.id >= 1) AND (lfs_objects.id <= 1000))
Rows Removed by Filter: 34
Buffers: shared read=198
I/O Timings: read=5.564
Before - Selective sync by namespace
EXPLAIN WITH "restricted_lfs_objects" AS (
SELECT
DISTINCT "lfs_objects_projects"."lfs_object_id"
FROM
"lfs_objects_projects"
WHERE
"lfs_objects_projects"."project_id" IN (
SELECT
"projects"."id"
FROM
"projects"
WHERE
"projects"."namespace_id" IN (
WITH RECURSIVE "base_and_descendants" AS (
(
SELECT
9970 AS id
)
UNION
(
SELECT
"namespaces"."id"
FROM
"namespaces",
"base_and_descendants"
WHERE
"namespaces"."parent_id" = "base_and_descendants"."id"
)
)
SELECT
"id"
FROM
"base_and_descendants" AS "namespaces"
)
)
)
SELECT
"lfs_objects"."id"
FROM
"restricted_lfs_objects"
INNER JOIN "lfs_objects" ON "restricted_lfs_objects"."lfs_object_id" = "lfs_objects"."id"
WHERE
"lfs_objects"."id" BETWEEN 1
AND 1000;
Plan with execution:
Hash Join (cost=2791344.61..2813029.47 rows=26 width=4) (actual time=16697.265..16720.292 rows=5 loops=1)
Hash Cond: (restricted_lfs_objects.lfs_object_id = lfs_objects.id)
Buffers: shared hit=1406691 read=276
I/O Timings: read=550.375
CTE restricted_lfs_objects
-> HashAggregate (cost=2781726.41..2791310.88 rows=958447 width=4) (actual time=16695.819..16711.131 rows=13690 loops=1)
Group Key: lfs_objects_projects.lfs_object_id
Buffers: shared hit=1406584 read=276
I/O Timings: read=550.375
-> Merge Semi Join (cost=1074273.85..2645610.45 rows=54446384 width=4) (actual time=561.883..16684.030 rows=17691 loops=1)
Merge Cond: (lfs_objects_projects.project_id = projects.id)
Buffers: shared hit=1406584 read=276
I/O Timings: read=550.375
-> Index Only Scan using index_lfs_objects_projects_on_project_id_and_lfs_object_id on public.lfs_objects_projects (cost=0.56..1124878.57 rows=54446384 width=8) (actual time=0.039..10162.743 rows=54457828 loops=1)
Heap Fetches: 635301
Buffers: shared hit=1404508
-> Sort (cost=1074262.66..1093332.24 rows=7627832 width=4) (actual time=561.179..562.672 rows=1073 loops=1)
Sort Key: projects.id
Sort Method: quicksort Memory: 99kB
Buffers: shared hit=2076 read=276
I/O Timings: read=550.375
-> Nested Loop (cost=364.62..1163.33 rows=7627832 width=4) (actual time=8.011..559.672 rows=1073 loops=1)
Buffers: shared hit=2076 read=276
I/O Timings: read=550.375
-> HashAggregate (cost=364.19..366.19 rows=200 width=4) (actual time=1.428..1.700 rows=178 loops=1)
Group Key: namespaces_1.id
Buffers: shared hit=678
-> CTE Scan on base_and_descendants namespaces_1 (cost=338.16..354.18 rows=801 width=4) (actual time=0.008..1.356 rows=178 loops=1)
Buffers: shared hit=678
CTE base_and_descendants
-> Recursive Union (cost=0.00..338.16 rows=801 width=4) (actual time=0.006..1.291 rows=178 loops=1)
Buffers: shared hit=678
-> Result (cost=0.00..0.01 rows=1 width=4) (actual time=0.002..0.003 rows=1 loops=1)
-> Nested Loop (cost=0.43..32.21 rows=80 width=4) (actual time=0.050..0.236 rows=35 loops=5)
Buffers: shared hit=678
-> WorkTable Scan on base_and_descendants (cost=0.00..0.20 rows=10 width=4) (actual time=0.000..0.005 rows=36 loops=5)
-> Index Only Scan using index_namespaces_on_parent_id_and_id on public.namespaces (cost=0.43..3.12 rows=8 width=8) (actual time=0.005..0.006 rows=1 loops=178)
Index Cond: (namespaces.parent_id = base_and_descendants.id)
Heap Fetches: 16
Buffers: shared hit=678
-> Index Only Scan using index_projects_on_namespace_id_and_id on public.projects (cost=0.43..3.81 rows=18 width=8) (actual time=1.818..3.130 rows=6 loops=178)
Index Cond: (projects.namespace_id = namespaces_1.id)
Heap Fetches: 184
Buffers: shared hit=1398 read=276
I/O Timings: read=550.375
-> CTE Scan on restricted_lfs_objects (cost=0.00..19168.94 rows=958447 width=4) (actual time=16695.823..16716.998 rows=13690 loops=1)
Buffers: shared hit=1406584 read=276
I/O Timings: read=550.375
-> Hash (cost=23.28..23.28 rows=836 width=4) (actual time=0.567..0.568 rows=710 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 33kB
Buffers: shared hit=107
-> Index Only Scan using lfs_objects_pkey on public.lfs_objects (cost=0.56..23.28 rows=836 width=4) (actual time=0.044..0.363 rows=710 loops=1)
Index Cond: ((lfs_objects.id >= 1) AND (lfs_objects.id <= 1000))
Heap Fetches: 0
Buffers: shared hit=107
Recommendations:
:exclamation: Query processes too much data to return a relatively small number of rows. – Reduce data cardinality as early as possible during the execution, using one or several of the following techniques: new indexes, partitioning, query rewriting, denormalization. See the visualization of the plan to understand which plan nodes are the main bottlenecks. Show details
:exclamation: VACUUM ANALYZE needed – Visibility map(s) for the table(s) involved in the query are outdated. For better performance: 1) run VACUUM ANALYZE on them as a one-time solution, 2) perform autovacuum tuning as a long-term permanent solution (tuning "vacuum" part of autovacuum). Show details
Summary:
Time: 16.762 s
- planning: 1.455 ms
- execution: 16.761 s
- I/O read: 550.375 ms
- I/O write: 0.000 ms
Shared buffers:
- hits: 1406691 (~10.70 GiB) from the buffer pool
- reads: 276 (~2.20 MiB) from the OS file cache, including disk I/O
- dirtied: 0
- writes: 0
Before - Selective sync by shard
EXPLAIN WITH "restricted_lfs_objects" AS (
SELECT
DISTINCT "lfs_objects_projects"."lfs_object_id"
FROM
"lfs_objects_projects"
WHERE
"lfs_objects_projects"."project_id" IN (
SELECT
"projects"."id"
FROM
"projects"
WHERE
"projects"."repository_storage" = 'nfs-file22'
)
)
SELECT
"lfs_objects"."id"
FROM
"restricted_lfs_objects"
INNER JOIN "lfs_objects" ON "restricted_lfs_objects"."lfs_object_id" = "lfs_objects"."id"
WHERE
"lfs_objects"."id" BETWEEN 1
AND 1000;
Plan with execution:
Hash Join (cost=1548745.41..1570430.28 rows=26 width=4) (actual time=57524.294..58149.596 rows=20 loops=1)
Hash Cond: (restricted_lfs_objects.lfs_object_id = lfs_objects.id)
Buffers: shared hit=1597400 read=181302
I/O Timings: read=38299.436
CTE restricted_lfs_objects
-> HashAggregate (cost=1539127.21..1548711.68 rows=958447 width=4) (actual time=57482.052..57763.791 rows=653529 loops=1)
Group Key: lfs_objects_projects.lfs_object_id
Buffers: shared hit=1597294 read=181301
I/O Timings: read=38295.505
-> Merge Join (cost=74.50..1536058.25 rows=1227583 width=4) (actual time=11368.960..56361.451 rows=1740674 loops=1)
Merge Cond: (lfs_objects_projects.project_id = projects.id)
Buffers: shared hit=1597294 read=181301
I/O Timings: read=38295.505
-> Index Only Scan using index_lfs_objects_projects_on_project_id_and_lfs_object_id on public.lfs_objects_projects (cost=0.56..1124878.57 rows=54446384 width=8) (actual time=0.032..18411.817 rows=54436654 loops=1)
Heap Fetches: 614127
Buffers: shared hit=1299198 read=86417
I/O Timings: read=8168.729
-> Index Only Scan using idx_projects_on_repository_storage_last_repository_updated_at on public.projects (cost=0.56..262051.04 rows=343964 width=4) (actual time=6780.895..31710.994 rows=345688 loops=1)
Index Cond: (projects.repository_storage = 'nfs-file22'::text)
Heap Fetches: 25826
Buffers: shared hit=298096 read=94884
I/O Timings: read=30126.775
-> CTE Scan on restricted_lfs_objects (cost=0.00..19168.94 rows=958447 width=4) (actual time=57482.061..58022.862 rows=653529 loops=1)
Buffers: shared hit=1597294 read=181301
I/O Timings: read=38295.505
-> Hash (cost=23.28..23.28 rows=836 width=4) (actual time=4.440..4.440 rows=710 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 33kB
Buffers: shared hit=106 read=1
I/O Timings: read=3.932
-> Index Only Scan using lfs_objects_pkey on public.lfs_objects (cost=0.56..23.28 rows=836 width=4) (actual time=0.072..4.227 rows=710 loops=1)
Index Cond: ((lfs_objects.id >= 1) AND (lfs_objects.id <= 1000))
Heap Fetches: 0
Buffers: shared hit=106 read=1
I/O Timings: read=3.932
Recommendations:
:exclamation: Query processes too much data to return a relatively small number of rows. – Reduce data cardinality as early as possible during the execution, using one or several of the following techniques: new indexes, partitioning, query rewriting, denormalization. See the visualization of the plan to understand which plan nodes are the main bottlenecks. Show details
:exclamation: VACUUM ANALYZE needed – Visibility map(s) for the table(s) involved in the query are outdated. For better performance: 1) run VACUUM ANALYZE on them as a one-time solution, 2) perform autovacuum tuning as a long-term permanent solution (tuning "vacuum" part of autovacuum). Show details
Summary:
Time: 58.205 s
- planning: 1.165 ms
- execution: 58.204 s
- I/O read: 38.299 s
- I/O write: 0.000 ms
Shared buffers:
- hits: 1597400 (~12.20 GiB) from the buffer pool
- reads: 181302 (~1.40 GiB) from the OS file cache, including disk I/O
- dirtied: 0
- writes: 0
After - Queries and explains
# ee/app/models/geo_node.rb
def lfs_objects(primary_key_in:)
return LfsObject.primary_key_in(primary_key_in) unless selective_sync?
ids = LfsObjectsProject.project_id_in(projects)
.where(lfs_object_id: primary_key_in)
.select(:lfs_object_id)
.distinct
LfsObject.where(id: ids)
end
After - Query - selective sync by namespace, local files only
EXPLAIN
SELECT
"lfs_objects"."id"
FROM
"lfs_objects"
WHERE
"lfs_objects"."id" IN (
SELECT
DISTINCT "lfs_objects_projects"."lfs_object_id"
FROM
"lfs_objects_projects"
WHERE
"lfs_objects_projects"."project_id" IN (
SELECT
"projects"."id"
FROM
"projects"
WHERE
"projects"."namespace_id" IN (
WITH RECURSIVE "base_and_descendants" AS (
(
SELECT
9970 AS id
)
UNION
(
SELECT
"namespaces"."id"
FROM
"namespaces",
"base_and_descendants"
WHERE
"namespaces"."parent_id" = "base_and_descendants"."id"
)
)
SELECT
"id"
FROM
"base_and_descendants" AS "namespaces"
)
)
AND "lfs_objects_projects"."lfs_object_id" BETWEEN 1
AND 1000
)
AND "lfs_objects"."file_store" = 1
Plan with execution:
Hash Join (cost=355339.14..355510.25 rows=1 width=4) (actual time=15003.294..15003.294 rows=0 loops=1)
Hash Cond: (lfs_objects_projects.lfs_object_id = lfs_objects.id)
Buffers: shared hit=52506 read=15951 dirtied=14473
I/O Timings: read=14258.543
-> Unique (cost=355324.28..355372.99 rows=9695 width=4) (actual time=238.967..238.972 rows=5 loops=1)
Buffers: shared hit=26175
-> Sort (cost=355324.28..355348.64 rows=9742 width=4) (actual time=238.965..238.966 rows=5 loops=1)
Sort Key: lfs_objects_projects.lfs_object_id
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=26175
-> Hash Semi Join (cost=221396.05..354678.87 rows=9742 width=4) (actual time=95.753..238.914 rows=5 loops=1)
Hash Cond: (lfs_objects_projects.project_id = projects.id)
Buffers: shared hit=26175
-> Index Scan using index_lfs_objects_projects_on_lfs_object_id on public.lfs_objects_projects (cost=0.56..10248.20 rows=9742 width=8) (actual time=0.048..34.035 rows=24256 loops=1)
Index Cond: ((lfs_objects_projects.lfs_object_id >= 1) AND (lfs_objects_projects.lfs_object_id <= 1000))
Buffers: shared hit=23687
-> Hash (cost=889.20..889.20 rows=7840183 width=4) (actual time=4.784..4.785 rows=1109 loops=1)
Buckets: 4194304 Batches: 8 Memory Usage: 32825kB
Buffers: shared hit=2488
-> Nested Loop (cost=288.55..889.20 rows=7840183 width=4) (actual time=1.510..3.237 rows=1109 loops=1)
Buffers: shared hit=2488
-> HashAggregate (cost=288.11..290.11 rows=200 width=4) (actual time=1.483..1.525 rows=187 loops=1)
Group Key: namespaces.id
Buffers: shared hit=884
-> CTE Scan on base_and_descendants namespaces (cost=261.43..277.85 rows=821 width=4) (actual time=0.010..1.369 rows=187 loops=1)
Buffers: shared hit=884
CTE base_and_descendants
-> Recursive Union (cost=0.00..261.43 rows=821 width=4) (actual time=0.006..1.294 rows=187 loops=1)
Buffers: shared hit=884
-> Result (cost=0.00..0.01 rows=1 width=4) (actual time=0.002..0.003 rows=1 loops=1)
-> Nested Loop (cost=0.56..24.50 rows=82 width=4) (actual time=0.033..0.234 rows=37 loops=5)
Buffers: shared hit=884
-> WorkTable Scan on base_and_descendants (cost=0.00..0.20 rows=10 width=4) (actual time=0.000..0.005 rows=37 loops=5)
-> Index Only Scan using index_namespaces_on_parent_id_and_id on public.namespaces namespaces_1 (cost=0.56..2.35 rows=8 width=8) (actual time=0.005..0.006 rows=1 loops=187)
Index Cond: (namespaces_1.parent_id = base_and_descendants.id)
Heap Fetches: 4
Buffers: shared hit=884
-> Index Only Scan using index_projects_on_namespace_id_and_id on public.projects (cost=0.43..2.81 rows=19 width=8) (actual time=0.005..0.008 rows=6 loops=187)
Index Cond: (projects.namespace_id = namespaces.id)
Heap Fetches: 64
Buffers: shared hit=1604
-> Hash (cost=13.49..13.49 rows=110 width=4) (actual time=14764.299..14764.299 rows=34 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 10kB
Buffers: shared hit=26331 read=15951 dirtied=14473
I/O Timings: read=14258.543
-> Index Scan using index_lfs_objects_on_file_store on public.lfs_objects (cost=0.56..13.49 rows=110 width=4) (actual time=14691.698..14764.245 rows=34 loops=1)
Index Cond: (lfs_objects.file_store = 1)
Buffers: shared hit=26331 read=15951 dirtied=14473
I/O Timings: read=14258.543
Recommendations:
:exclamation: Query processes too much data to return a relatively small number of rows. – Reduce data cardinality as early as possible during the execution, using one or several of the following techniques: new indexes, partitioning, query rewriting, denormalization. See the visualization of the plan to understand which plan nodes are the main bottlenecks. Show details
:exclamation: Temporary buffers written – Raise work_mem (currently, the recipe is: exec alter system set work_mem to '100MB', then exec select pg_reload_conf()) Show details
:exclamation: VACUUM ANALYZE needed – Visibility map(s) for the table(s) involved in the query are outdated. For better performance: 1) run VACUUM ANALYZE on them as a one-time solution, 2) perform autovacuum tuning as a long-term permanent solution (tuning "vacuum" part of autovacuum). Show details
Summary:
Time: 15.007 s
- planning: 1.576 ms
- execution: 15.006 s
- I/O read: 14.259 s
- I/O write: 0.000 ms
Shared buffers:
- hits: 52506 (~410.20 MiB) from the buffer pool
- reads: 15951 (~124.60 MiB) from the OS file cache, including disk I/O
- dirtied: 14473 (~113.10 MiB)
- writes: 0
Temp buffers:
- reads: 80 (~640.00 KiB)
- writes: 80 (~640.00 KiB)
After - Query - selective sync by shard, local files only
EXPLAIN
SELECT
"lfs_objects"."id"
FROM
"lfs_objects"
WHERE
"lfs_objects"."id" IN (
SELECT
DISTINCT "lfs_objects_projects"."lfs_object_id"
FROM
"lfs_objects_projects"
WHERE
"lfs_objects_projects"."project_id" IN (
SELECT
"projects"."id"
FROM
"projects"
WHERE
"projects"."repository_storage" = 'nfs-file22'
)
AND "lfs_objects_projects"."lfs_object_id" BETWEEN 1
AND 1000
)
AND "lfs_objects"."file_store" = 1
Plan with execution:
Hash Join (cost=15.99..30162.19 rows=1 width=4) (actual time=118.846..118.846 rows=0 loops=1)
Hash Cond: (lfs_objects_projects.lfs_object_id = lfs_objects.id)
Buffers: shared hit=121546
-> Unique (cost=1.12..30144.60 rows=216 width=4) (actual time=1.674..118.179 rows=20 loops=1)
Buffers: shared hit=121348
-> Nested Loop (cost=1.12..30144.06 rows=216 width=4) (actual time=1.673..118.046 rows=511 loops=1)
Buffers: shared hit=121348
-> Index Scan using index_lfs_objects_projects_on_lfs_object_id on public.lfs_objects_projects (cost=0.56..10248.20 rows=9742 width=8) (actual time=0.027..26.578 rows=24256 loops=1)
Index Cond: ((lfs_objects_projects.lfs_object_id >= 1) AND (lfs_objects_projects.lfs_object_id <= 1000))
Buffers: shared hit=23687
-> Index Only Scan using idx_projects_on_repository_storage_last_repository_updated_at on public.projects (cost=0.56..2.04 rows=1 width=4) (actual time=0.003..0.003 rows=0 loops=24256)
Index Cond: ((projects.id = lfs_objects_projects.project_id) AND (projects.repository_storage = 'nfs-file22'::text))
Heap Fetches: 14
Buffers: shared hit=97661
-> Hash (cost=13.49..13.49 rows=110 width=4) (actual time=0.624..0.624 rows=34 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 10kB
Buffers: shared hit=198
-> Index Scan using index_lfs_objects_on_file_store on public.lfs_objects (cost=0.56..13.49 rows=110 width=4) (actual time=0.548..0.613 rows=34 loops=1)
Index Cond: (lfs_objects.file_store = 1)
Buffers: shared hit=198
Recommendations:
:exclamation: Query processes too much data to return a relatively small number of rows. – Reduce data cardinality as early as possible during the execution, using one or several of the following techniques: new indexes, partitioning, query rewriting, denormalization. See the visualization of the plan to understand which plan nodes are the main bottlenecks. Show details
:exclamation: VACUUM ANALYZE needed – Visibility map(s) for the table(s) involved in the query are outdated. For better performance: 1) run VACUUM ANALYZE on them as a one-time solution, 2) perform autovacuum tuning as a long-term permanent solution (tuning "vacuum" part of autovacuum). Show details
Summary:
Time: 119.992 ms
- planning: 1.088 ms
- execution: 118.904 ms
- I/O read: 0.000 ms
- I/O write: 0.000 ms
Shared buffers:
- hits: 121546 (~949.60 MiB) from the buffer pool
- reads: 0 from the OS file cache, including disk I/O
- dirtied: 0
- writes: 0
After - Query - selective sync by namespace
EXPLAIN
SELECT
"lfs_objects"."id"
FROM
"lfs_objects"
WHERE
"lfs_objects"."id" IN (
SELECT
DISTINCT "lfs_objects_projects"."lfs_object_id"
FROM
"lfs_objects_projects"
WHERE
"lfs_objects_projects"."project_id" IN (
SELECT
"projects"."id"
FROM
"projects"
WHERE
"projects"."namespace_id" IN (
WITH RECURSIVE "base_and_descendants" AS (
(
SELECT
9970 AS id
)
UNION
(
SELECT
"namespaces"."id"
FROM
"namespaces",
"base_and_descendants"
WHERE
"namespaces"."parent_id" = "base_and_descendants"."id"
)
)
SELECT
"id"
FROM
"base_and_descendants" AS "namespaces"
)
)
AND "lfs_objects_projects"."lfs_object_id" BETWEEN 1
AND 1000
)
Plan with execution:
Nested Loop (cost=355324.84..374947.04 rows=9695 width=4) (actual time=25534.791..25534.875 rows=5 loops=1)
Buffers: shared hit=7128 read=19087 dirtied=89
I/O Timings: read=25065.392
-> Unique (cost=355324.28..355372.99 rows=9695 width=4) (actual time=25529.762..25529.770 rows=5 loops=1)
Buffers: shared hit=7112 read=19081 dirtied=89
I/O Timings: read=25060.439
-> Sort (cost=355324.28..355348.64 rows=9742 width=4) (actual time=25529.760..25529.762 rows=5 loops=1)
Sort Key: lfs_objects_projects.lfs_object_id
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=7112 read=19081 dirtied=89
I/O Timings: read=25060.439
-> Hash Semi Join (cost=221396.05..354678.87 rows=9742 width=4) (actual time=25383.637..25529.708 rows=5 loops=1)
Hash Cond: (lfs_objects_projects.project_id = projects.id)
Buffers: shared hit=7109 read=19081 dirtied=89
I/O Timings: read=25060.439
-> Index Scan using index_lfs_objects_projects_on_lfs_object_id on public.lfs_objects_projects (cost=0.56..10248.20 rows=9742 width=8) (actual time=4.183..24593.330 rows=24256 loops=1)
Index Cond: ((lfs_objects_projects.lfs_object_id >= 1) AND (lfs_objects_projects.lfs_object_id <= 1000))
Buffers: shared hit=5091 read=18596 dirtied=44
I/O Timings: read=24403.123
-> Hash (cost=889.20..889.20 rows=7840183 width=4) (actual time=675.165..675.165 rows=1109 loops=1)
Buckets: 4194304 Batches: 8 Memory Usage: 32825kB
Buffers: shared hit=2018 read=485 dirtied=45
I/O Timings: read=657.316
-> Nested Loop (cost=288.55..889.20 rows=7840183 width=4) (actual time=210.092..672.920 rows=1109 loops=1)
Buffers: shared hit=2018 read=485 dirtied=45
I/O Timings: read=657.316
-> HashAggregate (cost=288.11..290.11 rows=200 width=4) (actual time=204.402..204.657 rows=187 loops=1)
Group Key: namespaces.id
Buffers: shared hit=725 read=159 dirtied=1
I/O Timings: read=201.274
-> CTE Scan on base_and_descendants namespaces (cost=261.43..277.85 rows=821 width=4) (actual time=0.010..204.213 rows=187 loops=1)
Buffers: shared hit=725 read=159 dirtied=1
I/O Timings: read=201.274
CTE base_and_descendants
-> Recursive Union (cost=0.00..261.43 rows=821 width=4) (actual time=0.006..204.094 rows=187 loops=1)
Buffers: shared hit=725 read=159 dirtied=1
I/O Timings: read=201.274
-> Result (cost=0.00..0.01 rows=1 width=4) (actual time=0.002..0.002 rows=1 loops=1)
-> Nested Loop (cost=0.56..24.50 rows=82 width=4) (actual time=5.501..40.759 rows=37 loops=5)
Buffers: shared hit=725 read=159 dirtied=1
I/O Timings: read=201.274
-> WorkTable Scan on base_and_descendants (cost=0.00..0.20 rows=10 width=4) (actual time=0.001..0.016 rows=37 loops=5)
-> Index Only Scan using index_namespaces_on_parent_id_and_id on public.namespaces namespaces_1 (cost=0.56..2.35 rows=8 width=8) (actual time=1.011..1.087 rows=1 loops=187)
Index Cond: (namespaces_1.parent_id = base_and_descendants.id)
Heap Fetches: 4
Buffers: shared hit=725 read=159 dirtied=1
I/O Timings: read=201.274
-> Index Only Scan using index_projects_on_namespace_id_and_id on public.projects (cost=0.43..2.81 rows=19 width=8) (actual time=1.745..2.500 rows=6 loops=187)
Index Cond: (projects.namespace_id = namespaces.id)
Heap Fetches: 78
Buffers: shared hit=1293 read=326 dirtied=44
I/O Timings: read=456.042
-> Index Only Scan using lfs_objects_pkey on public.lfs_objects (cost=0.56..2.01 rows=1 width=4) (actual time=1.018..1.018 rows=1 loops=5)
Index Cond: (lfs_objects.id = lfs_objects_projects.lfs_object_id)
Heap Fetches: 0
Buffers: shared hit=16 read=6
I/O Timings: read=4.953
Recommendations:
:exclamation: Query processes too much data to return a relatively small number of rows. – Reduce data cardinality as early as possible during the execution, using one or several of the following techniques: new indexes, partitioning, query rewriting, denormalization. See the visualization of the plan to understand which plan nodes are the main bottlenecks. Show details
:exclamation: Temporary buffers written – Raise work_mem (currently, the recipe is: exec alter system set work_mem to '100MB', then exec select pg_reload_conf()) Show details
:exclamation: VACUUM ANALYZE needed – Visibility map(s) for the table(s) involved in the query are outdated. For better performance: 1) run VACUUM ANALYZE on them as a one-time solution, 2) perform autovacuum tuning as a long-term permanent solution (tuning "vacuum" part of autovacuum). Show details
Summary:
Time: 25.539 s
- planning: 1.547 ms
- execution: 25.537 s
- I/O read: 25.065 s
- I/O write: 0.000 ms
Shared buffers:
- hits: 7128 (~55.70 MiB) from the buffer pool
- reads: 19087 (~149.10 MiB) from the OS file cache, including disk I/O
- dirtied: 89 (~712.00 KiB)
- writes: 0
Temp buffers:
- reads: 80 (~640.00 KiB)
- writes: 80 (~640.00 KiB)
After - Query - selective sync by shard
EXPLAIN
SELECT
"lfs_objects"."id"
FROM
"lfs_objects"
WHERE
"lfs_objects"."id" IN (
SELECT
DISTINCT "lfs_objects_projects"."lfs_object_id"
FROM
"lfs_objects_projects"
WHERE
"lfs_objects_projects"."project_id" IN (
SELECT
"projects"."id"
FROM
"projects"
WHERE
"projects"."repository_storage" = 'nfs-file22'
)
AND "lfs_objects_projects"."lfs_object_id" BETWEEN 1
AND 1000
)
Plan with execution:
Nested Loop (cost=1.69..30597.54 rows=216 width=4) (actual time=403.432..3860.354 rows=20 loops=1)
Buffers: shared hit=118202 read=3237
I/O Timings: read=3633.411
-> Unique (cost=1.12..30144.60 rows=216 width=4) (actual time=403.404..3860.063 rows=20 loops=1)
Buffers: shared hit=118111 read=3237
I/O Timings: read=3633.411
-> Nested Loop (cost=1.12..30144.06 rows=216 width=4) (actual time=403.402..3859.797 rows=511 loops=1)
Buffers: shared hit=118111 read=3237
I/O Timings: read=3633.411
-> Index Scan using index_lfs_objects_projects_on_lfs_object_id on public.lfs_objects_projects (cost=0.56..10248.20 rows=9742 width=8) (actual time=0.029..50.186 rows=24256 loops=1)
Index Cond: ((lfs_objects_projects.lfs_object_id >= 1) AND (lfs_objects_projects.lfs_object_id <= 1000))
Buffers: shared hit=23687
-> Index Only Scan using idx_projects_on_repository_storage_last_repository_updated_at on public.projects (cost=0.56..2.04 rows=1 width=4) (actual time=0.156..0.156 rows=0 loops=24256)
Index Cond: ((projects.id = lfs_objects_projects.project_id) AND (projects.repository_storage = 'nfs-file22'::text))
Heap Fetches: 14
Buffers: shared hit=94424 read=3237
I/O Timings: read=3633.411
-> Index Only Scan using lfs_objects_pkey on public.lfs_objects (cost=0.56..2.09 rows=1 width=4) (actual time=0.011..0.011 rows=1 loops=20)
Index Cond: (lfs_objects.id = lfs_objects_projects.lfs_object_id)
Heap Fetches: 0
Buffers: shared hit=91
Recommendations:
:exclamation: Query processes too much data to return a relatively small number of rows. – Reduce data cardinality as early as possible during the execution, using one or several of the following techniques: new indexes, partitioning, query rewriting, denormalization. See the visualization of the plan to understand which plan nodes are the main bottlenecks. Show details
:exclamation: VACUUM ANALYZE needed – Visibility map(s) for the table(s) involved in the query are outdated. For better performance: 1) run VACUUM ANALYZE on them as a one-time solution, 2) perform autovacuum tuning as a long-term permanent solution (tuning "vacuum" part of autovacuum). Show details
Summary:
Time: 3.862 s
- planning: 1.584 ms
- execution: 3.860 s
- I/O read: 3.633 s
- I/O write: 0.000 ms
Shared buffers:
- hits: 118202 (~923.50 MiB) from the buffer pool
- reads: 3237 (~25.30 MiB) from the OS file cache, including disk I/O
- dirtied: 0
- writes: 0
Does this MR meet the acceptance criteria?
Conformity
-
Changelog entry -
Documentation (if required) -
Code review guidelines -
Merge request performance guidelines -
Style guides -
Database guides -
Separation of EE specific content
Availability and Testing
-
Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process.
Edited by Michael Kozono