Re-compute issue relative position on project import
What does this MR do?
This MR is related to #276483 (closed) performance issues we see recently with issues table. Because of high relative position we see a high number of updates happening more and more often aiming to rebalance issues relative position. One cause for that is the issues reaching end of integer rage too fast. This MR fixes one root cause of that.
When exporting project issues we can have very high relative positions that can generate big gaps in relative positions when re-importing the project. To improve that we export project issues sorted by relative position ascendingly and because we know the issues are sorted, on import we can set the new imported issues relative position to current max(relative_position) + issue index * IDEAL_DISTANCE
Context: We need to sort by relative position before inserting so that we can rely on issue positions to move the issues at the end of the position range in an hierarchy but still preserve the relative issues position within the project.
Example:
if for instance we have max relative position in the hierarchy is 100_000
but import issues that have relative positions starting at 1_000_000
we do not want to create this huge gap, but we rather want to start imported issues positions from 100_513, 101_026, ...
as a follow-up when we actually improve the rebalancing code, we may want to check if at the import time we reach a high enough relative position, or maybe even reach the end of the range(-2147483648, 2147483647) we would need to trigger a rebalancing.
Database
Batching ordered by relative position
- with the index: https://console.postgres.ai/gitlab/gitlab-production-tunnel/sessions/3595/commands/11976
Without new index:
- First batch(cold cache and warm cache plans/times)
SELECT "issues".* FROM "issues" WHERE "issues"."project_id" = 278964 ORDER BY relative_position asc NULLS LAST, "issues"."id" ASC LIMIT 100
--- cold cache
Limit (cost=112466.23..112477.90 rows=100 width=1301) (actual time=18896.732..18914.834 rows=100 loops=1)
Buffers: shared hit=11010 read=56845 dirtied=541
I/O Timings: read=54957.719
-> Gather Merge (cost=112466.23..120832.28 rows=71704 width=1301) (actual time=18896.730..18914.816 rows=100 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=11010 read=56845 dirtied=541
I/O Timings: read=54957.719
-> Sort (cost=111466.21..111555.84 rows=35852 width=1301) (actual time=18854.675..18854.696 rows=68 loops=3)
Sort Key: issues.relative_position, issues.id
Sort Method: top-N heapsort Memory: 330kB
Buffers: shared hit=11010 read=56845 dirtied=541
I/O Timings: read=54957.719
-> Parallel Index Scan using index_issues_on_project_id_and_iid on public.issues (cost=0.57..110095.97 rows=35852 width=1301) (actual time=5.028..18758.873 rows=28430 loops=3)
Index Cond: (issues.project_id = 278964)
Buffers: shared hit=10992 read=56845 dirtied=541
I/O Timings: read=54957.719
Time: 18.920 s
- planning: 5.562 ms
- execution: 18.915 s (estimated* for prod: 0.773...17.265 s)
- I/O read: 54.958 s
- I/O write: N/A
Shared buffers:
- hits: 11010 (~86.00 MiB) from the buffer pool
- reads: 56845 (~444.10 MiB) from the OS file cache, including disk I/O
- dirtied: 541 (~4.20 MiB)
- writes: 0
--- warm cache
Limit (cost=115226.03..115237.69 rows=100 width=1301) (actual time=137.848..174.599 rows=100 loops=1)
Buffers: shared hit=67657
-> Gather Merge (cost=115226.03..123813.29 rows=73600 width=1301) (actual time=137.846..174.578 rows=100 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=67657
-> Sort (cost=114226.00..114318.00 rows=36800 width=1301) (actual time=131.744..131.761 rows=62 loops=3)
Sort Key: issues.relative_position, issues.id
Sort Method: top-N heapsort Memory: 351kB
Buffers: shared hit=67657
-> Parallel Index Scan using index_issues_on_project_id_and_iid on public.issues (cost=0.57..112819.53 rows=36800 width=1301) (actual time=0.036..101.056 rows=28430 loops=3)
Index Cond: (issues.project_id = 278964)
Buffers: shared hit=67639
Time: 175.130 ms
- planning: 0.447 ms
- execution: 174.683 ms
- I/O read: N/A
- I/O write: N/A
Shared buffers:
- hits: 67657 (~528.60 MiB) from the buffer pool
- reads: 0 from the OS file cache, including disk I/O
- dirtied: 0
- writes: 0
-
consequent batches with issue relative position(these are the types of queries that batching runs the most)
-
without union optimization
SELECT "issues".* FROM "issues" WHERE "issues"."project_id" = 278964 AND (("issues"."id" > 14011 AND "issues"."relative_position" = 26121343) OR ("issues"."relative_position" > 26121343) OR ("issues"."relative_position" IS NULL)) ORDER BY relative_position asc NULLS LAST, "issues"."id" ASC LIMIT 100 Limit (cost=112260.44..112272.11 rows=100 width=1301) (actual time=130.979..163.186 rows=100 loops=1) Buffers: shared hit=67681 -> Gather Merge (cost=112260.44..117728.29 rows=46864 width=1301) (actual time=130.976..163.169 rows=100 loops=1) Workers Planned: 2 Workers Launched: 2 Buffers: shared hit=67681 -> Sort (cost=111260.41..111318.99 rows=23432 width=1301) (actual time=124.445..124.458 rows=64 loops=3) Sort Key: issues.relative_position, issues.id Sort Method: top-N heapsort Memory: 313kB Buffers: shared hit=67681 -> Parallel Index Scan using index_issues_on_project_id_and_iid on public.issues (cost=0.57..110364.86 rows=23432 width=1301) (actual time=1.573..109.286 rows=12375 loops=3) Index Cond: (issues.project_id = 278964) Filter: (((issues.id > 14011) AND (issues.relative_position = 26121343)) OR (issues.relative_position > 26121343) OR (issues.relative_position IS NULL)) Rows Removed by Filter: 16055 Buffers: shared hit=67663 Time: 163.954 ms - planning: 0.688 ms - execution: 163.266 ms - I/O read: N/A - I/O write: N/A Shared buffers: - hits: 67681 (~528.80 MiB) from the buffer pool - reads: 0 from the OS file cache, including disk I/O - dirtied: 0 - writes: 0
-
with union optimization, uses existing
idx_issues_on_project_id_and_rel_position_and_state_id_and_id
explain SELECT "issues".* FROM ((SELECT "issues".* FROM "issues" WHERE "issues"."project_id" = 278964 AND ("issues"."id" > 14011 AND "issues"."relative_position" = 26121343) ORDER BY relative_position asc NULLS LAST, "issues"."id" ASC LIMIT 100) UNION ALL (SELECT "issues".* FROM "issues" WHERE "issues"."project_id" = 278964 AND ("issues"."relative_position" > 26121343) ORDER BY relative_position asc NULLS LAST, "issues"."id" ASC LIMIT 100) UNION ALL (SELECT "issues".* FROM "issues" WHERE "issues"."project_id" = 278964 AND ("issues"."relative_position" IS NULL) ORDER BY relative_position asc NULLS LAST, "issues"."id" ASC LIMIT 100)) issues ORDER BY relative_position asc NULLS LAST, "issues"."id" ASC LIMIT 100 Limit (cost=89275.51..89278.05 rows=100 width=1303) (actual time=131.483..131.566 rows=100 loops=1) Buffers: shared hit=32335 -> Merge Append (cost=89275.51..89280.61 rows=201 width=1303) (actual time=131.481..131.545 rows=100 loops=1) Sort Key: issues.relative_position, issues.id Buffers: shared hit=32335 -> Sort (cost=3.62..3.63 rows=1 width=1303) (actual time=0.078..0.081 rows=0 loops=1) Sort Key: issues.relative_position, issues.id Sort Method: quicksort Memory: 25kB Buffers: shared hit=4 -> Limit (cost=3.60..3.60 rows=1 width=1303) (actual time=0.073..0.076 rows=0 loops=1) Buffers: shared hit=4 -> Sort (cost=3.60..3.60 rows=1 width=1303) (actual time=0.073..0.074 rows=0 loops=1) Sort Key: issues.id Sort Method: quicksort Memory: 25kB Buffers: shared hit=4 -> Index Scan using idx_issues_on_project_id_and_rel_position_and_state_id_and_id on public.issues (cost=0.57..3.59 rows=1 width=1303) (actual time=0.067..0.067 rows=0 loops=1) Index Cond: ((issues.project_id = 278964) AND (issues.relative_position = 26121343) AND (issues.id > 14011)) Buffers: shared hit=4 -> Limit (cost=36849.32..36849.57 rows=100 width=1303) (actual time=131.352..131.392 rows=100 loops=1) Buffers: shared hit=32327 -> Sort (cost=36849.32..36918.30 rows=27591 width=1303) (actual time=131.351..131.372 rows=100 loops=1) Sort Key: issues_1.relative_position, issues_1.id Sort Method: top-N heapsort Memory: 160kB Buffers: shared hit=32327 -> Index Scan using idx_issues_on_project_id_and_rel_position_and_state_id_and_id on public.issues issues_1 (cost=0.57..35794.81 rows=27591 width=1303) (actual time=0.016..93.520 rows=37621 loops=1) Index Cond: ((issues_1.project_id = 278964) AND (issues_1.relative_position > 26121343)) Buffers: shared hit=32327 -> Limit (cost=52422.54..52422.79 rows=100 width=1303) (actual time=0.049..0.049 rows=0 loops=1) Buffers: shared hit=4 -> Sort (cost=52422.54..52520.75 rows=39285 width=1303) (actual time=0.047..0.047 rows=0 loops=1) Sort Key: issues_2.relative_position, issues_2.id Sort Method: quicksort Memory: 25kB Buffers: shared hit=4 -> Index Scan using idx_issues_on_project_id_and_rel_position_and_state_id_and_id on public.issues issues_2 (cost=0.57..50921.09 rows=39285 width=1303) (actual time=0.031..0.031 rows=0 loops=1) Index Cond: ((issues_2.project_id = 278964) AND (issues_2.relative_position IS NULL)) Buffers: shared hit=4 Time: 133.144 ms - planning: 1.373 ms - execution: 131.771 ms - I/O read: N/A - I/O write: N/A Shared buffers: - hits: 32335 (~252.60 MiB) from the buffer pool - reads: 0 from the OS file cache, including disk I/O - dirtied: 0 - writes: 0
-
without union optimization
-
batching issues with null position - uses the existing rel position index
idx_issues_on_project_id_and_rel_position_and_state_id_and_id
SELECT "issues".* FROM "issues" WHERE "issues"."project_id" = 278964 AND ("issues"."id" > 20932 AND "issues"."relative_position" IS NULL) ORDER BY relative_position asc NULLS LAST, "issues"."id" ASC LIMIT 100
Limit (cost=55145.69..55145.94 rows=100 width=1301) (actual time=13.074..13.077 rows=0 loops=1)
Buffers: shared hit=3 read=4 dirtied=1
I/O Timings: read=12.834
-> Sort (cost=55145.69..55248.89 rows=41281 width=1301) (actual time=13.072..13.074 rows=0 loops=1)
Sort Key: issues.relative_position, issues.id
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=3 read=4 dirtied=1
I/O Timings: read=12.834
-> Index Scan using idx_issues_on_project_id_and_rel_position_and_state_id_and_id on public.issues (cost=0.57..53567.96 rows=41281 width=1301) (actual time=13.063..13.065 rows=0 loops=1)
Index Cond: ((issues.project_id = 278964) AND (issues.relative_position IS NULL) AND (issues.id > 20932))
Buffers: shared hit=3 read=4 dirtied=1
I/O Timings: read=12.834
Time: 13.633 ms
- planning: 0.497 ms
- execution: 13.136 ms
- I/O read: 12.834 ms
- I/O write: N/A
Shared buffers:
- hits: 3 (~24.00 KiB) from the buffer pool
- reads: 4 (~32.00 KiB) from the OS file cache, including disk I/O
- dirtied: 1 (~8.00 KiB)
- writes: 0
With new index:
- Adding the index:
exec CREATE INDEX idx_issues_on_project_id_and_rel_asc_and_id ON public.issues USING btree (project_id ASC, relative_position ASC NULLS LAST, id ASC);
% time seconds wait_event
------ ------------ -----------------------------
89.52 1609.769369 IO.DataFileRead
5.81 104.449480 Running
1.77 31.878622 IO.DataFileExtend
1.52 27.393409 IO.BufFileWrite
0.72 13.031209 LWLock.WALWriteLock
0.25 4.548676 IO.DataFileWrite
0.24 4.233695 IO.BufFileRead
0.13 2.326615 IO.WALWrite
0.03 0.459600 LWLock.WALBufMappingLock
0.01 0.167945 IO.SLRURead
0.00 0.059314 LWLock.buffer_mapping
------ ------------ -----------------------------
100.00 1798.317934
The query has been executed. Duration: 1798.318 s (estimated* for prod: 178.134...1645.562 s)
⠀* Estimated timing for production (experimental).
-
First batch(warm cache) - uses the first part of the index condition
Index Cond: (issues.project_id = 278964)
SELECT "issues".* FROM "issues" WHERE "issues"."project_id" = 278964 ORDER BY relative_position asc NULLS LAST, "issues"."id" ASC LIMIT 100
Limit (cost=0.57..128.83 rows=100 width=1301) (actual time=0.347..1.665 rows=100 loops=1)
Buffers: shared hit=100 read=4
I/O Timings: read=0.230
-> Index Scan using idx_issues_on_project_id_and_rel_asc_and_id on public.issues (cost=0.57..113285.23 rows=88320 width=1301) (actual time=0.345..1.637 rows=100 loops=1)
Index Cond: (issues.project_id = 278964)
Buffers: shared hit=100 read=4
I/O Timings: read=0.230
Time: 7.665 ms
- planning: 5.949 ms
- execution: 1.716 ms
- I/O read: 0.230 ms
- I/O write: N/A
Shared buffers:
- hits: 100 (~800.00 KiB) from the buffer pool
- reads: 4 (~32.00 KiB) from the OS file cache, including disk I/O
- dirtied: 0
- writes: 0
-
consequent batches with issue relative position(these are the types of queries that batching would run the most) - I do not see much difference from using the new index or the existing
index_issues_on_project_id_and_iid
index-
without union optimization
SELECT "issues".* FROM "issues" WHERE "issues"."project_id" = 278964 AND (("issues"."id" > 14011 AND "issues"."relative_position" = 26121343) OR ("issues"."relative_position" > 26121343) OR ("issues"."relative_position" IS NULL)) ORDER BY relative_position asc NULLS LAST, "issues"."id" ASC LIMIT 100 Limit (cost=0.57..197.97 rows=100 width=1301) (actual time=175.645..175.872 rows=100 loops=1) Buffers: shared hit=39907 read=185 I/O Timings: read=50.427 -> Index Scan using idx_issues_on_project_id_and_rel_asc_and_id on public.issues (cost=0.57..113947.63 rows=57723 width=1301) (actual time=175.642..175.852 rows=100 loops=1) Index Cond: (issues.project_id = 278964) Filter: (((issues.id > 14011) AND (issues.relative_position = 26121343)) OR (issues.relative_position > 26121343) OR (issues.relative_position IS NULL)) Rows Removed by Filter: 48164 Buffers: shared hit=39907 read=185 I/O Timings: read=50.427 Time: 176.731 ms - planning: 0.787 ms - execution: 175.944 ms - I/O read: 50.427 ms - I/O write: N/A Shared buffers: - hits: 39907 (~311.80 MiB) from the buffer pool - reads: 185 (~1.40 MiB) from the OS file cache, including disk I/O - dirtied: 0 - writes: 0
-
with union optimization, uses the new index
idx_issues_on_project_id_and_rel_asc_and_id
explain SELECT "issues".* FROM ((SELECT "issues".* FROM "issues" WHERE "issues"."project_id" = 278964 AND ("issues"."id" > 14011 AND "issues"."relative_position" = 26121343) ORDER BY relative_position asc NULLS LAST, "issues"."id" ASC LIMIT 100) UNION ALL (SELECT "issues".* FROM "issues" WHERE "issues"."project_id" = 278964 AND ("issues"."relative_position" > 26121343) ORDER BY relative_position asc NULLS LAST, "issues"."id" ASC LIMIT 100) UNION ALL (SELECT "issues".* FROM "issues" WHERE "issues"."project_id" = 278964 AND ("issues"."relative_position" IS NULL) ORDER BY relative_position asc NULLS LAST, "issues"."id" ASC LIMIT 100)) issues ORDER BY relative_position asc NULLS LAST, "issues"."id" ASC LIMIT 100 Limit (cost=4.77..135.76 rows=100 width=1303) (actual time=0.116..0.544 rows=100 loops=1) Buffers: shared hit=97 -> Merge Append (cost=4.77..268.06 rows=201 width=1303) (actual time=0.114..0.529 rows=100 loops=1) Sort Key: issues.relative_position, issues.id Buffers: shared hit=97 -> Sort (cost=3.61..3.61 rows=1 width=1303) (actual time=0.064..0.064 rows=0 loops=1) Sort Key: issues.relative_position, issues.id Sort Method: quicksort Memory: 25kB Buffers: shared hit=4 -> Limit (cost=0.57..3.59 rows=1 width=1303) (actual time=0.059..0.059 rows=0 loops=1) Buffers: shared hit=4 -> Index Scan using idx_issues_on_project_id_and_rel_asc_and_id on public.issues (cost=0.57..3.59 rows=1 width=1303) (actual time=0.058..0.058 rows=0 loops=1) Index Cond: ((issues.project_id = 278964) AND (issues.relative_position = 26121343) AND (issues.id > 14011)) Buffers: shared hit=4 -> Limit (cost=0.57..129.97 rows=100 width=1303) (actual time=0.015..0.414 rows=100 loops=1) Buffers: shared hit=89 -> Index Scan using idx_issues_on_project_id_and_rel_asc_and_id on public.issues issues_1 (cost=0.57..35703.31 rows=27591 width=1303) (actual time=0.014..0.400 rows=100 loops=1) Index Cond: ((issues_1.project_id = 278964) AND (issues_1.relative_position > 26121343)) Buffers: shared hit=89 -> Limit (cost=0.57..129.85 rows=100 width=1303) (actual time=0.035..0.035 rows=0 loops=1) Buffers: shared hit=4 -> Index Scan using idx_issues_on_project_id_and_rel_asc_and_id on public.issues issues_2 (cost=0.57..50790.59 rows=39285 width=1303) (actual time=0.035..0.035 rows=0 loops=1) Index Cond: ((issues_2.project_id = 278964) AND (issues_2.relative_position IS NULL)) Buffers: shared hit=4 Time: 7.836 ms - planning: 7.178 ms - execution: 0.658 ms - I/O read: N/A - I/O write: N/A Shared buffers: - hits: 97 (~776.00 KiB) from the buffer pool - reads: 0 from the OS file cache, including disk I/O - dirtied: 0 - writes: 0
-
-
batching issues with null position - uses the new index effectively, though the timings are not super different from existing rel pos index
idx_issues_on_project_id_and_rel_asc_and_id
SELECT "issues".* FROM "issues" WHERE "issues"."project_id" = 278964 AND ("issues"."id" > 20932 AND "issues"."relative_position" IS NULL) ORDER BY relative_position asc NULLS LAST, "issues"."id" ASC LIMIT 100
Limit (cost=0.57..129.78 rows=100 width=1301) (actual time=20.379..20.380 rows=0 loops=1)
Buffers: shared hit=2 read=2
I/O Timings: read=20.242
-> Index Scan using idx_issues_on_project_id_and_rel_asc_and_id on public.issues (cost=0.57..54752.11 rows=42372 width=1301) (actual time=20.377..20.377 rows=0 loops=1)
Index Cond: ((issues.project_id = 278964) AND (issues.relative_position IS NULL) AND (issues.id > 20932))
Buffers: shared hit=2 read=2
I/O Timings: read=20.242
Time: 20.861 ms
- planning: 0.419 ms
- execution: 20.442 ms
- I/O read: 20.242 ms
- I/O write: N/A
Shared buffers:
- hits: 2 (~16.00 KiB) from the buffer pool
- reads: 2 (~16.00 KiB) from the OS file cache, including disk I/O
- dirtied: 0
- writes: 0
Screenshots (strongly suggested)
Does this MR meet the acceptance criteria?
Conformity
-
📋 Does this MR need a changelog?-
I have included a changelog entry. -
I have not included a changelog entry because _____.
-
-
Documentation (if required) -
Code review guidelines -
Merge request performance guidelines -
Style guides -
Database guides -
Separation of EE specific content
Availability and Testing
-
Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process. -
Tested in all supported browsers -
Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
-
Label as security and @ mention @gitlab-com/gl-security/appsec
-
The MR includes necessary changes to maintain consistency between UI, API, email, or other methods -
Security reports checked/validated by a reviewer from the AppSec team