Migrate mentions from design notes to DB table
What does this MR do?
This MR contains migrations for design notes mentions.
These migrations are to be run after !19088 (merged) is merged and confirmed to be working fine on production
This MR is based on initial migration that contains common code under CreateResourceUserMention
to be used on all mentions migrations of other models, including DesignManagement::Design
.
I've pulled some estimate counts of the records we might be looking at from slack database-lab
channel.
Database benchmarks
Number of rows affected
Design notes with mentions ~700
explain select id from notes where noteable_type = ‘DesignManagement::Design’ and note like ‘%\@%’
Index Scan using index_notes_on_noteable_id_and_noteable_type on public.notes (cost=0.57..4519075.97 rows=3936 width=4) (actual time=7.361..952352.359 rows=652 loops=1)
Index Cond: ((notes.noteable_type)::text = 'DesignManagement::Design'::text)
Filter: (notes.note ~~ '%\@%'::text)
Rows Removed by Filter: 1535
Buffers: shared hit=107 read=1362816 dirtied=17
I/O Timings: read=938952.612
Migrate design notes mentions
Query plans for sql queries to migrate mentions for design notes based on database-lab
data.
-
On notes queries are very slow even when notes.id is between range of values. For instance bellow query only gets 66 matching records in a range of ~10M notes, and it takes ~18mins to execute: https://explain.depesz.com/s/oo7c
explain SELECT “notes”.“id” FROM “notes” LEFT JOIN design_user_mentions ON notes.id = design_user_mentions.note_id WHERE (note LIKE ‘%@%’ AND design_user_mentions.design_id IS NULL AND notes.noteable_type = ‘DesignManagement::Design’) AND notes.id >= 81310000 AND notes.id < 91320000 ORDER BY notes.id
-
Creating a temporary index helps quite a bit:
exec CREATE INDEX CONCURRENTLY design_mentions_temp_index ON notes (id) WHERE note ~~ ‘%@%’::text AND notes.noteable_type = ‘DesignManagement::Design’ The query has been executed. Duration: 63.628 min
\di+ design_mentions_temp_index
List of relations Schema | Name | Type | Owner | Table | Size | Description --------+----------------------------+-------+--------+-------+-------+------------- public | design_mentions_temp_index | index | gitlab | notes | 40 kB | (1 row)
-
Afterwards the above query took only ~0.4ms to execute
explain SELECT “notes”.“id” FROM “notes” LEFT JOIN design_user_mentions ON notes.id = design_user_mentions.note_id WHERE (note LIKE ‘%@%’ AND design_user_mentions.design_id IS NULL AND notes.noteable_type = ‘DesignManagement::Design’) AND notes.id >= 81310000 AND notes.id < 91320000 ORDER BY notes.id
-
We will need to drop the
design_mentions_temp_index
index in a different release after background migrations are finished.
Migration Runtime estimates:
~60 mins
Design notes mentions migration:- Concurrent Index creation: ~60 mins
- Design notes with mentions count: ~700 => ~2-3 mins for the job to run
explain SELECT “notes”.“id” FROM “notes”
LEFT JOIN epic_user_mentions ON notes.id = epic_user_mentions.note_id
WHERE (note LIKE ‘%@%’ AND epic_user_mentions.epic_id IS NULL
AND notes.noteable_type = ‘Epic’ AND notes.system = false) AND notes.id >= 1 ORDER BY notes.id
Hash Left Join (cost=4.95..14.83 rows=1 width=4) (actual time=0.093..1.283 rows=699 loops=1)
Hash Cond: (notes.id = design_user_mentions.note_id)
Filter: (design_user_mentions.design_id IS NULL)
Rows Removed by Filter: 35
Buffers: shared hit=355
-> Index Only Scan using design_mentions_temp_index on public.notes (cost=0.28..9.56 rows=157 width=4) (actual time=0.048..1.098 rows=734 loops=1)
Index Cond: (notes.id >= 1)
Heap Fetches: 20
Buffers: shared hit=351
-> Hash (cost=4.30..4.30 rows=30 width=8) (actual time=0.019..0.019 rows=35 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 10kB
Buffers: shared hit=1
-> Seq Scan on public.design_user_mentions (cost=0.00..4.30 rows=30 width=8) (actual time=0.006..0.009 rows=35 loops=1)
Buffers: shared hit=1
Does this MR meet the acceptance criteria?
Conformity
-
Changelog entry -
Documentation (if required) -
Code review guidelines -
Merge request performance guidelines -
Style guides -
Database guides -
Separation of EE specific content
Availability and Testing
-
Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process. -
Tested in all supported browsers
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
-
Label as security and @ mention @gitlab-com/gl-security/appsec
-
The MR includes necessary changes to maintain consistency between UI, API, email, or other methods -
Security reports checked/validated by a reviewer from the AppSec team
Closes #198325 (closed)