Geo: Fix VerificationStateBackfillWorker N+1 queries
What does this MR do and why?
- Fixes VerificationStateBackfillWorker N+1 queries
- Also avoids instantiating N+1 ActiveRecord objects
- Fixes VerificationStateBackfillWorker deletion
- It was attempting to delete verification state records by primary key instead of foreign key for:
- Group wiki repositories
- Project wiki repositories
- Project repositories
- It was attempting to delete verification state records by primary key instead of foreign key for:
Part of #429242 (closed)
How to set up and validate locally
Numbered steps to set up and validate the change are strongly suggested.
Validate the bug in master:
- Have GDKs with Geo set up
- Have uploads
- In the primary GDK/gitlab directory,
tail -f log/development.log
gdk psql
delete from upload_states limit 10000
- Find in the tail,
INSERT INTO "upload_states"
- Notice many results, and they look like
INSERT INTO "upload_states" ("upload_id") VALUES (12) RETURNING "upload_id"
. An N+1.
Validate the fix in this branch:
- Have GDKs with Geo set up
- Have uploads
- In the primary GDK/gitlab directory,
tail -f log/development.log
gdk psql
delete from upload_states limit 10000
- Find in the tail,
INSERT INTO "upload_states"
- Notice one result, and it looks like
INSERT INTO "upload_states" ("upload_id") VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), (11), (12), (13), (14), (15), (16), (17), (18), (19), (20), (21), (22), (23), (24), (25), (26), (27), (28), (29), (30), (31), (32), (33), (34), (35), (36), (37), (38), (39), (40), (41), (42), (43), (44), (45), (46), (47), (48), (49), (50), (51), (52), (53) ON CONFLICT DO NOTHING RETURNING "upload_id"
Query analysis
ReplicableModel.create_verification_details_for
Query
INSERT INTO "wiki_repository_states" ("project_wiki_repository_id")
VALUES (1), (2), (3) ON CONFLICT DO NOTHING
RETURNING
"id";
gitlab-production-main
Query plan on postgres.ai against ModifyTable on public.wiki_repository_states (cost=0.00..0.05 rows=3 width=108) (actual time=130.900..131.045 rows=3 loops=1)
Buffers: shared hit=179 read=27 dirtied=8
I/O Timings: read=129.322 write=0.000
-> Values Scan on "*VALUES*" (cost=0.00..0.05 rows=3 width=108) (actual time=4.740..4.770 rows=3 loops=1)
Buffers: shared hit=14 read=2 dirtied=1
I/O Timings: read=4.562 write=0.000
https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/23956/commands/76611
ReplicableModel.delete_verification_details_for
Query
DELETE FROM "project_states"
WHERE "project_states"."project_id" IN (73, 74, 75);
gitlab-production-main
Query plan on postgres.ai against ModifyTable on public.project_states (cost=0.00..4.83 rows=0 width=0) (actual time=0.016..0.016 rows=0 loops=1)
Buffers: shared hit=1
I/O Timings: read=0.000 write=0.000
-> Seq Scan on public.project_states (cost=0.00..4.83 rows=3 width=6) (actual time=0.014..0.014 rows=0 loops=1)
Filter: (project_states.project_id = ANY ('{73,74,75}'::bigint[]))
Rows Removed by Filter: 0
Buffers: shared hit=1
I/O Timings: read=0.000 write=0.000
https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/23956/commands/76612
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
-
I have evaluated the MR acceptance checklist for this MR.
Edited by Michael Kozono