Skip to content

Geo: Fix VerificationStateBackfillWorker N+1 queries

What does this MR do and why?

  • Fixes VerificationStateBackfillWorker N+1 queries
    • Also avoids instantiating N+1 ActiveRecord objects
  • Fixes VerificationStateBackfillWorker deletion
    • It was attempting to delete verification state records by primary key instead of foreign key for:
      • Group wiki repositories
      • Project wiki repositories
      • Project repositories

Part of #429242 (closed)

How to set up and validate locally

Numbered steps to set up and validate the change are strongly suggested.

Validate the bug in master:

  1. Have GDKs with Geo set up
  2. Have uploads
  3. In the primary GDK/gitlab directory, tail -f log/development.log
  4. gdk psql
  5. delete from upload_states limit 10000
  6. Find in the tail, INSERT INTO "upload_states"
  7. Notice many results, and they look like INSERT INTO "upload_states" ("upload_id") VALUES (12) RETURNING "upload_id". An N+1.

Validate the fix in this branch:

  1. Have GDKs with Geo set up
  2. Have uploads
  3. In the primary GDK/gitlab directory, tail -f log/development.log
  4. gdk psql
  5. delete from upload_states limit 10000
  6. Find in the tail, INSERT INTO "upload_states"
  7. Notice one result, and it looks like INSERT INTO "upload_states" ("upload_id") VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), (11), (12), (13), (14), (15), (16), (17), (18), (19), (20), (21), (22), (23), (24), (25), (26), (27), (28), (29), (30), (31), (32), (33), (34), (35), (36), (37), (38), (39), (40), (41), (42), (43), (44), (45), (46), (47), (48), (49), (50), (51), (52), (53) ON CONFLICT DO NOTHING RETURNING "upload_id"

Query analysis

ReplicableModel.create_verification_details_for

Query

INSERT INTO "wiki_repository_states" ("project_wiki_repository_id")
        VALUES (1), (2), (3) ON CONFLICT DO NOTHING
    RETURNING
        "id";

Query plan on postgres.ai against gitlab-production-main

 ModifyTable on public.wiki_repository_states  (cost=0.00..0.05 rows=3 width=108) (actual time=130.900..131.045 rows=3 loops=1)
   Buffers: shared hit=179 read=27 dirtied=8
   I/O Timings: read=129.322 write=0.000
   ->  Values Scan on "*VALUES*"  (cost=0.00..0.05 rows=3 width=108) (actual time=4.740..4.770 rows=3 loops=1)
         Buffers: shared hit=14 read=2 dirtied=1
         I/O Timings: read=4.562 write=0.000

https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/23956/commands/76611

ReplicableModel.delete_verification_details_for

Query

DELETE FROM "project_states"
WHERE "project_states"."project_id" IN (73, 74, 75);

Query plan on postgres.ai against gitlab-production-main

 ModifyTable on public.project_states  (cost=0.00..4.83 rows=0 width=0) (actual time=0.016..0.016 rows=0 loops=1)
   Buffers: shared hit=1
   I/O Timings: read=0.000 write=0.000
   ->  Seq Scan on public.project_states  (cost=0.00..4.83 rows=3 width=6) (actual time=0.014..0.014 rows=0 loops=1)
         Filter: (project_states.project_id = ANY ('{73,74,75}'::bigint[]))
         Rows Removed by Filter: 0
         Buffers: shared hit=1
         I/O Timings: read=0.000 write=0.000

https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/23956/commands/76612

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Michael Kozono

Merge request reports

Loading