Backfill `type_new` column on integrations
What does this MR do?
Adds a background migration to backfill the new `integrations.type_new` column based on the legacy class name in `type`.
Extends the background migration RSpec matchers so we can verify the arguments for batched migration classes too.
Issue: #333507 (closed), part of the epic &2504 (closed) to rename "services" to "integrations", and specifically the child epic &6177 (closed).
The `type_new` column was added in !66541 (merged).
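As a rough illustration of the backfill idea (not the code in this MR), the sketch below assumes legacy names such as `JiraService` map to namespaced names such as `Integrations::Jira`. The helper `new_type_for`, the sample rows, and the regex-based mapping are all hypothetical; the real migration may well use an explicit lookup table instead.

```ruby
# Illustrative only: the naming convention below is an assumption,
# not something taken from this MR.
LEGACY_SUFFIX = /\A(?<name>.+)Service\z/

# Hypothetical helper: derive the new type from a legacy class name.
# Returns nil when the name does not follow the assumed convention.
def new_type_for(legacy_type)
  match = LEGACY_SUFFIX.match(legacy_type.to_s)
  match && "Integrations::#{match[:name]}"
end

# In-memory stand-in for rows of the integrations table.
rows = [
  { id: 1, type: 'JiraService', type_new: nil },
  { id: 2, type: 'SlackService', type_new: nil },
  { id: 3, type: 'UnknownType', type_new: nil }
]

# Fill type_new only where it is still empty; unmapped rows stay nil.
rows.each do |row|
  row[:type_new] ||= new_type_for(row[:type])
end

rows.each { |row| puts "#{row[:id]}: #{row[:type]} -> #{row[:type_new].inspect}" }
```

Rows whose `type` does not match the assumed pattern are left with a `nil` value rather than guessed at, which keeps the sketch safe to re-run.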
Migration output
$ rails db:migrate:up VERSION=20210727113447
== 20210727113447 BackfillIntegrationsTypeNew: migrating ======================
unknown OID 28: failed to recognize type of 'relfrozenxid'. It will be treated as String.
unknown OID 1034: failed to recognize type of 'relacl'. It will be treated as String.
unknown OID 194: failed to recognize type of 'relpartbound'. It will be treated as String.
== 20210727113447 BackfillIntegrationsTypeNew: migrated (0.0627s) =============
$ rails db:migrate:down VERSION=20210727113447
== 20210727113447 BackfillIntegrationsTypeNew: reverting ======================
== 20210727113447 BackfillIntegrationsTypeNew: reverted (0.0131s) =============
(I'm curious about those `unknown OID` errors; is something weird with my DB?)
The Sidekiq job for `Database::BatchedBackgroundMigrationWorker` is triggered every minute but seems to actually run only every 10 minutes. With `gdk tail rails-background-jobs | grep cronjob:database_batched_background_migration` you might see some deduplicated jobs first; once the job executes, you can see the changes in `integrations`:
2021-08-02_17:42:22.38781 rails-background-jobs : {"severity":"INFO","time":"2021-08-02T17:42:22.387Z","queue":"cronjob:database_batched_background_migration","args":[],"class":"Database::BatchedBackgroundMigrationWorker","retry":0,"backtrace":true,"version":0,"queue_namespace":"cronjob","jid":"fa2945f4476539b1a9e38219","created_at":"2021-08-02T17:42:22.386Z","meta.caller_id":"Cronjob","meta.feature_category":"database","correlation_id":"1b4813d9fe09197a81783b4abe3bfc85","idempotency_key":"resque:gitlab:duplicate:cronjob:database_batched_background_migration:592d9619e1997b640b70ce6a22f6713bc7793bb7a4e342b7380d90b691fcd6ae","duplicate-of":"9c51154e3ba659fba7a706d1","job_size_bytes":2,"pid":1129573,"job_status":"deduplicated","message":"Database::BatchedBackgroundMigrationWorker JID-fa2945f4476539b1a9e38219: deduplicated: dropped until executing","deduplication.type":"dropped until executing"}
Background Migration Details
- 1169469 records to update on gitlab.com
- batch size = 1000
- 1169469 / 1000 = 1170 batches
- Estimated time per batch: between ~90ms and ~2s for an UPDATE query with 1000 items
  - Example with old records (~2s): https://console.postgres.ai/shared/00b3420e-36a6-4e3a-8a3d-2202e6822ff7
  - Example with newer records (~90ms): https://console.postgres.ai/shared/185dea72-92ef-4c35-bfaa-ef3592036d80
- 2 min delay between batches (safe given the estimated time per batch)
- 1170 batches * 2 min per batch = 2340 min, or 39 hours to run all the scheduled jobs
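The estimate above can be double-checked with a few lines of Ruby. This is only a sketch of the arithmetic: the constants are the figures quoted in this description, and the method names `batch_count` and `total_hours` are made up for illustration.

```ruby
# Figures quoted in the description above.
TOTAL_RECORDS = 1_169_469
BATCH_SIZE    = 1_000
DELAY_MINUTES = 2

# Number of batches, rounding up so the final partial batch is counted.
def batch_count(total, batch_size)
  (total + batch_size - 1) / batch_size
end

# Total scheduling time in hours, assuming a fixed delay per batch.
def total_hours(batches, delay_minutes)
  batches * delay_minutes / 60.0
end

batches = batch_count(TOTAL_RECORDS, BATCH_SIZE)
hours   = total_hours(batches, DELAY_MINUTES)

puts "#{batches} batches, ~#{hours} hours"
```

The per-batch UPDATE time (~90ms to ~2s) is small compared with the 2 minute delay, so the delay dominates the total.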
Does this MR meet the acceptance criteria?
Conformity
- I have included changelog trailers, or none are needed. (Does this MR need a changelog?)
- [-] I have added/updated documentation, or it's not needed. (Is documentation required?)
- I have properly separated EE content from FOSS, or this MR is FOSS only. (Where should EE code go?)
- I have added information for database reviewers in the MR description, or it's not needed. (Does this MR have database related changes?)
- I have self-reviewed this MR per code review guidelines.
- This MR does not harm performance, or I have asked a reviewer to help assess the performance impact. (Merge request performance guidelines)
- I have followed the style guides.
- This change is backwards compatible across updates, or this does not apply.
Availability and Testing
- I have added/updated tests following the Testing Guide, or it's not needed. (Consider all test levels. See the Test Planning Process.)
- [-] I have tested this MR in all supported browsers, or it's not needed.
- [-] I have informed the Infrastructure department of a default or new setting change per definition of done, or it's not needed.
Related to #333507 (closed)