Use BG Migration to copy historic data to a partitioned table
What does this MR do?
Related issue: #218428 (closed)
Implements logic within the partitioning migration helper to copy data into a partitioned table using a BackgroundMigration
. Future writes to the original table are handled by a trigger installed on the source table as part of the migration helper, so once the BackgroundMigration
finishes, the tables should remain in sync.
Approach
The BackgroundMigration
does batched inserts directly from the source table into the partitioned table. Since the trigger is also copying writes into the partitioned table, a potential conflict could arise with the BackgroundMigration
. For that reason a FOR UPDATE
clause is used to lock the rows for writing in the source table.
Because of the write lock on the source table, the batch size should be kept reasonably small, but that must be balanced against the record count of the audit_events
table. With roughly 315M rows, a batch size of 10,000, and the minimum 2 minute delay between jobs, it's roughly 44 days to complete the migration. There also would be a need to enqueue 31.5K migration jobs to handle the batches.
To help alleviate this issue, each job is enqueued with a larger range of id
s but internally batches over that range. This should keep individual queries small without creating as much overhead in sidekiq. The downside here is that some sidekiq jobs could partially complete, but as a cleanup migration will already be required, that shouldn't pose a problem.
Execution
Total execution estimate
- Audit Events table has 315_000_000M records
- Current batch size per BG Migration is 50_000
- Current delay between background jobs is 2 minutes
Number of batches:
315_000_000 / 50_000 = 6_300 batches
Total time:
6_300 * 2 / 1_440 = 8.75 days
Batch queries to enqueue jobs
Sample batch query to find next offset in audit_events
:
explain SELECT "audit_events"."id" FROM "audit_events" WHERE "audit_events"."id" >= 250000001 ORDER BY "audit_events"."id" ASC LIMIT 1 OFFSET 50000
https://explain.depesz.com/s/dh64
Sample batch query to find batch start/stop of a batch:
explain SELECT MIN(id), MAX(id) FROM "audit_events" WHERE "audit_events"."id" >= 250000001 AND "audit_events"."id" < 250050001
https://explain.depesz.com/s/Ou5
Queries executed within each job
Sample batch query to find offset of the next sub-batch:
explain SELECT "audit_events"."id" FROM "audit_events" WHERE (id BETWEEN 250000001 AND 250050000) AND "audit_events"."id" >= 250045001 ORDER BY "audit_events"."id" ASC LIMIT 1 OFFSET 2500
https://explain.depesz.com/s/HoiT
Sample batch query to find batch start/stop of a sub-batch:
explain SELECT MIN(id), MAX(id) FROM "audit_events" WHERE (id BETWEEN 250000001 AND 250050000) AND "audit_events"."id" >= 250045001 AND "audit_events"."id" < 250047501
https://explain.depesz.com/s/9NMS
Sample sub-batch insert:
explain INSERT INTO audit_events_part (id, author_id, type, entity_id, entity_type, details, created_at, updated_at)
SELECT id, author_id, type, entity_id, entity_type, details, created_at, updated_at
FROM audit_events
WHERE id BETWEEN 250047501 AND 250050000
FOR UPDATE
ON CONFLICT (id, created_at) DO UPDATE SET author_id = EXCLUDED.author_id, type = EXCLUDED.type, entity_id = EXCLUDED.entity_id, entity_type = EXCLUDED.entity_type, details = EXCLUDED.details, updated_at = EXCLUDED.updated_at
https://explain.depesz.com/s/lZcs
Does this MR meet the acceptance criteria?
Conformity
- [-] Changelog entry
- [-] Documentation (if required)
-
Code review guidelines -
Merge request performance guidelines -
Style guides -
Database guides - [-] Separation of EE specific content
Availability and Testing
-
Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process. -
Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
- [-] Label as security and @ mention
@gitlab-com/gl-security/appsec
- [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
- [-] Security reports checked/validated by a reviewer from the AppSec team