Backfill cleanup schedules for old closed/merged MRs
What does this MR do?
Adds background migration that will create records corresponding records in merge_request_cleanup_schedules
for merge requests that are closed/merged.
These records will be later on queried by a cron worker so they merge request refs of those MRs will be cleaned up accordingly.
Estimate
60888857
to insert
batch_size = 10000
60888857 / 10000 = 6089
loops
Estimated times per batch:
-
9s
for select statement with10000
items -
370ms
for delete statement with10000
items Total: ~10sec per batch
2 mins delay per loop (safe for the given total time per batch)
6089 * (120+10) = 791,570s or ~9 days
Migration
== 20201103110018 ScheduleMergeRequestCleanupSchedulesBackfill: migrating =====
-- transaction_open?()
-> 0.0000s
-- index_exists?(:merge_requests, :id, {:name=>"merge_requests_state_id_temp_index", :where=>"state_id IN (2, 3)", :algorithm=>:concurrently})
-> 0.0121s
-- execute("SET statement_timeout TO 0")
-> 0.0002s
-- add_index(:merge_requests, :id, {:name=>"merge_requests_state_id_temp_index", :where=>"state_id IN (2, 3)", :algorithm=>:concurrently})
-> 0.0200s
-- execute("RESET ALL")
-> 0.0003s
== 20201103110018 ScheduleMergeRequestCleanupSchedulesBackfill: migrated (0.0671s)
Query Plans
All queries tested on #database-lab.
For querying records to enqueue background migration jobs
Query used for enqueueing background migration jobs uses each_batch
internally.
each_batch
start
Sample:
SELECT "merge_requests"."id"
FROM "merge_requests"
WHERE merge_requests.state_id IN (2, 3)
ORDER BY "merge_requests"."id" ASC LIMIT 1
Plan: https://explain.depesz.com/s/VBho
each_batch
stop
Sample:
SELECT "merge_requests"."id"
FROM "merge_requests"
WHERE merge_requests.state_id IN (2, 3) AND "merge_requests"."id" >= 1
ORDER BY "merge_requests"."id" ASC LIMIT 1 OFFSET 10000
Plan: https://explain.depesz.com/s/1SkT
each_batch
min/max
Sample:
SELECT MIN(id), MAX(id)
FROM "merge_requests"
WHERE merge_requests.state_id IN (2, 3) AND "merge_requests"."id" >= 1 AND "merge_requests"."id" < 10001
Plan: https://explain.depesz.com/s/sEFw
Insert query
Inserts 10K records per batch.
Sample:
INSERT INTO merge_request_cleanup_schedules (merge_request_id, scheduled_at, created_at, updated_at)
SELECT merge_requests.id, COALESCE(metrics.merged_at, COALESCE(metrics.latest_closed_at, merge_requests.updated_at)) + interval ‘14 days’, NOW(), NOW()
FROM “merge_requests”
LEFT JOIN merge_request_metrics metrics ON metrics.merge_request_id = merge_requests.id
WHERE (merge_requests.state_id IN (2, 3)) AND “merge_requests”.“id” BETWEEN 1 AND 10000
ON CONFLICT (merge_request_id) DO NOTHING;
Plan: https://explain.depesz.com/s/xg7k
Does this MR meet the acceptance criteria?
Conformity
-
Changelog entry - [-] Documentation (if required)
-
Code review guidelines -
Merge request performance guidelines -
Style guides -
Database guides - [-] Separation of EE specific content
Availability and Testing
-
Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process. - [-] Tested in all supported browsers
- [-] Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
- [-] Label as security and @ mention
@gitlab-com/gl-security/appsec
- [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
- [-] Security reports checked/validated by a reviewer from the AppSec team
Related to #245263 (closed)