Backfill project namespaces for specific group
What does this MR do and why?
This is the first of a potential series of migrations to backfill project namespaces; whether more follow depends on how the initial migrations go. This migration backfills project namespaces for the gitlab-org namespace on gitlab.com.
The project namespaces backfill migration creates a record in the namespaces table for each project. Because the namespaces table is at the core of the product, we want to backfill the data slowly.
Change details
- Backfilling project namespaces for a single group
- I've introduced a custom strategy that iterates in batches of 1000 over only the specific group's projects.
- With this change the size of the iterations is much smaller. We no longer see the 2M-row scans, as we only need to migrate ~1.7K records.
- The migration time has been reduced from ~100 mins to ~90s as we no longer scan the entire projects table looking up the specific projects.
- Some changes to the batching strategy implementation were needed so that the batching strategy encapsulates the background migration instance and has access to the background migration arguments. I decided not to extract the change into its own MR so that there is context on why it was made. It adds a couple of extra changed files, but hopefully that is not too bad. I'm open to extracting this into a separate MR if reviewers think that would be better.
- I've reduced the sub-batch size once again, to 25 this time, so it has gone from 100 to 50 to 25. We are seeing far fewer queries exceeding the recommended times, but we are still seeing a ~17s update query !73640 (comment 793053726). I wonder if it is a pg.ai-specific issue, though?
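
To illustrate the batching idea described above, here is a minimal, self-contained sketch in plain Ruby. It is not the code in this MR: `Project`, `GroupProjectsBatchingStrategy`, `backfill_sub_batches`, and the namespace ids are hypothetical names and values, and an in-memory array stands in for the projects table. It only shows the shape of the approach: restrict iteration to the group's projects, walk them in batches of 1000, and split each batch into sub-batches of 25.

```ruby
require "set"

# Hypothetical in-memory stand-in for a row of the projects table.
Project = Struct.new(:id, :namespace_id)

# Hypothetical strategy: batches only over the ids of projects that belong to
# the given namespaces, instead of walking the whole projects table.
class GroupProjectsBatchingStrategy
  def initialize(projects, namespace_ids)
    wanted = namespace_ids.to_set
    @ids = projects.select { |p| wanted.include?(p.namespace_id) }.map(&:id).sort
  end

  # Yields each batch as an array of project ids, at most batch_size long.
  def each_batch(batch_size:, &block)
    @ids.each_slice(batch_size, &block)
  end
end

# Walks the batches and splits each one into sub-batches. In a real migration
# each sub-batch would map to one INSERT/UPDATE statement; here we only
# collect the id lists so the batching can be inspected.
def backfill_sub_batches(projects, namespace_ids, batch_size: 1000, sub_batch_size: 25)
  sub_batches = []
  strategy = GroupProjectsBatchingStrategy.new(projects, namespace_ids)
  strategy.each_batch(batch_size: batch_size) do |batch_ids|
    batch_ids.each_slice(sub_batch_size) { |ids| sub_batches << ids }
  end
  sub_batches
end

# Example: 5,000 projects, every third one in the (made-up) namespace 42.
projects = (1..5000).map { |id| Project.new(id, id % 3 == 0 ? 42 : 7) }
sub_batches = backfill_sub_batches(projects, [42])
puts "sub-batches: #{sub_batches.size}, records: #{sub_batches.sum(&:size)}"
```

The point of the sketch is that the batch bounds are computed from the filtered id set, so the amount of work scales with the group's project count rather than with the size of the whole table.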
Screenshots or screen recordings
These are strongly recommended to assist reviewers and reduce the time to merge your change.
How to set up and validate locally
Numbered steps to set up and validate the change are strongly suggested.
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.
Edited by Alexandru Croitor