Improve the batched background migration documentation
Problem to solve
During #378433 (closed), we heavily used the batched background migration feature.
During the implementation, there were several misunderstandings about the system that the documentation should clear up:
Further details
We should have explanations on:
- the migrations are executed sequentially, not in parallel.
- execution is split in batches (jobs) and each batch is further split in sub batches.
- the minimum delay between two batches is 2 minutes.
- have some hints/guidelines on how to build the overall execution time estimation based on this comment.
- suggest that indexes solely used by the migration should be:
  - temporary
  - created in post migration (before the queuing migration)
- suggest using the database testing report (histograms) to get a sense of:
  - how much time a batch (job) takes to execute. This time should be less than the delay between batches.
  - how much time each query takes.
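
As an illustration of the kind of estimation hint the documentation could give, here is a minimal sketch in Python. It is not the feature's actual scheduling logic. It assumes that batches run strictly one after another, that a batch starts no sooner than the configured delay after the previous one started, and that a batch slower than the delay pushes the next one back. The function name and parameters are hypothetical.

```python
import math


def estimate_total_seconds(total_rows, batch_size, delay_seconds=120, batch_seconds=0.0):
    """Rough estimate of the overall execution time of a batched migration.

    Assumptions (not taken from the feature's code):
    - batches (jobs) are executed sequentially, never in parallel;
    - consecutive batch starts are spaced by max(delay, batch duration);
    - every batch takes roughly the same time (batch_seconds).
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    if total_rows <= 0:
        return 0.0

    # Number of batches (jobs) needed to cover all rows.
    batches = math.ceil(total_rows / batch_size)

    # Spacing between two consecutive batch starts.
    spacing = max(delay_seconds, batch_seconds)

    # All batches but the last contribute one spacing; the last one only its own run time.
    return (batches - 1) * spacing + batch_seconds


if __name__ == "__main__":
    # 1,000,000 rows, 10,000 rows per batch, 2 minute delay, 60 seconds per batch.
    seconds = estimate_total_seconds(1_000_000, 10_000, 120, 60)
    print(f"{seconds} seconds, about {seconds / 3600:.1f} hours")
```

With these example numbers there are 100 batches, so the estimate is 99 * 120 + 60 = 11,940 seconds, a bit over 3 hours. The batch duration would come from the database testing report; if it exceeds the delay, the spacing (and the total) grows accordingly.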
Proposal
Improve https://docs.gitlab.com/ee/development/database/batched_background_migrations.html by clarifying the above points.
Who can address the issue
Anyone with a good understanding of the batched background migration feature.
Other links/references
Edited by David Fernandez