Optimize batch size for migrations automatically [RUN ALL RSPEC] [RUN AS-IF-FOSS]
What does this MR do?
- Issue: #328821 (closed)
- Feature flag: #328817 (closed)
The goal here is to maximize throughput of batched migrations in terms of the number of tuples updated per time unit.
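The optimization is based on what we call "time efficiency". For a single job, time efficiency is the ratio of the job's total duration to the migration's interval. Ideally, this is close to but smaller than 1: the job uses almost the whole interval without overrunning it.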
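As a rough illustration of the definition above, the ratio can be sketched as follows. The function name and the example numbers are hypothetical and not taken from this MR's code.

```python
def time_efficiency(duration_seconds: float, interval_seconds: float) -> float:
    """Ratio of a job's total duration to the migration interval.

    Values just below 1.0 mean the job fills most of the interval;
    values above 1.0 mean the job overran it.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return duration_seconds / interval_seconds


# Hypothetical numbers: a job that ran for 108 seconds with a 120 second interval.
efficiency = time_efficiency(108, 120)
print(f"{efficiency:.2f}")
```

A job that takes 150 seconds against the same interval would have a time efficiency above 1, which is the case the optimization tries to steer away from.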
We use exponential smoothing (an exponential moving average, EMA) over the time efficiency of the last 10 jobs. With the EMA, a single spiking job is tolerated better than if we looked only at the most recent job.
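A minimal sketch of such a smoothing step is shown below. It assumes the efficiencies are ordered most recent first and uses a hypothetical smoothing factor `alpha = 0.2`; neither the function name nor the factor is confirmed by this description.

```python
from typing import Optional, Sequence


def smoothed_time_efficiency(
    efficiencies: Sequence[float],
    alpha: float = 0.2,   # assumed smoothing factor, not from the MR
    window: int = 10,     # the description mentions the last 10 jobs
) -> Optional[float]:
    """Exponentially weighted average of recent time efficiencies.

    `efficiencies` is expected most recent first. The job at position i
    gets weight (1 - alpha) ** i, so older jobs count progressively less.
    Returns None when there is no data.
    """
    recent = list(efficiencies)[:window]
    if not recent:
        return None

    weighted_sum = 0.0
    weight_total = 0.0
    for i, efficiency in enumerate(recent):
        weight = (1 - alpha) ** i
        weighted_sum += efficiency * weight
        weight_total += weight
    return weighted_sum / weight_total


# One spiking job (2.0) followed by nine steady jobs (0.9):
# the smoothed value lands well below the spike.
print(smoothed_time_efficiency([2.0] + [0.9] * 9))
```

With these assumed weights, the most recent job still has the largest influence, but one outlier cannot by itself double the estimate.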
We also define upper and lower bounds for the batch size, so as not to overshoot in either direction.
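To make the bounded adjustment concrete, here is one possible sketch. All constants (the bounds, the target efficiency, and the cap on growth per step) are illustrative assumptions, and the scaling rule itself is a guess at how such an adjustment could work rather than a description of the actual code in this MR.

```python
# All values below are hypothetical and chosen only for illustration.
MIN_BATCH_SIZE = 1_000
MAX_BATCH_SIZE = 2_000_000
TARGET_EFFICIENCY = 0.95   # aim slightly below 1.0
MAX_MULTIPLIER = 1.2       # limit how fast the batch size may grow per step


def optimized_batch_size(current_batch_size: int, smoothed_efficiency: float) -> int:
    """Scale the batch size toward the target efficiency, within bounds."""
    if smoothed_efficiency <= 0:
        # No usable signal: leave the batch size unchanged.
        return current_batch_size

    # Efficiency below target -> multiplier > 1 (grow, capped);
    # efficiency above target -> multiplier < 1 (shrink).
    multiplier = min(TARGET_EFFICIENCY / smoothed_efficiency, MAX_MULTIPLIER)
    proposed = round(current_batch_size * multiplier)

    # Clamp to the configured lower and upper bounds.
    return max(MIN_BATCH_SIZE, min(proposed, MAX_BATCH_SIZE))


print(optimized_batch_size(10_000, 0.5))   # jobs finish early: grow
print(optimized_batch_size(10_000, 1.9))   # jobs overrun: shrink
```

Capping the multiplier is a design choice in this sketch: it keeps a run of unusually fast jobs from inflating the batch size in a single step.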
This reflects what we've been doing manually for a while: looking at recent job durations and adjusting batch_size up or down. It is considered safe: if the batch size is too high, the job takes longer but still breaks the work down into queries of equal size (we have sub_batch_size for that). When a job takes slightly longer than the interval, the next cronjob round won't pick up a new job, and we pause until the following round.
Does this MR meet the acceptance criteria?
Conformity
- 📋 Does this MR need a changelog?
  - I have included a changelog entry.
  - I have not included a changelog entry because the change is behind a feature flag.
- Documentation (if required)
- Code review guidelines
- Merge request performance guidelines
- Style guides
- Database guides
- Separation of EE specific content