Batch optimizer should key off affected rows
The default batching strategy for batched background migrations assumes data is evenly distributed, because it was originally designed to perform whole-table migrations. In a few recent MRs (!78393 (merged) for one), we intended to use primary key batching to perform partial updates based on the value of a type field. These partial updates can confuse the batch optimizer, because a batch may contain few or no rows that need updating. In the previous MR, we found a gap of over 1.5M records in the table where no matching records existed, which could cause the optimizer to aggressively scale the batch size up to an unsafe level.
A custom batching strategy can be added as a workaround, but we should solve this generically in the framework. Rather than keying off the previous batch size, the batch optimizer should key off the number of rows affected in the previous batch. That way, it can scale more accurately based on the amount of work the migration actually performs.
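To make the idea concrete, here is a minimal sketch of an optimizer keyed off affected rows. This is not GitLab's actual `BatchOptimizer` interface; the class name, constants, and method signature are all illustrative assumptions. The key property is that a mostly empty batch cannot trigger an unbounded jump in batch size, because each adjustment is clamped:

```ruby
# Illustrative sketch only: keeps the number of rows actually updated per
# batch near a target, clamping each adjustment so sparse or empty batches
# cannot cause an unsafe jump in batch size. All names and constants are
# assumptions, not the framework's real values.
class AffectedRowsOptimizer
  TARGET_AFFECTED_ROWS = 10_000
  MIN_BATCH_SIZE = 1_000
  MAX_BATCH_SIZE = 50_000
  MAX_STEP = 1.2 # at most a 20% change per adjustment

  # previous_batch_size: rows scanned by the last batch
  # affected_rows: rows the last batch actually updated
  def next_batch_size(previous_batch_size, affected_rows)
    raw =
      if affected_rows.zero?
        # No work found: probe a slightly larger range, but only slightly.
        previous_batch_size * MAX_STEP
      else
        previous_batch_size * (TARGET_AFFECTED_ROWS.to_f / affected_rows)
      end

    # Clamp both the per-step change and the absolute bounds.
    stepped = raw.clamp(previous_batch_size / MAX_STEP,
                        previous_batch_size * MAX_STEP)
    stepped.round.clamp(MIN_BATCH_SIZE, MAX_BATCH_SIZE)
  end
end
```

With a strategy like this, an entirely empty batch grows the batch size by at most one clamped step, instead of the runtime-based optimizer concluding the batch was cheap and scaling up aggressively.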
As an additional optimization, if a batch is entirely empty, we could begin the next batch immediately rather than waiting out the job delay when no work is being done.
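The empty-batch fast path could be a small scheduling check; again, a sketch under assumed names (`affected_rows` and `job_delay` are hypothetical inputs, not the framework's real interface):

```ruby
# Illustrative only: skip the inter-job delay entirely when the previous
# batch updated nothing, so empty regions of the table are traversed quickly.
def delay_before_next_batch(affected_rows, job_delay)
  affected_rows.zero? ? 0 : job_delay
end
```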