Timeout to prevent long-running batched background migration jobs
Some self-managed customers have reported extremely long-running batched background migration jobs on their installations. This is problematic because it can tie up Sidekiq processing, and it does not match how batched migration scheduling is expected to work.
The duration of the jobs should be less than the interval, so the jobs keep the system busy without stealing all the processing time. In the worst case, they shouldn't exceed the length of time we hold the ExclusiveLease, or we can have multiple jobs executing in parallel.
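The two limits above can be illustrated with a minimal sketch. The method name, keyword arguments, and numbers are hypothetical and are not taken from the GitLab codebase. The sketch also assumes the lease is held at least as long as the interval.

```ruby
# Hypothetical helper: classifies a job's run time against the two limits
# described above. All durations are in seconds.
# Assumption: lease_timeout >= interval.
def classify_job_duration(duration:, interval:, lease_timeout:)
  if duration < interval
    :ok                # the job leaves processing time for other work
  elsif duration <= lease_timeout
    :exceeds_interval  # the job takes all the processing time but still runs alone
  else
    :exceeds_lease     # the lease can expire, so a second job may start in parallel
  end
end

# Illustrative values only: a 2-minute interval and a 5-minute lease.
short_job   = classify_job_duration(duration: 90,  interval: 120, lease_timeout: 300)
slow_job    = classify_job_duration(duration: 200, interval: 120, lease_timeout: 300)
runaway_job = classify_job_duration(duration: 900, interval: 120, lease_timeout: 300)
```

Only the last case allows overlapping jobs, which is why the lease duration is the hard upper bound and the interval is the target.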
The long-running migrations are likely a result of no configured statement_timeout, so problematic queries can run with no limit. We should configure a statement_timeout in the job execution, or find another means to time out long-running jobs and limit execution length.
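One possible shape for the statement_timeout option is sketched below, assuming PostgreSQL and a connection object that responds to transaction and execute. RecordingConnection, with_statement_timeout, and the example table are hypothetical names used only for illustration. The stand-in connection records SQL so the sketch runs without a database.

```ruby
# Stand-in for a database connection so the sketch runs without PostgreSQL.
# It only records the SQL it is given.
class RecordingConnection
  attr_reader :statements

  def initialize
    @statements = []
  end

  def execute(sql)
    @statements << sql
  end

  # Mimics a transaction block by recording BEGIN/COMMIT/ROLLBACK around it.
  def transaction
    execute('BEGIN')
    result = yield
    execute('COMMIT')
    result
  rescue StandardError
    execute('ROLLBACK')
    raise
  end
end

# Hypothetical helper: runs the block with a per-transaction statement timeout.
# In PostgreSQL, SET LOCAL lasts only until the end of the current transaction,
# and a bare integer value for statement_timeout is read as milliseconds.
def with_statement_timeout(connection, timeout_ms)
  connection.transaction do
    connection.execute("SET LOCAL statement_timeout = #{Integer(timeout_ms)}")
    yield
  end
end

connection = RecordingConnection.new
with_statement_timeout(connection, 60_000) do
  # Placeholder for the work a single migration job would do.
  connection.execute('UPDATE example_table SET processed = TRUE WHERE id BETWEEN 1 AND 1000')
end
```

Note that statement_timeout bounds each individual statement, not the job as a whole. A job that issues many statements could still run longer than the interval, so a separate job-level limit may be needed as well.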