Praefect: handling of stale replication jobs (!2322) · Merge requests · GitLab.org / gitaly

Pavlo Strokov requested to merge ps-process-stale-replication into master Jun 25, 2020

New background job to move stale replication events to the next state: 'failed' or 'dead'. The event considered stale if the processing of the job takes too long without any health-pings. Health-ping is defined by the 'triggered_at' column of the 'replication_queue_job_lock' that exists only if replication event is 'in_progress' state. The check executed periodically and starts with start of the service. The check covers all storages at once. Check stops when the passed in context is cancelled/expired.

Closes: #2873 (closed) Follow up of the: !2321 (merged)

Edited Jul 03, 2020 by Pavlo Strokov

Praefect: handling of stale replication jobs

Merge request reports