Geo: Fix sync failure retry backoff
## What does this MR do and why?
Fixes sync failure retry exponential backoff.
Blobs are not affected since none of them are "mutable", but I made the same changes for them for consistency (and to avoid adding overrides, since they reuse the same Scheduler Worker code) and for future safety.
Resolves #469587
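As a rough illustration of the bug class this MR fixes, here is a minimal sketch of exponential retry backoff in plain Ruby. The function name, base delay, and cap are hypothetical (not GitLab's actual implementation): the point is that the delay only grows if `retry_count` actually increments on each failed sync.

```ruby
# Hypothetical cap on how long a failing registry row backs off.
MAX_BACKOFF_SECONDS = 8 * 60 * 60

# Sketch of exponential backoff: delay doubles with each recorded
# failure, capped at MAX_BACKOFF_SECONDS. If retry_count is stuck at 1
# (the bug), the delay never grows and the row is retried on every
# scheduler run.
def next_retry_delay(retry_count)
  [60 * (2**retry_count), MAX_BACKOFF_SECONDS].min
end
```

With the bug, `next_retry_delay(1)` is computed on every failure; with the fix, successive failures walk up the curve until the cap is reached.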
## MR acceptance checklist
Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
## Screenshots or screen recordings
**Before:** A registry row that is persistently failing to sync always has `retry_count: 1`.

**After:** A registry row that is persistently failing to sync increments `retry_count` on each failed attempt.
## How to set up and validate locally
1. Set up Geo.
2. Cause a persistent sync failure. For example, on the primary GDK:

   ```shell
   gdk stop rails-web
   ```

3. Open a Rails console on the secondary GDK and trigger the first failed sync:

   ```ruby
   r = Geo::ProjectRepositoryRegistry.first
   r.replicator.resync
   ```

4. On the secondary site, tail the Geo log and wait:

   ```shell
   tail -f /path/to/gdk/gitlab/log/geo.log
   ```

5. Notice that on the `master` branch, the repo gets resynced every time `RepositoryRegistrySyncWorker` runs. In the Rails console you can look at the registry and see that `retry_count` doesn't change after multiple syncs.
6. Notice that on this branch, the repo gets resynced a few times, but after 5 minutes or so it no longer gets resynced on every `RepositoryRegistrySyncWorker` run. In the Rails console you can look at the registry and see that `retry_count` has increased to, say, 6 or so.
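The before/after difference in the validation steps above can be sketched as a small simulation in plain Ruby. `FakeRegistry` and its methods are hypothetical stand-ins for a Geo registry row, not GitLab code; they only illustrate why a stuck `retry_count` defeats the backoff.

```ruby
# Hypothetical stand-in for a Geo registry row, illustrating the
# behavior observed in the validation steps above.
class FakeRegistry
  attr_reader :retry_count, :retry_delay

  def initialize
    @retry_count = 0
    @retry_delay = 0
  end

  # Buggy behavior (master): retry_count is overwritten with 1 on every
  # failure, so the backoff delay never grows.
  def record_failure_buggy
    @retry_count = 1
    @retry_delay = 60 * (2**@retry_count)
  end

  # Fixed behavior (this branch): retry_count increments on each
  # failure, so the exponential backoff actually takes effect.
  def record_failure_fixed
    @retry_count += 1
    @retry_delay = 60 * (2**@retry_count)
  end
end
```

After five simulated failures, the buggy row still reports `retry_count == 1` (resynced on every scheduler run), while the fixed row reports `retry_count == 5` with a correspondingly longer delay.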