Geo: Avoid getting resources stuck in Queued

What does this MR do and why?

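Avoids getting resources stuck in Queued. As the steps below show, on master a resync requested while the sync service's exclusive lease is held leaves the registry pending with its old last_synced_at, and the registry then stays in Queued. With this change, last_synced_at is nil after the same resync and the registry gets resynced.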

Resolves #427792 (closed)

Note: I didn't add a test for mutable blob replicators because none exist at this time. The fix still applies to any that are added in the future.

How to set up and validate locally

Confirm problem in master

  1. Set up Geo locally.
  2. Check out the master branch.
  3. Stop Sidekiq in the secondary site: gdk stop rails-background-jobs
  4. In Rails console in the secondary site:
    r = Geo::ProjectRepositoryRegistry.synced.first.replicator
    r.registry.pending? # should be false
    r.registry.last_synced_at # should be a datetime
    s = Geo::FrameworkRepositorySyncService.new(r)
    lease = s.exclusive_lease.try_obtain # claim the lease
    r.resync # attempt a resync
    r.registry.pending? # should be true
    r.registry.last_synced_at # should be a datetime
    s.release_lease(lease) # release the lease
  5. In the secondary site, gdk start rails-background-jobs
  6. Wait ~5 minutes to see that the registry is stuck in Queued.
  7. In Rails console in the secondary site, unstick the registry by manually resyncing it:
    r.resync # attempt a resync
    r.registry.pending? # should be false
    r.registry.last_synced_at # should be a datetime
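
The claim-and-release commands in step 4 can be wrapped in a small console helper so that the lease is always released, even if a command in between raises. This is only a sketch: `with_held_lease` is a hypothetical name, not something this MR adds, and it assumes `try_obtain` returns a falsy value when the lease is already held.

```ruby
# Hypothetical console helper (not part of this MR): hold the sync service's
# exclusive lease while the block runs, then always release it.
def with_held_lease(service)
  # Assumption: try_obtain returns a lease token, or a falsy value if the
  # lease is already held by someone else.
  lease = service.exclusive_lease.try_obtain
  raise "could not obtain the exclusive lease" unless lease

  yield
ensure
  # Release only if we actually obtained the lease.
  service.release_lease(lease) if lease
end

# Usage in the secondary site's Rails console:
#   r = Geo::ProjectRepositoryRegistry.synced.first.replicator
#   s = Geo::FrameworkRepositorySyncService.new(r)
#   with_held_lease(s) { r.resync }
#   r.registry.pending?
#   r.registry.last_synced_at
```

Releasing in `ensure` keeps a failed console command from leaving the lease claimed, which would otherwise interfere with the next validation run.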

Confirm fix

  1. Exit the running Rails console: exit
  2. Stop Sidekiq in the secondary site: gdk stop rails-background-jobs
  3. In the secondary GDK/gitlab directory, switch to this MR's branch: git checkout 427792-geo-bandaid-for-registry-rows-stuck-in-sync-state-queued
  4. In Rails console in the secondary site:
    r = Geo::ProjectRepositoryRegistry.synced.first.replicator
    r.registry.pending? # should be false
    r.registry.last_synced_at # should be a datetime
    s = Geo::FrameworkRepositorySyncService.new(r)
    lease = s.exclusive_lease.try_obtain # claim the lease
    r.resync # attempt a resync
    r.registry.pending? # should be true
    r.registry.last_synced_at # should be **nil**
    s.release_lease(lease) # release the lease
  5. Confirm that the registry is pending but not stuck in Queued: last_synced_at is now nil instead of a datetime.
  6. In the secondary site, gdk start rails-background-jobs
  7. Wait ~5 minutes for the project to get resynced. It should disappear from Queued/In Progress in Admin Area > Geo > Sites > Project repositories
  8. In Rails console in the secondary site:
    r.registry.synced? # should be true
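
Instead of waiting a fixed ~5 minutes, the final check can be polled from the console. The helper below is a minimal sketch in plain Ruby; `wait_until` and its `timeout` and `interval` parameters are hypothetical names chosen for illustration, and the usage line assumes the registry is an ActiveRecord model that can be reloaded with `reload`.

```ruby
# Hypothetical console helper (not part of this MR): poll a block until it
# returns a truthy value or the timeout (in seconds) elapses.
def wait_until(timeout: 300, interval: 5)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout

  loop do
    return true if yield
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline

    sleep interval
  end
end

# Usage in the secondary site's Rails console, after restarting Sidekiq:
#   wait_until { r.registry.reload.synced? }
```

The helper returns `true` as soon as the block is truthy and `false` if the timeout passes first, so a `false` result after the default 300 seconds would correspond to the registry still being stuck.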

Related to #427792 (closed)

Edited by Michael Kozono
