Do not failover to outdated replicas

Sami Hiltunen requested to merge smh-failover-only-if-useful into master

When the primary was scoped to a virtual storage, it made sense to fail over to another node immediately. Doing so let us accept writes again for the repositories that had an up to date replica on the new primary, while repositories whose replica on the new primary was outdated remained unwritable. For the repositories that became writable, this was a clear win. For the repositories with an outdated copy on the new primary, it slowed down the return to accepting writes: even if the old primary came back, we'd now have to wait for a replication job to be applied.

With repository-specific primaries, we don't have to do this anymore. We can check per repository whether there is a fully up to date, healthy replica available to act as the new primary, and fail over only if there is. This minimizes unnecessary primary changes and speeds up recovery: when the previous primary eventually comes back, we can simply keep using it as the primary.
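The per-repository check described above can be sketched roughly as follows. This is an illustration only, not the code in this MR: the `Replica` type, the `nextPrimary` function, the generation counter, and the storage names are all hypothetical, and the sketch only shows the decision rule of keeping the current primary unless a healthy, fully up to date replica exists.

```go
package main

import "fmt"

// Replica is a hypothetical, simplified view of one copy of a repository.
type Replica struct {
	Storage    string
	Healthy    bool
	Generation int // higher means more writes have been applied
}

// nextPrimary returns the storage that should act as the primary.
// The current primary is replaced only if it is unhealthy AND some
// other replica is healthy and on the latest generation.
func nextPrimary(current Replica, others []Replica, latest int) string {
	if current.Healthy {
		return current.Storage
	}
	for _, r := range others {
		if r.Healthy && r.Generation == latest {
			return r.Storage
		}
	}
	// No useful candidate: keep the old primary so it can resume
	// serving writes as soon as it comes back.
	return current.Storage
}

func main() {
	down := Replica{Storage: "gitaly-1", Healthy: false, Generation: 5}

	outdated := []Replica{{Storage: "gitaly-2", Healthy: true, Generation: 4}}
	fmt.Println(nextPrimary(down, outdated, 5)) // prints gitaly-1

	current := []Replica{{Storage: "gitaly-2", Healthy: true, Generation: 5}}
	fmt.Println(nextPrimary(down, current, 5)) // prints gitaly-2
}
```

In the first call, failing over would not make the repository writable, so the primary is left unchanged.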

Additionally, the repository-specific primary elector previously never used unassigned replicas as primaries. We generally don't want to, as unassigned replicas are considered extra copies that should be removed. However, if an unassigned replica is the only up to date replica, using it as the primary minimizes the time during which the repository can't accept writes. This MR therefore considers up to date, healthy, unassigned replicas valid primaries when there are no up to date, healthy, assigned replicas. The logic also implies that as soon as an assigned replica can act as the primary, the unassigned primary is demoted and the assigned replica promoted. While this scenario is not very common yet, it will become more common when assignments are shuffled around: the unassigned replica can act as the primary until the repository has been moved to the new storage node and that node is ready to act as the primary. This behavior allows us to rebalance the storages later without any interruptions to write availability.
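The preference order between assigned and unassigned replicas could look something like the sketch below. It is an assumption-laden illustration rather than the elector's actual code: the `Candidate` type, the `pickPrimary` function, and the storage names are hypothetical.

```go
package main

import "fmt"

// Candidate is a hypothetical summary of one replica's state.
type Candidate struct {
	Storage  string
	Healthy  bool
	UpToDate bool
	Assigned bool
}

// pickPrimary prefers a healthy, up to date, assigned replica. If none
// exists, it falls back to a healthy, up to date, unassigned replica.
// It returns "" when no replica is a valid primary.
func pickPrimary(candidates []Candidate) string {
	fallback := ""
	for _, c := range candidates {
		if !c.Healthy || !c.UpToDate {
			continue
		}
		if c.Assigned {
			return c.Storage
		}
		if fallback == "" {
			fallback = c.Storage
		}
	}
	return fallback
}

func main() {
	// During a rebalance: the new assigned node has not caught up yet.
	moving := []Candidate{
		{Storage: "old-node", Healthy: true, UpToDate: true, Assigned: false},
		{Storage: "new-node", Healthy: true, UpToDate: false, Assigned: true},
	}
	fmt.Println(pickPrimary(moving)) // prints old-node

	// Once replication finishes, the assigned node takes over.
	moving[1].UpToDate = true
	fmt.Println(pickPrimary(moving)) // prints new-node
}
```

Re-running the selection whenever replica state changes gives the demotion behavior described above: the unassigned replica stays primary only for as long as no assigned replica qualifies.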

This MR also extracts the primary validation logic into a view to make it possible to reuse it in other parts of Praefect.

Closes: #3631 (closed)

Edited by Sami Hiltunen
