
coordinator: Fix inconsistent repository sizes when using reads distribution

Patrick Steinhardt requested to merge pks-coordinator-force-route-accessors into master

Until now, we've been completely agnostic about which accessor RPC we're routing: they simply all use reads distribution. This works just fine for most of the RPCs, but leads to inconsistent results for some others where it's really hard to fix.

Two such RPCs are RepositorySize and GetObjectDirectorySize: both of them depend on the on-disk state of the repository. This is not something we fully control: first, we do not ensure that replicas always get packed at the same point in time. And second, even if we did, we cannot guarantee that git always ends up with the same set of packfiles. As a result, the likelihood that replicas report different sizes is exceedingly high.

We could probably solve this with custom logic that routes the RPC to all replicas and then takes the mean of all returned sizes. But this is needlessly complex, and would require us to put a lot of RPC-specific behaviour into Praefect. This commit instead introduces a best-effort strategy of always routing both RPCs to the primary. This may still lead to discrepancies when the primary node is flapping, because the reported size then comes from whichever node is currently primary. But that shouldn't happen too frequently, and when it does we probably have bigger problems than repo sizes.
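
To illustrate the strategy described above, here is a minimal sketch in Go. It is not the code from this merge request: the `forcePrimaryRPCs` map, the `routeAccessor` function, the node names, and the full method names are all assumptions made for illustration. It only shows the idea that accessors keep using reads distribution unless they are on a short list of RPCs that are always sent to the primary.

```go
package main

import (
	"fmt"
	"math/rand"
)

// forcePrimaryRPCs lists accessor RPCs whose results depend on the on-disk
// state of the repository. The full method names are assumed for this sketch.
var forcePrimaryRPCs = map[string]bool{
	"/gitaly.RepositoryService/RepositorySize":         true,
	"/gitaly.RepositoryService/GetObjectDirectorySize": true,
}

// routeAccessor returns the node that should serve an accessor RPC.
// pick chooses an index in [0, n) and stands in for whatever node selection
// reads distribution actually uses (hypothetical helper).
func routeAccessor(method, primary string, replicas []string, pick func(n int) int) string {
	// Size-related RPCs always go to the primary so that repeated calls
	// report a consistent value.
	if forcePrimaryRPCs[method] {
		return primary
	}
	// All other accessors may be served by the primary or any replica.
	candidates := append([]string{primary}, replicas...)
	return candidates[pick(len(candidates))]
}

func main() {
	replicas := []string{"gitaly-2", "gitaly-3"}

	// Always routed to the primary, regardless of the selection function.
	fmt.Println(routeAccessor("/gitaly.RepositoryService/RepositorySize", "gitaly-1", replicas, rand.Intn))

	// May be routed to any of the three nodes.
	fmt.Println(routeAccessor("/gitaly.RepositoryService/RepositoryExists", "gitaly-1", replicas, rand.Intn))
}
```

Keeping the exception as a small lookup keeps RPC-specific behaviour in Praefect to a minimum, which is the trade-off argued for above compared to fanning out to all replicas and averaging.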

Edited by Patrick Steinhardt
