Skip to content

repository: Manually vote for `FetchRemote()` RPC

Patrick Steinhardt requested to merge pks-fetch-remote-voting into master

When fetching references, git-fetch(1) will by default update each fetched ref in a separate reference transaction. As a conseuence, when fetching thousands of references, it'll be happy to execute the hook thousands of times. It goes without saying that this is expensive even without doing transactional voting, but in combination it's even more so. We've thus turned off transactional voting in the past as performance had been taking a huge hit.

With git v2.31.0, a new --atomic flag was added to git-fetch(1) which causes it to update all references in a single pass, avoiding the problem. But while this flag is usable for RPCs which do force-fetches, it's not suitable for FetchRemote(): it may do unforced fetches, and it may even do mixed fetches where some refspecs are forced, while others are not. And naturally, callers rely on the fact that we update non-divergent refs while keeping divergent ones at their current value. But with --atomic, we'd convert this to an all-or-nothing fetch, where non-divergent refs would not be updated if there is any divergent one. So it's unfortunately a non-startet for FetchRemote().

To still enable transactional semantics, this commit implements manual voting on refs: after the fetch, we iterate over all refs and vote on the resulting hash. This is not an ideal solution as the fetch may have updated only parts of the local ref namespace, and furthermore it's inherently racy. But right now, we're always doing work twice: we first do the fetch on primary and all secondaries. Because none of the nodes are doing a vote, Praefect takes the safe way and schedules replication jobs from primary to secondaries, causing another update of the secondaries. So with this manual vote, we should at least be able to avoid replication for most cases, trading it for a failing RPC in case there was a race.

This is only a rough first iteration: if we see that this is racing too often, there are some refinements we can do: first, we can start only hashing refs which are actually covered by a refspec. And second, we may enable the reference-transaction hook with atomic fetches in cases where we know that either all refspecs are forced ones, or Force flag is set.

Quasi-part of #3426 (closed)

Merge request reports

Loading