repository: Fix ReplicateRepository returning before RPCs have finished
When calling ReplicateRepository()
, we try to both synchronize the
gitattributes file as well as the actual repository contents. This is
being done in parallel via the errgroup
package, which allows us to
run multiple functions in parallel and then retrieve the first error
that was returned by any of them.
One feature that is not obvious is that errgroup.WithContext()
returns
a cancellable context which gets cancelled as soon as any of the invoked
functions returns an error. We're using this feature right now, which
means that if either of the synchronizing functions fails, the other one
will get cancelled.
While it sounds like a logical thing to do, it does have problems in
combination with the gRPC calls. If the context of a gRPC call gets
cancelled, then the call will get aborted. But while there is no
official documentation proving this, it seems like cancellation happens
asynchronously. So while we're cancelling the remote call, we'll return
even though the remote side hasn't yet been cancelled. Which effectively
means that it may still be running on the remote server after
ReplicateRepository()
has returned on the local client.
There doesn't seem to be any way to synchronize on the remote
cancellation, so what this commit instead does is to simply not pass the
cancellable context of the errgroup
to those functions. This has the
downside of not cancelling RPCs even if we know that they'll fail, or
that we don't care for their result. But on the other hand, it fixes
above race where we may modify the repository even after the RPC has
returned. This also fixes a test race in "TestReplicate/BadRepository".