Make git-upload-pack use gitaly-hooks for pack-objects
Part of gitlab-com/gl-infra/scalability#807 (closed) and gitlab-com/gl-infra&372 (closed).
Feature flag: upload_pack_gitaly_hooks
Log entries in gitaly_hooks.log look like:
time="2021-01-22T14:26:57+01:00" level=info msg="local git command" args="[pack-objects --revs --thin --stdout --progress --delta-base-offset]"
The PostUploadPack and SSHUploadPack RPCs run git-upload-pack on the Gitaly server. Normally, git-upload-pack then spawns a git-pack-objects process, which generates the packfile data that goes into the response:
sequenceDiagram
participant A as Gitaly (PostUploadPack)
participant B as git-upload-pack
participant C as git-pack-objects
A->>B:fetch request
B->>C:pack request
C->>B:packfile data
B->>A:fetch response
Luckily for us, Git has a configuration option, uploadpack.packobjectshook, that lets us replace git-pack-objects with a custom executable. This is a key part of the cache we are building. In this MR, we do the necessary groundwork to have git-upload-pack spawn gitaly-hooks instead of git-pack-objects. Inside gitaly-hooks we then run git-pack-objects as before; caching will follow in a later MR.
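To make the pass-through idea concrete, here is a minimal sketch of a pack-objects hook in Go. It is not the gitaly-hooks code from this MR; the function name `passthroughCommand` is made up for illustration. It relies on Git's documented behaviour for uploadpack.packobjectshook: Git appends the command it would have run (starting with `git pack-objects`) to the hook's arguments, and treats the hook's stdin and stdout as if they belonged to pack-objects itself.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/exec"
)

// passthroughCommand is a hypothetical helper, not part of Gitaly. It builds
// the command a hook runs when it simply forwards the request. args is
// expected to look like []string{"git", "pack-objects", "--revs", ...},
// which is what Git appends when it invokes the hook.
func passthroughCommand(args []string) (*exec.Cmd, error) {
	if len(args) < 2 || args[1] != "pack-objects" {
		return nil, errors.New("expected: git pack-objects [flags]")
	}

	cmd := exec.Command(args[0], args[1:]...)
	// upload-pack writes the pack request to our stdin and reads the
	// packfile data from our stdout, so wire both straight through.
	cmd.Stdin = os.Stdin
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd, nil
}

func main() {
	cmd, err := passthroughCommand(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	if err := cmd.Run(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

A hook like this could be enabled for a single invocation with `git -c uploadpack.packobjectshook=/path/to/hook upload-pack <repo>`, where the hook path is a placeholder. A cache would slot in where the sketch calls `cmd.Run()`.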
sequenceDiagram
participant A as Gitaly (PostUploadPack)
participant B as git-upload-pack
participant C as gitaly-hooks
participant D as git-pack-objects
A->>B:fetch request
B->>C:pack request
C->>D:pack request
D->>C:packfile data
C->>B:packfile data
B->>A:fetch response
There is a bug in Git that is triggered when a pack-objects hook and partial clone are used at the same time: git#82 (closed). This MR contains workaround code for it. We have submitted a fix to the Git mailing list, but it will take a while to land, and luckily we can work around the problem "outside" Git.