git: Speed up connectivity checks
When a user pushes commits into a Git repository, Git performs a
connectivity check to verify that it has all objects needed to satisfy
the ref updates. This check can be quite expensive depending on how many
refs the target repository has, because it is implemented as
`git rev-list --not --all`, which loads all objects pointed to by
preexisting refs. In repositories like gitlab-org/gitlab with about 2.3M
refs, this typically takes about 8 seconds. Worse, the user typically
doesn't even know what is going on, because no progress bar is
displayed.
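
To get a feel for the check described above, the following is a minimal sketch that approximates it by hand in a throwaway repository. The repository, the `refs/demo/*` namespace, the ref count of 1000, and the `demo` identity are all hypothetical; the exact arguments Git passes internally may differ from the `rev-list` invocation shown here.

```shell
#!/bin/sh
set -eu

# Hypothetical identity so commits can be created without global config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Throwaway repository that is removed when the script exits.
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
git init --quiet "$repo"
cd "$repo"
git commit --quiet --allow-empty -m "base"
base=$(git rev-parse HEAD)

# Create many refs pointing at the same commit to mimic a ref-heavy repo.
i=0
while [ "$i" -lt 1000 ]; do
    echo "create refs/demo/$i $base"
    i=$((i + 1))
done | git update-ref --stdin

# A "pushed" commit that no ref points at yet.
new=$(git commit-tree "$base^{tree}" -p "$base" -m "pushed")

# Approximate the connectivity check: walk from the new tip and stop at
# anything reachable from preexisting refs. A non-zero exit status would
# indicate that an object is missing.
missing=0
echo "$new" | git rev-list --objects --stdin --not --all --quiet || missing=1
echo "connectivity check exit status: $missing"
```

Prefixing the `rev-list` pipeline with `time` in a repository with many refs should show how the cost of `--not --all` grows with the ref count.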
We have thus upstreamed a set of patches which speed up the connectivity checks. There are two major performance optimizations:
1. We stop sorting inputs in the connectivity check, which gives a
30% speedup.
2. Instead of loading referenced objects via the object database, we
use the commit-graph. This is another 30% speedup.
In total, this reduces the time required from nearly 8 seconds to 3 seconds.
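
The second optimization can presumably only help when the repository actually has a commit-graph. As a hedged sketch, the commands below write and verify one in a throwaway repository; the repository and the `demo` identity are hypothetical, and whether a given server already maintains commit-graphs through its regular housekeeping is an assumption to check locally.

```shell
#!/bin/sh
set -eu

# Hypothetical identity so a commit can be created without global config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Throwaway repository that is removed when the script exits.
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
git init --quiet "$repo"
cd "$repo"
git commit --quiet --allow-empty -m "base"

# Write a commit-graph covering all commits reachable from any ref,
# then check that the resulting file is consistent.
git commit-graph write --reachable
git commit-graph verify

# Without --split, the graph is stored as a single file at this path.
graph_file=.git/objects/info/commit-graph
ls -l "$graph_file"
```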
The patches have been merged into upstream's "next" branch and are thus likely to be part of the next release. Given that the release is still a few months out, this commit backports those patches so that we can reap the benefits earlier.
Closes git#92
Changelog: performance