Link existing LFS objects from parent fork during uploads
What does this MR do and why?
Previously, LFS objects always had to be re-uploaded to a fork even if the parent project had already received them, which wastes time and bandwidth. Consider this sequence of events:
1. Push LFS file test.bin to project A.
2. Fork project A to project B.
3. Push LFS file test2.bin to project A.
4. Push to project B.
When step 4 (the push to project B) happens, GitLab should recognize that test2.bin already exists in the parent project. If the user has access to the parent project, GitLab can link the LFS object from A to B without requesting a re-upload of the file.
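The decision described above can be sketched roughly as follows. This is a minimal illustration, not the code in this MR: `Project`, `readable_by?`, and `oids_requiring_upload` are hypothetical names, and projects are modeled as in-memory structs rather than GitLab's real models.

```ruby
require "set"

# Hypothetical in-memory stand-in for a project; not GitLab's real model.
Project = Struct.new(:name, :lfs_oids, :fork_source, :readers, keyword_init: true) do
  def readable_by?(user)
    readers.include?(user)
  end
end

# Returns the OIDs the client still has to upload. OIDs that the fork source
# already holds are linked into the project, provided the user can read the source.
def oids_requiring_upload(project, user, requested_oids)
  missing = requested_oids.reject { |oid| project.lfs_oids.include?(oid) }
  source = project.fork_source

  if source && source.readable_by?(user)
    linkable = missing.select { |oid| source.lfs_oids.include?(oid) }
    project.lfs_oids.merge(linkable) # link instead of requesting a re-upload
    missing -= linkable
  end

  missing
end

# Project A has both files; its fork B only has the file pushed before forking.
project_a = Project.new(name: "A", lfs_oids: Set["oid-test", "oid-test2"],
                        fork_source: nil, readers: Set["alice"])
project_b = Project.new(name: "B", lfs_oids: Set["oid-test"],
                        fork_source: project_a, readers: Set["alice", "bob"])

p oids_requiring_upload(project_b, "alice", ["oid-test", "oid-test2"])
```

In this sketch a user without read access to the parent project still gets the object back in the "needs upload" list, which mirrors the access condition stated above.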
Relates to #297022 (closed)
How to set up and validate locally
1. Enable the feature flag:
   Feature.enable(:lfs_auto_link_fork_source)
2. Push a large LFS file test.bin to project A.
3. Fork project A to project B.
4. Push an even larger LFS file test2.bin to project A.
5. Push to project B with:
   GIT_CURL_VERBOSE=1 GIT_TRACE=1 GIT_TRACE_PACKET=2 git push <projectB>
The push should finish quickly and not request an upload. The curl
output should show something like:
> POST /root/lfs-upload-1-fork.git/info/lfs/objects/batch HTTP/1.1
> Host: stanhu.gogitlab.com
> Accept: application/vnd.git-lfs+json; charset=utf-8
> Authorization: Basic * * * * *
> Content-Length: 203
> Content-Type: application/vnd.git-lfs+json; charset=utf-8
> User-Agent: git-lfs/2.13.3 (GitHub; darwin amd64; go 1.16.2)
>
{"operation":"upload","objects":[{"oid":"73cd9bfbe4371b4edadc1d154d59363700b54695caa094fab08d63f069f64b87","size":426285634}],"transfers":["basic","lfs-standalone-file"],"ref":{"name":"refs/heads/main"}}
The server should respond with something like:
< HTTP/2.0 200 OK
< Content-Length: 105
< Cache-Control: max-age=0, private, must-revalidate
< Content-Type: application/vnd.git-lfs+json; charset=utf-8
< Date: Mon, 06 Dec 2021 08:23:44 GMT
< Etag: W/"24b2e313e4d9a99dc5fc55aece061b46"
< Page-Title: GitLab
< Permissions-Policy: interest-cohort=()
< Referrer-Policy: strict-origin-when-cross-origin
< Server: nginx
< Strict-Transport-Security: max-age=63072000
< Vary: Accept
< X-Content-Type-Options: nosniff
< X-Download-Options: noopen
< X-Frame-Options: DENY
< X-Permitted-Cross-Domain-Policies: none
< X-Request-Id: 01FP7DEX2XJKGNHPMCM4BFZ56F
< X-Runtime: 0.107383
< X-Ua-Compatible: IE=edge
< X-Xss-Protection: 1; mode=block
<
00:23:44.602169 trace git-lfs: HTTP: {"objects":[{"oid":"73cd9bfbe4371b4edadc1d154d59363700b54695caa094fab08d63f069f64b87","size":426285634}]}
You should not see any "Uploading LFS files" messages, or responses that contain an upload action with a header field, such as:
01:25:37.911564 trace git-lfs: HTTP: {"objects":[{"oid":"a28847c9cff2980c3695ce9d6eee99fa66ce018100f1fe4f5af9a78630899734","size":33847418,"actions":{"upload":{"href":"https://example.com/root/lfs-upload1.git/gitlab-lfs/objects/a28847c9cff2980c3695ce9d6eee99fa66ce018100f1fe4f5af9a78630899734/33847418","header":{"Authorization":"Basic cm9vdDpleUpoYkdjaU9pSklVekk...
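To check a captured batch response without reading the trace by eye, a small helper along these lines may be useful. It is only a sketch: `oids_with_upload_action` is a hypothetical name, the OIDs and URL in the sample payloads are shortened placeholders, and it assumes the usual Git LFS batch behavior where an object the server already has is returned without an `actions` property.

```ruby
require "json"

# Returns the OIDs for which the batch response asks the client to upload data.
# An empty result means no upload was requested.
def oids_with_upload_action(batch_response_json)
  JSON.parse(batch_response_json)
      .fetch("objects", [])
      .select { |object| object.dig("actions", "upload") }
      .map { |object| object["oid"] }
end

# Placeholder payloads shaped like the two traces above.
linked_response = '{"objects":[{"oid":"oid-linked","size":426285634}]}'
upload_response = '{"objects":[{"oid":"oid-upload","size":33847418,' \
                  '"actions":{"upload":{"href":"https://example.com/placeholder","header":{}}}}]}'

p oids_with_upload_action(linked_response)
p oids_with_upload_action(upload_response)
```

With the feature flag enabled and the object present in the parent project, the push to project B is expected to produce a response of the first kind.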
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.