Dependency Proxy uses workhorse for manifest pulls
🐠 Context
The Dependency Proxy acts as a pull-through cache for Docker Hub images. Images are made of two types of files: blobs and manifests. The process to download and store these files is being moved from Rails to Workhorse.
!71890 (merged) moved the logic for pulling blob files to Workhorse and built out the mechanisms necessary to make such requests. This MR moves the manifest downloads to Workhorse.
Here is the sequence of how these files are fetched and cached (stored):
sequenceDiagram
Client->>Workhorse: GET /v2/*group_id/dependency_proxy/containers/*image/manifests/*tag
Workhorse->>Rails: GET /v2/*group_id/dependency_proxy/containers/*image/manifests/*tag
Rails->>Rails: Check DB. Is manifest persisted in cache?
alt In Cache
Rails->>Workhorse: Respond with send-url injector
Workhorse->>Client: Send the file to the client
else Not In Cache
Rails->>Rails: Generate auth token and download URL for the manifest in upstream registry
Rails->>Workhorse: Respond with send-dependency injector
Workhorse->>External Registry: Request the manifest
External Registry->>Workhorse: Download the manifest
Workhorse->>Rails: POST /v2/*group_id/dependency_proxy/containers/*image/manifests/*tag/upload/authorize
Rails->>Workhorse: Respond with upload instructions
Workhorse->>Client: Send the manifest file to the client with original headers
Workhorse->>Object Storage: Save the manifest file with some of its header values
Workhorse->>Rails: Finalize the upload
end
(Thanks to @igor.drozdov for creating this fantastic sequence diagram)
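For reviewers less familiar with the injectors, here is a minimal Ruby sketch of the cache hit/miss branch from the diagram. It is not the controller code in this MR: the helper names (injector, manifest_response), the parameter keys, and the "prefix:base64 JSON" encoding are assumptions made for illustration only.

```ruby
require 'base64'
require 'json'

# Hypothetical helper: encodes an injector value as "<prefix>:<base64 JSON>".
# The encoding and the parameter keys are assumptions, not taken from this MR.
def injector(prefix, params)
  "#{prefix}:#{Base64.urlsafe_encode64(JSON.dump(params))}"
end

# Hypothetical decision step. cached_url is the stored file's URL,
# or nil when the manifest is not in the cache.
def manifest_response(cached_url:, upstream_url:, token:)
  if cached_url
    # In cache: Workhorse only has to stream the stored file to the client.
    injector('send-url', 'URL' => cached_url)
  else
    # Not in cache: Workhorse fetches from upstream with the generated token.
    injector('send-dependency',
             'Url' => upstream_url,
             'Header' => { 'Authorization' => ["Bearer #{token}"] })
  end
end
```

The point of the split is that Rails only decides and authorizes; the file transfer itself happens in Workhorse in both branches.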
🐙 What does this MR do and why?
- We introduce these features behind a feature flag: dependency_proxy_manifest_workhorse. Rollout issue: #344216 (closed)
- We add two new routes. These routes handle the Workhorse-accelerated upload of the manifest files that are being pulled from the external registry (Docker Hub).
- We update the Workhorse code to pass the headers received from the external registry to both the user and Rails when it receives the file. When dealing with manifests, it is important to preserve these headers: the Docker client expects them, and Rails stores them so we can provide them to the Docker client when we serve a cached manifest.
- Note that much of the logic being implemented in the controller in manifest_via_workhorse comes from the DependencyProxy::FindOrCreateManifestService, which will be removed after this feature flag is rolled out.
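To make the header handling in the list above more concrete, here is a small Ruby sketch of picking out the upstream headers worth persisting. The actual forwarding happens in Workhorse, and the list of preserved headers is an assumption here (Content-Type and Docker-Content-Digest are used as plausible examples); preserved_headers is a hypothetical helper.

```ruby
# Assumed subset of upstream headers to keep; the real list may differ.
PRESERVED_HEADERS = %w[Content-Type Docker-Content-Digest].freeze

# Returns only the preserved headers, matching names case-insensitively
# and normalizing them to the canonical spelling above.
def preserved_headers(upstream_headers)
  wanted = PRESERVED_HEADERS.to_h { |name| [name.downcase, name] }
  upstream_headers.each_with_object({}) do |(name, value), kept|
    canonical = wanted[name.to_s.downcase]
    kept[canonical] = value if canonical
  end
end
```

Storing these values alongside the cached file is what would let a later cache hit be served with the same headers the Docker client saw on the first pull.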
🐘 Database
This MR adds a new class method, DependencyProxy::Manifest.find_by_file_name_or_digest. This query is extracted from the existing DependencyProxy::Manifest.find_or_initialize_by_file_name_or_digest and the query itself does not change, so I did not think this warranted a database review. I am happy to include a review if it is deemed necessary.
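The shape of the extraction can be sketched in plain Ruby. This uses an in-memory list instead of the database, and the lookup order (file name first, then digest) is an assumption; ManifestStore and the Manifest struct are hypothetical stand-ins for the model.

```ruby
# Hypothetical stand-in for the model; only the two lookup fields are modeled.
Manifest = Struct.new(:file_name, :digest, keyword_init: true)

class ManifestStore
  def initialize(records = [])
    @records = records
  end

  # Lookup only: returns a record matching the file name, else one
  # matching the digest, else nil. The precedence is an assumption.
  def find_by_file_name_or_digest(file_name:, digest:)
    @records.find { |m| m.file_name == file_name } ||
      @records.find { |m| m.digest == digest }
  end

  # Existing behaviour rebuilt on top of the lookup: returns the match,
  # or a new, unsaved record when nothing is found.
  def find_or_initialize_by_file_name_or_digest(file_name:, digest:)
    find_by_file_name_or_digest(file_name: file_name, digest: digest) ||
      Manifest.new(file_name: file_name, digest: digest)
  end
end
```

Having a lookup that returns nil on a miss is what lets the controller tell "in cache" from "not in cache" without initializing a record.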
🎬 Screenshots or screen recordings
These changes happen on the backend, so there is not much to be seen outside of the logs, but this is what a successful image pull looks like using this feature:
→ docker pull gdk.test:3001/asdfasdfasdf/dependency_proxy/containers/alpine:latest
latest: Pulling from asdfasdfasdf/dependency_proxy/containers/alpine
Digest: sha256:69704ef328d05a9f806b6b8502915e6a0a4faa4d72018dc42343f511490daf8a
Status: Image is up to date for gdk.test:3001/asdfasdfasdf/dependency_proxy/containers/alpine:latest
gdk.test:3001/asdfasdfasdf/dependency_proxy/containers/alpine:latest
💻 How to set up and validate locally
- Follow these docs to set up the Dependency Proxy on your GDK.
- Apply the Workhorse updates by running the following in your GDK root directory:
make gitlab-workhorse-setup && gdk restart gitlab-workhorse
- Create a group and navigate to Packages & Registries -> Dependency Proxy to find the image prefix.
- Log into the Dependency Proxy using a PAT:
docker login gdk.test:3000
username: root
password: <personal_access_token>
- Enable the feature flag in the rails console:
Feature.enable(:dependency_proxy_manifest_workhorse)
- Start viewing the Rails logs.
- Pull an image through the dependency proxy:
# use your image prefix, it should look like
docker pull gdk.test:3000/<group_path>/dependency_proxy/containers/alpine:latest
- The image pull should be successful.
- Looking at the Rails logs, you should see requests for:
Started POST "/v2/<group_path>/dependency_proxy/containers/alpine/manifests/latest/upload/authorize"
Started POST "/v2/<group_path>/dependency_proxy/containers/alpine/manifests/latest/upload"
- Run docker images to find the IMAGE ID of the image you pulled, and then remove it from your local machine's cache:
docker rmi -f 14119a10abf4
- Pull the image again. This time in the Rails logs you should not see the two /upload requests since the manifest will be pulled from the cache. You also should not see any manifests being inserted, but you should see an UPDATE for the existing one.
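If scanning the logs by eye is tedious, the check can be scripted. A rough Ruby sketch, assuming the log lines have the Started POST "..." shape shown in the steps above; upload_requests is a hypothetical helper and not part of this MR.

```ruby
# Returns the request paths of "Started POST" lines that target a manifest's
# /upload or /upload/authorize endpoint. Other lines are ignored.
def upload_requests(log_lines)
  pattern = %r{Started POST "([^"]*/manifests/[^"/]+/upload(?:/authorize)?)"}
  log_lines.filter_map do |line|
    match = line.match(pattern)
    match && match[1]
  end
end
```

Fed the lines logged during the first pull, it should return both upload paths; fed only the lines logged during the second pull, it should return an empty list.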
✏ MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.
Related to #335560 (closed)