Migrate helper image to registry.gitlab.com
Overview
For the Docker and Kubernetes executors, we use a helper image so that we can clone the repository and download/upload artifacts and caches without requiring the user to have any tools installed.
This helper image is hosted on DockerHub, and with the new DockerHub rate limits, pulls of the helper image can fail, as in #27193 (closed). Every job uses up one of the user's available pulls just for the helper image, which is less than ideal since their limit is already very low.
Proposal
- Update our build scripts to build the image for both DockerHub and registry.gitlab.com. This should be done by updating ci/release_docker_images and ci/build_release_windows_images.ps1.
- Change the runner to default to the image in registry.gitlab.com, providing a way for the user to switch back to DockerHub if there is any problem.
- Stop publishing to DockerHub.
Things to consider
- Bandwidth usage for registry.gitlab.com. We are going to start pulling the helper image for every build on GitLab.com and also on every self-hosted instance out there. For example, the number of started jobs on GitLab.com (internal link) over the last 24 hours is 495k, which would mean around 400k pulls just for GitLab.com (keep in mind that this also includes self-hosted runners connected to GitLab.com). Right now GitLab.com is safe from the rate limits because we have a GCP mirror, so this traffic will now go entirely to registry.gitlab.com. I suspect that the number for all the self-hosted instances is going to be much higher than the GitLab.com traffic, so we might be looking at around 1M pulls per day.
- Some users don't have gitlab.com allowed in their egress rules, so we would need to introduce a grace period for the rollout until users update their firewall rules.
- Some countries are blocked by Google, which will prevent pulling the helper image; from what I understand, DockerHub isn't blocked.
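To put the daily pull estimates above in perspective, here is a rough back-of-the-envelope sketch that converts them into average pulls per second. The function name is made up for illustration, and the inputs are only the approximate figures quoted in this issue (~400k and ~1M pulls per day), not measured registry traffic.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_pulls_per_second(pulls_per_day: float) -> float:
    """Convert a daily pull count into an average per-second rate."""
    return pulls_per_day / SECONDS_PER_DAY


# Estimates quoted in this issue (assumptions, not measurements).
gitlab_com = average_pulls_per_second(400_000)
all_instances = average_pulls_per_second(1_000_000)

print(f"GitLab.com only: {gitlab_com:.1f} pulls/s")    # 4.6 pulls/s
print(f"All instances:  {all_instances:.1f} pulls/s")  # 11.6 pulls/s
```

These are daily averages; real traffic is likely to be burstier, so peak rates could be noticeably higher.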
Rollout
Ideally, we should roll out such a big change gradually and allow users to roll forward/backward easily. The GitLab.com shared runner fleet should probably have a different rollout strategy so that we can work with ~"group::package" and the SRE team.
👉 !2540 (merged)
Phase 1
- Publish the helper image to registry.gitlab.com
- Inform users that they can override the helper image
👉 !2554 (merged)
Phase 2
- Introduce a feature flag, OFF by default
  - OFF: use the image from DockerHub
  - ON: use the image from the GitLab registry
- When the feature flag is OFF, we should add a log message to the job trace informing users that we are migrating the helper image.
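The Phase 2 behaviour could be sketched roughly as follows. This is only an illustration of the decision logic, not the runner's actual implementation: the function name, the image references, the notice text, and the rule that an explicit override wins over the flag are all assumptions made for the example.

```python
from typing import List, Optional, Tuple

# Assumed image references, for illustration only.
DOCKERHUB_IMAGE = "gitlab/gitlab-runner-helper"
GITLAB_REGISTRY_IMAGE = (
    "registry.gitlab.com/gitlab-org/gitlab-runner/gitlab-runner-helper"
)

# Hypothetical wording of the job trace message.
MIGRATION_NOTICE = (
    "The helper image is being migrated from DockerHub to registry.gitlab.com."
)


def select_helper_image(
    flag_on: bool, override: Optional[str] = None
) -> Tuple[str, List[str]]:
    """Return the helper image to pull and any messages for the job trace."""
    if override:
        # Assumption: a user-configured helper image takes precedence.
        return override, []
    if flag_on:
        # Flag ON: pull from the GitLab registry.
        return GITLAB_REGISTRY_IMAGE, []
    # Flag OFF: keep pulling from DockerHub, but announce the migration.
    return DOCKERHUB_IMAGE, [MIGRATION_NOTICE]
```

With this shape, Phase 3 would amount to flipping the default value passed as `flag_on`, and Phase 4 to deleting the DockerHub branch.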
👉 #27218 (closed)
Phase 3
- Update the default value of the feature flag to ON. This can be considered a breaking change, so we have to inform users accordingly.
👉 #27219 (closed)
Phase 4
- Remove the feature flag and always default to the GitLab registry
Edited by Steve Xuereb