Draft: GDK docker image build improvements
What does this MR do and why?
PoC for executable gdk image:
- multi-stage docker image which allows building different components in parallel
- add gdk base image, rebuilt on component or dependency changes, and a final gdk image to create the executable image
- final image is executable artifact tagged with a particular commit sha
- pipeline setup uses gdk image using native ci services functionality
- reduced image size:
registry.gitlab.com/gitlab-org/gitlab/gitlab-qa-gdk andrey-gdk-improvements 9c74ce32a005 2 hours ago 8.18GB
registry.gitlab.com/gitlab-org/gitlab/gitlab-qa-gdk master 86f5a5b9ece6 5 days ago 20.4GB
with the image split into base and gdk, the image is larger because the go cache cannot be cleared properly:
registry.gitlab.com/gitlab-org/gitlab/gitlab-qa-gdk 2d3041bd210ad7868083659460e7700506b86013 4f0dd428355c 22 minutes ago 9.8GB
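
To put the sizes listed above in relative terms, here is a small bash sketch that computes the reduction against the 20.4GB master image. The helper name size_reduction_pct is made up for illustration and is not part of this MR; the numbers are the ones reported above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the MR): percentage by which NEW is smaller than OLD.
# Both arguments are sizes in the same unit (GB here).
size_reduction_pct() {
  # LC_ALL=C keeps "." as the decimal separator regardless of locale.
  LC_ALL=C awk -v old="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (old - new) / old * 100 }'
}

# Sizes taken from the image listings above.
echo "single image: $(size_reduction_pct 20.4 8.18)% smaller than master"
# prints: single image: 59.9% smaller than master
echo "base + gdk split: $(size_reduction_pct 20.4 9.8)% smaller than master"
# prints: base + gdk split: 52.0% smaller than master
```

So the base/gdk split gives back roughly 8 percentage points of the saving compared to the single image.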
Issues/Improvements
- Creating/fetching/uploading cache takes a lot of time due to the complex dockerfile and many steps/stages. It might be worth exploring building with plain buildkit, as it allows using a newer version and from my experience handles caching better, but we lose the ability to build for arm architecture (we install latest qemu via docker container). Updating docker and buildx might also bring some improvements, but in the end the network seems to be the bottleneck: it takes a while to upload the image. Another option is to build the base image only on master runs and just rerun gdk install, but this makes the setup not very portable and "correct"
- gdk is not very well designed to run different stages together and then combine them (for example gitaly is always rebuilt just when running the db task, which forces passing all the gitaly build deps between images and makes the image 700mb larger)
- Couldn't make workhorse work properly when binding to the 0.0.0.0 IP address; nginx fixes the issue but adds another dependency (though it also adds the possibility to set up https)
- There are around 2gb of gems in the image; identifying which ones are not runtime dependencies could reduce the size. development and test gems are probably also not necessary, but dropping them might require running gdk in production mode.
- Currently the spec folder has to be included in the image, which is almost 100mb, because some of the rake tasks fail to load if the spec folder is not present. There are guards that check if the environment is production and skip loading those tasks, so if we can run gdk in production mode, less code needs to be copied
- GDK is slow to boot, about 3 minutes, which will fail the integrated ci healthcheck for the service (30s timeout)
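
One possible way around the roughly 3 minute boot vs. the 30s service healthcheck is to poll GDK from the job script until it responds. A minimal bash sketch of that idea, not what this MR implements: the wait_until helper and the GDK_URL variable are hypothetical names, and the 300s/5s values are only assumptions derived from the boot time above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the MR).
# wait_until TIMEOUT INTERVAL CMD...: rerun CMD every INTERVAL seconds until it
# succeeds or TIMEOUT seconds have passed. Returns 0 on success, 1 on timeout.
wait_until() {
  local timeout=$1 interval=$2
  shift 2
  # SECONDS is bash's built-in counter of seconds since the shell started.
  local deadline=$(( SECONDS + timeout ))
  until "$@"; do
    if (( SECONDS >= deadline )); then
      echo "timed out after ${timeout}s waiting for: $*" >&2
      return 1
    fi
    sleep "$interval"
  done
}

# Example use in a job script (GDK_URL is an assumed variable pointing at the service):
#   wait_until 300 5 curl --silent --fail --output /dev/null "$GDK_URL"
```

Passing the check as a command keeps the helper independent of how readiness is actually probed, so the curl call can be swapped for whatever check turns out to be reliable.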
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.
Edited by Andrejs Cunskis