Improve cache upload speed on high-speed networks
Release notes
- The related merge request that addresses this issue disables transport-layer compression in gitlab-runner cache_client.go
Summary
Well, perhaps not a strict bug, but definitely a major performance degradation that seems unnecessary :-)
I have been trying to diagnose slow cache operations, and after hacking the gitlab-runner-helper binaries to add more timing information I found that the download part of the cache_extractor operations can be extremely slow. This appears to be due to a combination of small buffers and Go's built-in net/http client not being optimized for bandwidth.
Steps to reproduce
- Add timing measurements around the io.Copy() call in the download() function of commands/helpers/cache_extractor.go.
- Use a local S3 server, without SSL (to enable highest-possible performance).
- Downloading a 931 MB cache file with the 'mc' client outside of gitlab-runner takes 1.9 seconds (10 Gb network), so the network and S3 server setup are A-OK.
Actual behavior
- gitlab-runner takes 42 seconds to download the same file.
Expected behavior
- When I hack the code to use the Minio Golang package instead of presigned URLs + HTTP, the download takes 1.9 seconds in gitlab-runner too.
Environment description
- Using our own runners in a k8s cluster
- Ceph-backed radosgw S3 storage in the same cluster
- No SSL to improve bandwidth
Used GitLab Runner version
13.1.0 (development)
Possible fixes
If there's interest, we could probably provide an MR to use Minio instead, although there could be some extra complications in handling GCS natively too.
Edited by Darren Eastman