GitLab CI should be able to clone from a specific Geo secondary
Description
Some companies can have a lot of load on the primary node from CI cloning, for numerous reasons. This can be even worse when a few projects are hotspots.
One possible use case for Geo could be to work as a "mirror" that lowers the pressure automated builds put on the primary node.
Proposal
Users should be able to define, per project, that builds for that project clone from a specified Geo node instead of only from the primary.
Replication lag could be an issue here, because we could be trying to clone from a remote that is not yet in sync. We can mitigate this by checking for the existence of the target SHA and, if it is not found, falling back to fetching the missing objects from the primary.
This should shift most of the heavy load to the secondary. In situations like GitLab's own build, the pipeline has only a few initial builds, so only those will potentially touch the primary. The next step of the pipeline will probably be served 100% from the secondary, so this is a huge win.
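The fallback described in this proposal could be sketched roughly as follows. This is only an illustration, not how GitLab Runner or Geo implements anything: the function names, the `geo` and `origin` ref namespaces, and the injectable `run` parameter are all hypothetical. The only real commands it relies on are `git fetch` and `git cat-file -e`.

```python
import subprocess
from typing import Callable, Sequence

# A runner takes git arguments and returns the exit code. Injecting it keeps
# the sketch testable without a real repository (hypothetical design choice).
Runner = Callable[[Sequence[str]], int]


def run_git(args: Sequence[str]) -> int:
    """Run `git <args>` in the current working directory; return the exit code."""
    return subprocess.run(
        ["git", *args],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    ).returncode


def has_commit(sha: str, run: Runner) -> bool:
    """Return True if the commit object for `sha` exists in the local repository."""
    # `git cat-file -e` exits with 0 only when the object exists and is valid.
    return run(["cat-file", "-e", f"{sha}^{{commit}}"]) == 0


def fetch_with_fallback(
    sha: str,
    secondary_url: str,
    primary_url: str,
    run: Runner = run_git,
) -> str:
    """Fetch from the Geo secondary first; use the primary only if `sha` is missing.

    Returns "secondary" or "primary" depending on which remote made the target
    SHA available. The ref namespaces below are assumptions for illustration.
    """
    # Most objects should arrive from the secondary, even if it lags behind.
    # A failed fetch is tolerated here because the SHA check below decides.
    run(["fetch", secondary_url, "+refs/heads/*:refs/remotes/geo/*"])
    if has_commit(sha, run):
        return "secondary"

    # The secondary is not yet in sync: fetch from the primary. Objects already
    # present locally are not transferred again, so only the gap is downloaded.
    run(["fetch", primary_url, "+refs/heads/*:refs/remotes/origin/*"])
    if has_commit(sha, run):
        return "primary"

    raise RuntimeError(f"commit {sha} not found on secondary or primary")
```

With this shape, a build would call `fetch_with_fallback(sha, secondary_url, primary_url)` inside an initialized repository and then check out `sha`. The primary is contacted only when the secondary has not yet replicated the target commit.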
Links / references
- Customer ticket: https://gitlab.zendesk.com/agent/tickets/81758
- Discussion in Slack: https://gitlab.slack.com/archives/C32LCGC1H/p1504078299000127
Documentation blurb
Overview
What is it? Why should someone use this feature? What is the underlying (business) problem? How do you use this feature?
Use cases
BigCorp runs lots of automated tests for a single project, with many developers on the team. You don't want CI competing for resources with developers, slowing down both.
Feature checklist
Make sure these are completed before closing the issue, with a link to the relevant commit.
- Feature assurance
- Documentation
- Added to features.yml