Implement Shared Runner Minute Factors
Overview
This proposal addresses:
- https://gitlab.com/gitlab-com/Product/issues/861
- https://gitlab.com/gitlab-com/Product/-/issues/862
- #197368 (closed)
Focus:
- solve the engineering aspect of the story,
- do not try to solve any UX aspects of #197368 (closed),
- make minimal changes.
Proposal
We extend each shared runner configuration with two settings:
- ci_runners.public_projects_minutes_cost_factor: float, column with a default value of 0.0,
- ci_runners.private_projects_minutes_cost_factor: float, column with a default value of 1.0.
Note: we treat internal projects as public, as they are visible to all logged-in GitLab users.
These settings are GitLab.com specific; we do not expose them to on-premise EE installations.
These two settings are configurable for each shared runner and can be modified by a GitLab administrator via the runner administration interface.
The logic for these factors is simple:
- if the project being run is public or internal, we use public_projects_minutes_cost_factor; if the project is private, we use private_projects_minutes_cost_factor,
- the factor describes the relative cost of each physical minute; we use the simple formula: namespace.accumulated_minutes = runner.public_projects_minutes_cost_factor * build.duration (with private_projects_minutes_cost_factor substituted for private projects),
- the factor can be any value from 0 to inf; 0 means that jobs run on a given runner are not accounted,
- this allows us to model a different cost of a single minute for different runners (1.0 for Linux, 2.0 for Windows).
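The selection rule and formula above can be sketched as follows. This is a minimal illustration, not the actual implementation: the Runner class, the function names, and the assumption that the build duration is already expressed in minutes are all hypothetical; only the two column names and their defaults come from the proposal.

```python
from dataclasses import dataclass


@dataclass
class Runner:
    # Column names and defaults follow the proposal; the class is illustrative.
    public_projects_minutes_cost_factor: float = 0.0
    private_projects_minutes_cost_factor: float = 1.0


def cost_factor(runner: Runner, visibility: str) -> float:
    """Pick the factor for a project visibility level."""
    # Internal projects are treated as public, as noted in the proposal.
    if visibility in ("public", "internal"):
        return runner.public_projects_minutes_cost_factor
    if visibility == "private":
        return runner.private_projects_minutes_cost_factor
    raise ValueError(f"unknown visibility: {visibility!r}")


def accounted_minutes(runner: Runner, visibility: str, duration_minutes: float) -> float:
    """Minutes charged to the namespace for a single build."""
    return cost_factor(runner, visibility) * duration_minutes


# Hypothetical runners: Linux uses the defaults, Windows costs twice as much
# for private projects.
linux = Runner()
windows = Runner(private_projects_minutes_cost_factor=2.0)
```

With these values, a 10-minute build of a private project is accounted as 10 minutes on the Linux runner and 20 minutes on the Windows runner, while public and internal projects are not accounted on either.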
Outcome:
- this allows us to model runners that provide unlimited (by minutes) or limited service; effectively we would do #197368 (comment 273073624) in its most minimal form,
- this lets us improve the UX later by changing minutes limits/quotas on GitLab into something like a credits quota, where each job consumes credits.
Proposal 2 – not used
We extend each shared runner configuration with two settings:
- ci_runners.minutes_cost_factor: float, column with a default value of 11.0,
- ci_runners.public_projects_unlimited: bool, column with a default value of true.
These settings are GitLab.com specific; we do not expose them to on-premise EE installations.
These two settings are configurable for each shared runner and can be modified by a GitLab administrator via the runner administration interface.
The logic for these factors is simple:
- if the project is public and ci_runners.public_projects_unlimited is set, we ignore quotas assigned to a given namespace,
- the minutes_cost_factor describes the relative cost of each physical minute; we use the simple formula: namespace.accumulated_minutes = runner.minutes_cost_factor * build.duration,
- the factor can be any value from 0 to inf; 0 means that jobs run on a given runner are not accounted,
- this allows us to model a different cost of a single minute for different runners (1.0 for Linux, 2.0 for Windows).
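For comparison, the logic of this second variant can be sketched in the same way. Again this is only an illustration under assumptions: the Runner class and the function names are hypothetical, the build duration is assumed to be in minutes, and only the two column names come from the proposal.

```python
from dataclasses import dataclass


@dataclass
class Runner:
    # Column names follow the proposal; values must be given explicitly here.
    minutes_cost_factor: float
    public_projects_unlimited: bool


def quota_applies(runner: Runner, visibility: str) -> bool:
    """Return False when the namespace quota is ignored for this build."""
    return not (visibility == "public" and runner.public_projects_unlimited)


def accounted_minutes(runner: Runner, duration_minutes: float) -> float:
    """Minutes charged to the namespace, independent of project visibility."""
    return runner.minutes_cost_factor * duration_minutes


# Hypothetical Windows runner that costs twice as much per physical minute.
windows = Runner(minutes_cost_factor=2.0, public_projects_unlimited=True)
```

The difference from the first proposal is that a single factor always drives accounting, and visibility only decides whether the quota is enforced.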
Outcome:
- this allows us to model runners that provide unlimited (by minutes) or limited service; effectively we would do #197368 (comment 273073624) in its most minimal form,
- this lets us improve the UX later by changing minutes limits/quotas on GitLab into something like a credits quota, where each job consumes credits.
How would we set another limit for all existing namespaces/users?
We can already change the limit over the API. This means we would simply run a script that iterates through all existing namespaces and sets the limit to a given value unless one is already set. This would likely be done by an SRE.
We don't need to make any code changes to support that, as this is a one-time operation that can be done with an external tool.
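Such a one-time script could look roughly like the sketch below. It deliberately does not call any real API: the update callable stands in for whatever API client the operator uses, and the minutes_limit field name, the function name, and the value 2000 are assumptions made for illustration.

```python
from typing import Callable, Iterable, List


def backfill_limit(
    namespaces: Iterable[dict],
    limit: int,
    update: Callable[[int, int], None],
) -> List[int]:
    """Set `limit` on every namespace without one; return the updated ids."""
    updated = []
    for namespace in namespaces:
        # Skip namespaces that already have an explicit limit.
        if namespace.get("minutes_limit") is not None:
            continue
        update(namespace["id"], limit)
        updated.append(namespace["id"])
    return updated


# Stand-in for the real API call: it only records what would be sent.
calls = []
namespaces = [
    {"id": 1, "minutes_limit": None},
    {"id": 2, "minutes_limit": 500},
    {"id": 3},
]
updated = backfill_limit(namespaces, 2000, lambda ns_id, value: calls.append((ns_id, value)))
```

Here only namespaces 1 and 3 are updated; namespace 2 keeps its existing limit.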
Risks
Database update of accumulated minutes
Take a look at UpdateBuildsMinutesService.
Today we update all minutes as part of an async job, accumulating them in the database. Because significantly more projects would be accounted, the rate of updates would increase significantly, which might have some impact on the database. This is unlikely to be a problem, because:
1. we update asynchronously, so the operations can queue up on Sidekiq,
2. the majority of the load still comes from GitLab.org, which would be exempt.
However, we should keep it in mind and, if needed, react by adding a process that aggregates these concurrent updates.
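One possible shape for such an aggregation step, shown only as a sketch: pending updates are summed per namespace so that each namespace needs a single database write per batch. The function name and the (namespace_id, minutes) tuple format are hypothetical and do not reflect how UpdateBuildsMinutesService works today.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def aggregate_updates(updates: Iterable[Tuple[int, float]]) -> Dict[int, float]:
    """Sum pending (namespace_id, minutes) pairs into one total per namespace."""
    totals: Dict[int, float] = defaultdict(float)
    for namespace_id, minutes in updates:
        totals[namespace_id] += minutes
    return dict(totals)


# Four pending updates collapse into two writes, one per namespace.
pending = [(1, 2.5), (2, 1.0), (1, 0.5), (1, 3.0)]
batched = aggregate_updates(pending)
```

The trade-off is that accumulated minutes become slightly stale between batches, in exchange for fewer concurrent writes to the same row.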
Docs
The documentation is handled in a separate issue: #213096 (closed)