Add prefix to CI Job Tokens
Proposal
Add a prefix to CI Job tokens. Much like Personal Access Tokens with the glpat- prefix, adding a prefix to CI job tokens would make secret detection and incident response more effective.
Current behaviour
When a pipeline job is about to run, GitLab generates a unique token and injects it as the CI_JOB_TOKEN predefined variable.
The token has the same permissions to access the API [only on specific endpoints] as the user that caused the job to run.
The token is valid only while the pipeline job runs. After the job finishes, you cannot use the token anymore.
https://docs.gitlab.com/ee/ci/jobs/ci_job_token.html
In practice a CI_JOB_TOKEN can live for up to one month, if the project is configured with the maximum build timeout and the job runs that long. The default timeout is 1 hour. Jobs handled by SaaS runners on GitLab.com time out after 3 hours, regardless of the timeout configured in a project.
If a threat actor is able to steal or find a leaked still-active token, they can perform the actions listed in the docs. Some high level items:
- Clone a private repo
- The attack described here (currently confidential)
- Read / publish packages
- Read artifacts, plans, environments, etc
- (Maybe) Trigger another pipeline and get another CI_JOB_TOKEN
Current Format
Currently CI_JOB_TOKEN is constructed in the form "#{partition_id.to_s(16)}_#{Devise.friendly_token}":
- Partition ID could be any number, rendered in hex. E.g. partition 123 -> 7b or 321 -> 141.
  - The design blueprint indicates this could be up to four characters / max 65535.
  - A spec helper indicates it could be up to 99999 -> 1869f.
- Devise friendly token is (pretty much) ^[\w-]{20}$
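To make the current format concrete, here is a minimal Ruby sketch of the construction described above. It assumes a stand-in for Devise.friendly_token (the helper `friendly_token_stand_in` is hypothetical and only approximates Devise's output with 20 URL-safe characters) so that it runs without the devise gem; `current_format_token` is likewise an illustrative name, not GitLab code.

```ruby
require "securerandom"

# Stand-in for Devise.friendly_token (an approximation, so this sketch runs
# without the devise gem): returns 20 URL-safe characters.
def friendly_token_stand_in(length = 20)
  SecureRandom.urlsafe_base64((length * 3) / 4)[0, length]
end

# Builds a token in the current "<hex partition id>_<random part>" form.
def current_format_token(partition_id)
  "#{partition_id.to_s(16)}_#{friendly_token_stand_in}"
end

# Hex renderings of the partition IDs mentioned above.
puts 123.to_s(16)    # 7b
puts 321.to_s(16)    # 141
puts 65_535.to_s(16) # ffff
puts 99_999.to_s(16) # 1869f

# Example token for partition 123; the random part differs on every run.
puts current_format_token(123)
```

The hex renderings show why the partition component is between one and five characters for the ranges quoted above.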
Proposed Format
glcbt- for GitLab Ci Build Token.
We use the term build token because that's what the model is (app/models/ci/build.rb). An alternative is to use gljt for job token, which is a common name for this record, e.g. in the predefined job variables. But since the prefix is for detection by automated means and not really supposed to be user-friendly per se, sticking to the model abbreviation convention seems like a good idea. See also: https://docs.gitlab.com/ee/development/secure_coding_guidelines.html#token-prefixes
The resulting detection regex would be /^glcbt-[\h]{1,5}_[\w-]{20}$/
glcbt is not being used in gitlab-org (which includes gitlab-org/security-products/analyzers/secrets).
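As a quick sanity check of the proposed detection regex, the Ruby sketch below applies it to a few sample strings. The sample values are made up for illustration and are not real tokens; the constant name `PROPOSED_TOKEN_REGEX` is also just an illustrative choice.

```ruby
# Regex copied from the proposal. Note that in Ruby ^ and $ anchor to line
# boundaries; \A and \z would anchor to the whole string instead.
PROPOSED_TOKEN_REGEX = /^glcbt-[\h]{1,5}_[\w-]{20}$/

# Hypothetical sample values, made up for illustration (not real tokens).
prefixed   = "glcbt-7b_#{'a' * 20}" # proposed format, partition 123
unprefixed = "7b_#{'a' * 20}"       # current format, no prefix
too_short  = "glcbt-7b_#{'a' * 19}" # random part is one character short

puts PROPOSED_TOKEN_REGEX.match?(prefixed)   # true
puts PROPOSED_TOKEN_REGEX.match?(unprefixed) # false
puts PROPOSED_TOKEN_REGEX.match?(too_short)  # false
```

A scanner searching free text (logs, commits) would likely need a variant without the ^ and $ anchors, since a leaked token rarely sits alone on a line.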
Risks of making a change
- Breaking partitioning / uniqueness across partitions
  - Likelihood: Nil. See #426137 (comment 1698221334)
  - Impact: TBD - corrupt database perhaps, incorrect distribution across partitions, broken builds...?
  - Mitigation: testing
  - (Alternative, not desirable) If needed we could break from the pattern of other token prefixes and instead put the new prefix in the middle: PARTITION_PREFIX_glcbt_TOKEN. That should still be detectable given we have a static component in there.
- Breaking CI jobs
  - Likelihood: Low. Unless there's something that breaks due to increased length, any GitLab consumer of CI_JOB_TOKEN should be unaffected. (No validations will be added that might reject unprefixed tokens). Nothing seemed to break when the partition prefix was added.
  - Impact: broken builds, customer dissatisfaction, rollbacks, comms
  - Mitigation: use group-based feature flag and test on GitLab.com gitlab-owned groups first (e.g. gitlab-org).
- Breaking third-party systems by adding the prefix
  - Likelihood: Low. This would only occur if third parties (who shouldn't need the CI_JOB_TOKEN anyway?...) assume it takes the current form. But, again, nothing seemed to break when the partition prefix was added in Dec 2022.
- Breaking existing CI_JOB_TOKEN masking
  - Likelihood: TBC
  - Impact: see "current risks" above, but Very Bad™️
  - Mitigation: testing
- Breaking CI jobs by improving in-job masking
  - Mitigation: do in a separate issue, if it's even needed. (CI_JOB_TOKEN is already masked)
- Making it easier for malicious entities to detect and misuse
  - This risk is true for all the other prefixes we've added.
  - The current risk is somewhat greater for CI Job Tokens since the current format is already somewhat predictable
  - Mitigation: we add some frontend detection in the MR that introduces the feature, and we follow closely with issues to update other scanners
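Several of the risks above hinge on code that must keep handling unprefixed tokens while also accepting prefixed ones. The Ruby sketch below shows one way such a consumer could parse both forms and recover the partition ID. It is purely illustrative: `TOKEN_PARTS` and `parse_job_token` are hypothetical names, not existing GitLab code, and the sample tokens are made up.

```ruby
# Hypothetical helper, not GitLab code: splits a job token into its parts,
# accepting both the current form and the proposed glcbt- prefixed form.
TOKEN_PARTS = /\A(?<prefix>glcbt-)?(?<partition>\h{1,5})_(?<random>[\w-]{20})\z/

def parse_job_token(token)
  match = TOKEN_PARTS.match(token)
  return nil unless match

  {
    prefixed: !match[:prefix].nil?,
    partition_id: match[:partition].to_i(16), # hex string back to integer
    random: match[:random]
  }
end

# Made-up sample tokens for partition 123 (hex 7b).
legacy   = parse_job_token("7b_#{'a' * 20}")
prefixed = parse_job_token("glcbt-7b_#{'a' * 20}")

puts legacy[:partition_id]   # 123
puts legacy[:prefixed]       # false
puts prefixed[:partition_id] # 123
puts prefixed[:prefixed]     # true
```

Making the prefix optional in the pattern means tokens issued before a rollout, or for groups where the feature flag is off, parse the same way as prefixed ones.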
TODO
- Understand what parts of the codebase are extracting the partition prefix from the token (if any)
- Validate that adding a prefix to the existing prefix is feasible
- Update Ci::Build prefix
- Create a group-based FF & rollout issue
- Create follow up issues to update our scanners