Improved Vault integration PoC
What does this MR do?
This is just a POC for the Vault integration, not to be merged. #118624 (closed)
It was proposed here, #28321 (comment 255977128), that the Vault integration should be implemented entirely on the Rails side. I think that the UX proposed in #28321 (closed) to expose a variable to the Runner with the contents of a path
has a few drawbacks for the Rails only approach.
A Vault path
stores a key-value structure, so it can multiple variables in it, which doesn't align with the UI where you can specify only one variable name:
vault kv put secret/password field_1=11111111 field_2=22222222 field_3=33333333 field_4=44444444
vault kv get secret/password
===== Data =====
Key Value
--- -----
field_1 1111111
field_2 2222222
field_3 3333333
field_4 4444444
This MR uses the vault-ruby gem to read the paths. By reading secret/password
, we receive the secrets in the data
field:
[17] pry(main)> puts client.logical.read("secret/password").to_h
{
"auth": null,
"data": {
"field_1": "111111111",
"field_2": "222222222",
"field_3": "333333333",
"field_4": "444444444"
},
"metadata": null,
"lease_duration": 2764800,
"lease_id": "",
"renewable": false,
"warnings": null,
"wrap_info": null
}
The only way for this to work is to have the user to use the same variable name in our UI as the field in the data
structure. The leaves us with a few problems:
- User must define a variable for each
data
key - We must perform a
group by path
on the variables to reduce the number of requests to the Vault server. This is not clear to the user and it might pose problems with the dynamic secret engines:
$ vault read aws/creds/my-role
Key Value
--- -----
lease_id aws/creds/my-role/f3e92392-7d9c-09c8-c921-575d62fe80d8
lease_duration 768h
lease_renewable true
access_key AKIAIOWQXTLW36DV7IEA
secret_key iASuXNKcWKFtbO8Ef0vOcgtiL6knR20EJkJTH8WI
security_token <nil>
For this example the user must define 3 variables with the same path, aws/creds/my-role
, but with access_key
, secret_key
, security_token
as the names. In this case we must perform the group by if we don't want to create 3 different sets of credentials. But this leaves us without a way for the user to inject 2 sets of distinct credentials in the job.
Another problem is that the variables are exposed to all the jobs:
- which might cause problems when using dynamic credentials from Vault. Each job will create credentials even if they are not used during the job execution.
- we are doing unnecessary requests to the Vault, which might slow down the builds.
Proposed syntax
This is why I am proposing to move the secrets definition in the YAML config to allow more flexible configurations:
build_job:
script: make
secrets:
vault:
- key: "secret/password"
prefix: "SOMETHING"
fields:
- name: "field_1"
expose_as: "PASSWORD_1" # SOMETHING_PASSWORD_1
- name: "field_2"
expose_as: "PASSWORD_2" # SOMETHING_PASSWORD_2
- name: "field_3" # SOMETHING_SECRET_PASSWORD_FIELD_3
We choose to export only the first three fields from secret/password
, each with its own name:
- the
prefix:
is optional, withVAULT
being the default value. It can accept""
to remove it. -
fields:
is also optional, we expose all the fields if it's missing -
name:
is mandatory, should be the name from the data structure -
expose_as:
should be optional and be used to construct the variable name.
The variable names are constructed using prefix + expose_as
or prefix + key + field name
if fields:
or expose_as:
are missing.
build_job:
script: make
secrets:
vault:
- key: "secret/password"
This exposes all the fields from secret/password
The implementation:
- We store the Vault token of SSL/TLS client certificate in a Rails model using encrypted attributes
- Vault can be configured at Project -> Settings -> CI/CD -> Vault integration. This is a minimal implementation without frontend validations or styling
- We read the secrets only when the Runner requests a new build and pass them as environment variables
Protected variables and environments
I'm also proposing a feature to mimic the Protected Variables
functionality by using an input in the configuration page(Protected Secrets
screenshot paths
that must be exposed only on protected jobs or on protected jobs and environments.
Example:
no-deployment:
stage: deployment
script:
- printenv | sort | grep "STAGING"
secrets:
vault:
- key: "secret/staging-secret"
staging-deployment:
stage: deployment
script:
- printenv | sort | grep "STAGING"
secrets:
vault:
- key: "secret/staging-secret"
environment:
name: staging
With secret/staging*,staging
as protected configuration in the integration page, only the staging-deployment
job would receive the values stored at secret/staging-secret
Right now the input accepts a regexp key joined by a ,
with the environment name. Different configurations are separated by a new line. There is a proposal to clean this up, but it needs frontend and UX work.
Limits
Fetching all secrets can take up to 15 seconds:
- each request has timeout set to 5 seconds
- there is also a soft timeout for all the requests set to 10 seconds.
Future work
Only the root
tokens can have a TTL of 0, meaning that they do not expire. All other tokens expire after a configured period.
- We can define an API to update the tokens
- We can add support for periodic tokens using a cron task:
At every renewal time, the TTL will be reset back to this configured period, and as long as the token is successfully renewed within each of these periods of time, it will never expire. Outside of root tokens, it is currently the only way for a token in Vault to have an unlimited lifetime.