Add support for the workhorse GCS client
🏀 Context
In gitlab-org/gitlab!96891 (merged), workhorse was updated so that a Google Cloud Storage client could be set up. This makes uploads more reliable and unblocks bucket encryption. See #4009 (closed).
This configuration should be used in workhorse only when:
- A consolidated object storage configuration is used.
- A Google provider is used.
- One of these parameters is set:
  - google_application_default
  - google_json_key_string
  - google_json_key_location
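The three conditions above can be sketched as a small predicate. This is only an illustration in Python, not the chart's Helm template: the function name `should_render_google_client` and the shape of the `connection` mapping are assumptions made for the example, and "set" is read here as "present and non-empty".

```python
from typing import Any, Mapping

# Parameter names taken from the list above.
GOOGLE_CREDENTIAL_KEYS = (
    "google_application_default",
    "google_json_key_string",
    "google_json_key_location",
)


def should_render_google_client(consolidated: bool, connection: Mapping[str, Any]) -> bool:
    """Return True when all three conditions from the list above hold.

    `consolidated` and `connection` are hypothetical inputs standing in for
    the chart values; the real check lives in a Helm template.
    """
    if not consolidated:
        return False
    if connection.get("provider") != "Google":
        return False
    # At least one credential parameter must be present and non-empty.
    return any(connection.get(key) for key in GOOGLE_CREDENTIAL_KEYS)
```

For example, a consolidated configuration with `provider: Google` and `google_application_default: true` satisfies the predicate, while the same connection without any of the three parameters does not.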
Lastly, note that this part of workhorse is gated behind a feature flag in rails. Basically, rails will instruct workhorse to use either:
- a presigned url (this is what is used today and what is used when the feature flag is disabled)
- the workhorse google cloud storage client (used when the feature flag is enabled).
Since the feature flag is currently disabled by default, this MR will have no impact on uploads.
🔬 What does this MR do?
- Update the workhorse.object_storage.config template so that, if the proper conditions are detected, it generates the correct workhorse configuration file for Google Cloud Storage.
- Update a related spec.
⛓ Related issues
This is the mirror change of this omnibus change: gitlab-org/omnibus-gitlab!6530 (merged)
🤔 How to validate this locally?
As we can see here, we have 3 different settings.
Now, we don't need all 3. It's actually the opposite: only one of them is needed. We thus have 3 configurations to test here.
We're going to need:
- a k8s cluster ready.
- a GCS bucket.
- a google service account that can write to that bucket.
- a google key associated with that service account, in the json format.
To have a look in logs, we use kail.
As we will see, only one of the parameters can be used without updating deployment files (use case
⚗ The testing scenario
We are going to keep it nice and simple and use the generic package registry. Basically, we're going to upload a dummy file to the GitLab generic package registry and assert that workhorse used its google cloud storage client to upload that file to object storage.
- Have a project + personal access token ready.
- Execute (from outside the omnibus instance):
$ curl --upload-file <dummy file> "http://<user>:<pat>@<base_url>/api/v4/projects/<project_id>/packages/generic/my/1.1.2/file.txt"
- Check the workhorse logs ($ tail -f /var/log/gitlab/gitlab-workhorse/current). They should contain a line similar to this one:
default/gitlab-webservice-default-6d9bbc8864-k25zv[gitlab-workhorse]: {"client_mode":"presigned_put","copied_bytes":8,"correlation_id":"01GWVPHHNP1HN3MV5RVV13FS1S","filename":"upload","is_local":false,"is_multipart":false,"is_remote":true,"level":"info","msg":"saved file","remote_id":"1680261826-132-0003-4213-603ba2ba17befa66fde116f5253fcb9e","remote_temp_object":"","time":"2023-03-31T11:23:47Z"}
- This is the proof that the upload was successful. Note the client_mode: it shows that we are not yet using the workhorse GCS client that this MR enables. That's because this decision is made by rails, and it is currently behind a feature flag that is disabled by default.
Another way to confirm that the scenario went ok is to download the file:
$ curl "http://<user>:<pat>@<base_url>/api/v4/projects/<project_id>/packages/generic/my/1.1.2/file.txt"
You should get the file contents back.
🐰 Going further
So you want to use the workhorse gcs client? Fine, let's enable the feature flag:
$ kubectl exec -it <gitlab-webservice pod name> -c webservice -- /bin/bash
(in the container) $ cd /srv/gitlab/
$ ./bin/rails c
irb(main):001:0> Feature.enable(:workhorse_google_client)
irb(main):002:0> exit
$ exit
Try to upload the file with curl again.
This time around, workhorse logs will show this:
default/gitlab-webservice-default-6d9bbc8864-k25zv[gitlab-workhorse]: {"client_mode":"go_cloud:Google","copied_bytes":8,"correlation_id":"01GWVQCNH2MZ5YKR43WD29TBHW","filename":"upload","is_local":false,"is_multipart":false,"is_remote":true,"level":"info","msg":"saved file","remote_id":"1680262715-199-0001-4337-4e293e8145637540b9eb6b965d95ef30","remote_temp_object":"tmp/uploads/1680262715-199-0001-4337-4e293e8145637540b9eb6b965d95ef30","time":"2023-03-31T11:38:36Z"}
Notice the client_mode. It's go_cloud:Google. That means that workhorse used its own GCS client to upload the file.
If you still have doubts, you can always check the bucket on GCS. Your file will be there.
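To avoid eyeballing the logs, the client_mode check can be scripted. The sketch below assumes the kail output format shown above (a "<namespace>/<pod>[<container>]: " prefix followed by a JSON object); the helper name `client_mode` is hypothetical, and the sample lines are shortened versions of the log entries above.

```python
import json
from typing import Optional


def client_mode(log_line: str) -> Optional[str]:
    """Extract client_mode from a workhorse "saved file" log line.

    Returns None for lines that are not JSON or that log another event.
    """
    # kail prefixes each entry; the JSON payload starts at the first brace.
    start = log_line.find("{")
    if start == -1:
        return None
    try:
        entry = json.loads(log_line[start:])
    except json.JSONDecodeError:
        return None
    if entry.get("msg") != "saved file":
        return None
    return entry.get("client_mode")


# Shortened versions of the two log lines shown in this description.
presigned = 'default/gitlab-webservice-default-6d9bbc8864-k25zv[gitlab-workhorse]: {"client_mode":"presigned_put","msg":"saved file"}'
go_cloud = 'default/gitlab-webservice-default-6d9bbc8864-k25zv[gitlab-workhorse]: {"client_mode":"go_cloud:Google","msg":"saved file"}'
```

With the feature flag disabled, `client_mode(presigned)` yields `presigned_put`; once it is enabled, an upload should produce a line for which the helper yields `go_cloud:Google`.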
1️⃣ google_application_default Setting
This configuration is challenging in the sense that the google libraries will check default locations in this mode.
Fortunately, one of these locations is an environment variable. As such, we can configure it and point to the json file.
To keep this simple, we're going to have a k8s secret that is the contents of the google json key file and write that secret to a specific file, then point to that file with the GOOGLE_APPLICATION_CREDENTIALS environment variable.
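Before deploying, it can help to confirm that the variable points at something that looks like a service account key. The following is a rough sketch, not part of this MR: the helper name `check_default_credentials` is hypothetical, and the field names checked ("type", "client_email", "private_key") are assumed from the usual layout of a service account key file.

```python
import json
import os
from typing import List, Mapping


def check_default_credentials(environ: Mapping[str, str] = os.environ) -> List[str]:
    """Return a list of problems found; an empty list means the key looks usable."""
    path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return ["GOOGLE_APPLICATION_CREDENTIALS is not set"]
    if not os.path.isfile(path):
        return [f"{path} is not a file"]
    try:
        with open(path, encoding="utf-8") as handle:
            key = json.load(handle)
    except (OSError, ValueError) as exc:
        # ValueError covers both invalid JSON and invalid UTF-8.
        return [f"cannot read {path}: {exc}"]
    if not isinstance(key, dict):
        return [f"{path} does not contain a JSON object"]
    problems = []
    # Assumed layout of a service account key file.
    if key.get("type") != "service_account":
        problems.append('"type" is not "service_account"')
    for field in ("client_email", "private_key"):
        if not key.get(field):
            problems.append(f'"{field}" is missing')
    return problems
```

Run inside the gitlab-workhorse container, an empty result would suggest that /etc/secret-volume/key was mounted as intended. It does not prove that the service account can write to the bucket.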
- Update charts/gitlab/charts/webservice/templates/deployment.yaml with this:
diff --git a/charts/gitlab/charts/webservice/templates/deployment.yaml b/charts/gitlab/charts/webservice/templates/deployment.yaml
index 95111a72a..58017fd20 100644
--- a/charts/gitlab/charts/webservice/templates/deployment.yaml
+++ b/charts/gitlab/charts/webservice/templates/deployment.yaml
@@ -203,6 +203,8 @@ spec:
               value: '/var/opt/gitlab/templates'
             - name: CONFIG_DIRECTORY
               value: '/srv/gitlab/config'
+            - name: GOOGLE_APPLICATION_CREDENTIALS
+              value: '/etc/secret-volume/key'
             {{- if $.Values.metrics.enabled }}
             - name: prometheus_multiproc_dir
               value: /metrics
@@ -262,6 +264,9 @@ spec:
             - name: webservice-secrets
               mountPath: '/etc/gitlab'
               readOnly: true
+            - name: secret-volume
+              mountPath: /etc/secret-volume
+              readOnly: true
             - name: webservice-secrets
               mountPath: /srv/gitlab/config/secrets.yml
               subPath: rails-secrets/secrets.yml
@@ -359,6 +364,8 @@ spec:
               value: '/var/opt/gitlab/templates'
             - name: CONFIG_DIRECTORY
               value: '/srv/gitlab/config'
+            - name: GOOGLE_APPLICATION_CREDENTIALS
+              value: '/etc/secret-volume/key'
             {{- if .workhorse.sentryDSN }}
             - name: GITLAB_WORKHORSE_SENTRY_DSN
               value: {{ .workhorse.sentryDSN }}
@@ -372,6 +379,9 @@ spec:
             - name: workhorse-secrets
               mountPath: '/etc/gitlab'
               readOnly: true
+            - name: secret-volume
+              mountPath: /etc/secret-volume
+              readOnly: true
             - name: shared-upload-directory
               mountPath: /srv/gitlab/public/uploads/tmp
               readOnly: false
@@ -429,6 +439,9 @@ spec:
       - name: workhorse-config
         configMap:
           name: {{ $.Release.Name }}-workhorse-{{ .name }}
+      - name: secret-volume
+        secret:
+          secretName: google-key-json
       - name: init-webservice-secrets
         projected:
           defaultMode: 0400
- Let's create a rails.gcs.yaml:
provider: Google
google_project: <google project id>
google_application_default: true
- Let's create the object storage secret:
$ kubectl create secret generic gitlab-object-storage --from-file=connection=rails.gcs.yaml
- Let's create a secret with the google key json file:
$ kubectl create secret generic google-key-json --from-file=key=<full path to google key json file>
- Lastly, let's create an additional values.yml file to read that object storage secret (and also disable minio):
global:
  minio:
    enabled: false
  registry:
    bucket: <bucket name>
  appConfig:
    object_store:
      enabled: true
      connection:
        secret: gitlab-object-storage
        key: connection
    lfs:
      bucket: <bucket name>
    artifacts:
      bucket: <bucket name>
    uploads:
      bucket: <bucket name>
    packages:
      bucket: <bucket name>
    backups:
      bucket: <bucket name>
- Let's deploy the gitlab chart with the additional file (we use the "minikube minimum" base):
$ helm upgrade --install gitlab . --timeout 600s -f ./examples/values-minikube-minimum.yaml -f values.yml
Checking the workhorse logs ($ kail -c gitlab-workhorse):
default/gitlab-webservice-default-675c6cddc5-9d46l[gitlab-workhorse]: {"address":"0.0.0.0:8181","level":"info","msg":"Running upstream server","network":"tcp","time":"2023-03-30T13:03:54Z"}
default/gitlab-webservice-default-675c6cddc5-9d46l[gitlab-workhorse]: {"address":"/tmp/gitlab/workhorse.sock","level":"info","msg":"Running upstream server","network":"unix","time":"2023-03-30T13:03:54Z"}
Workhorse booted normally
Let's check its config:
$ kubectl exec -it <gitlab-webservice pod name> -c gitlab-workhorse -- /bin/bash
(inside the gitlab-workhorse container) $ cat /srv/gitlab/config/workhorse-config.toml
We get this config content:
shutdown_timeout = "61s"
[redis]
URL = "redis://gitlab-redis-master.default.svc:6379"
Password = "xxx"
[object_storage]
provider = "Google"
# Google storage configuration.
[object_storage.google]
google_application_default = true
[image_resizer]
max_scaler_procs = 2
max_filesize = 250000
[[listeners]]
network = "tcp"
addr = "0.0.0.0:8181"
The object_storage and object_storage.google sections are properly configured.
The testing scenario is working with this config
2️⃣ google_json_key_string Setting
Alright, this is the easiest configuration to test because it's the one in the charts example file.
Basically, we pass the contents of the google key file.
With a k8s cluster ready (and empty):
- Create a rails.gcs.yaml file with:
provider: Google
google_project: <google project id>
google_json_key_string: |
  <exact contents of the json key file>
- Create a k8s secret out of that file:
$ kubectl create secret generic gitlab-object-storage --from-file=connection=rails.gcs.yaml
- Create an additional values.yml file to read that secret (and also disable minio):
global:
  minio:
    enabled: false
  registry:
    bucket: <bucket name>
  appConfig:
    object_store:
      enabled: true
      connection:
        secret: gitlab-object-storage
        key: connection
    lfs:
      bucket: <bucket name>
    artifacts:
      bucket: <bucket name>
    uploads:
      bucket: <bucket name>
    packages:
      bucket: <bucket name>
    backups:
      bucket: <bucket name>
- Let's deploy the gitlab chart with the additional file (we use the "minikube minimum" base):
$ helm upgrade --install gitlab . --timeout 600s -f ./examples/values-minikube-minimum.yaml -f values.yml
Checking the workhorse logs ($ kail -c gitlab-workhorse):
default/gitlab-webservice-default-745f57c88d-9ck7c[gitlab-workhorse]: {"address":"0.0.0.0:8181","level":"info","msg":"Running upstream server","network":"tcp","time":"2023-03-30T11:59:48Z"}
default/gitlab-webservice-default-745f57c88d-9ck7c[gitlab-workhorse]: {"address":"/tmp/gitlab/workhorse.sock","level":"info","msg":"Running upstream server","network":"unix","time":"2023-03-30T11:59:48Z"}
Workhorse was able to boot normally
Let's check its config:
$ kubectl exec -it <gitlab-webservice pod name> -c gitlab-workhorse -- /bin/bash
(inside the gitlab-workhorse container) $ cat /srv/gitlab/config/workhorse-config.toml
We get this config content:
shutdown_timeout = "61s"
[redis]
URL = "redis://gitlab-redis-master.default.svc:6379"
Password = "xxx"
[object_storage]
provider = "Google"
# Google storage configuration.
[object_storage.google]
google_json_key_string = '''
<exact google key json file contents>
'''
[image_resizer]
max_scaler_procs = 2
max_filesize = 250000
[[listeners]]
network = "tcp"
addr = "0.0.0.0:8181"
That's the expected config for object_storage and object_storage.google.
The testing scenario is working with this config
3️⃣ google_json_key_location Setting
This time around, this value needs to point to the location of the google key json file.
For this, we're going to use the same approach to
- Update charts/gitlab/charts/webservice/templates/deployment.yaml with this:
diff --git a/charts/gitlab/charts/webservice/templates/deployment.yaml b/charts/gitlab/charts/webservice/templates/deployment.yaml
index 95111a72a..58017fd20 100644
--- a/charts/gitlab/charts/webservice/templates/deployment.yaml
+++ b/charts/gitlab/charts/webservice/templates/deployment.yaml
@@ -262,6 +264,9 @@ spec:
             - name: webservice-secrets
               mountPath: '/etc/gitlab'
               readOnly: true
+            - name: secret-volume
+              mountPath: /etc/secret-volume
+              readOnly: true
             - name: webservice-secrets
               mountPath: /srv/gitlab/config/secrets.yml
               subPath: rails-secrets/secrets.yml
@@ -372,6 +379,9 @@ spec:
             - name: workhorse-secrets
               mountPath: '/etc/gitlab'
               readOnly: true
+            - name: secret-volume
+              mountPath: /etc/secret-volume
+              readOnly: true
             - name: shared-upload-directory
               mountPath: /srv/gitlab/public/uploads/tmp
               readOnly: false
@@ -429,6 +439,9 @@ spec:
       - name: workhorse-config
         configMap:
           name: {{ $.Release.Name }}-workhorse-{{ .name }}
+      - name: secret-volume
+        secret:
+          secretName: google-key-json
       - name: init-webservice-secrets
         projected:
           defaultMode: 0400
- Let's create a rails.gcs.yaml:
provider: Google
google_project: <google project id>
google_json_key_location: /etc/secret-volume/key
- Let's create the object storage secret:
$ kubectl create secret generic gitlab-object-storage --from-file=connection=rails.gcs.yaml
- Let's create a secret with the google key json file:
$ kubectl create secret generic google-key-json --from-file=key=<full path to google key json file>
- Lastly, let's create an additional values.yml file to read that object storage secret (and also disable minio):
global:
  minio:
    enabled: false
  registry:
    bucket: <bucket name>
  appConfig:
    object_store:
      enabled: true
      connection:
        secret: gitlab-object-storage
        key: connection
    lfs:
      bucket: <bucket name>
    artifacts:
      bucket: <bucket name>
    uploads:
      bucket: <bucket name>
    packages:
      bucket: <bucket name>
    backups:
      bucket: <bucket name>
- Let's deploy the gitlab chart with the additional file (we use the "minikube minimum" base):
$ helm upgrade --install gitlab . --timeout 600s -f ./examples/values-minikube-minimum.yaml -f values.yml
Checking the workhorse logs ($ kail -c gitlab-workhorse):
default/gitlab-webservice-default-7b65945595-h4r8p[gitlab-workhorse]: {"address":"0.0.0.0:8181","level":"info","msg":"Running upstream server","network":"tcp","time":"2023-03-30T13:25:52Z"}
default/gitlab-webservice-default-7b65945595-h4r8p[gitlab-workhorse]: {"address":"/tmp/gitlab/workhorse.sock","level":"info","msg":"Running upstream server","network":"unix","time":"2023-03-30T13:25:52Z"}
Workhorse was able to boot normally
Let's check its config:
$ kubectl exec -it <gitlab-webservice pod name> -c gitlab-workhorse -- /bin/bash
(inside the gitlab-workhorse container) $ cat /srv/gitlab/config/workhorse-config.toml
We get this config content:
shutdown_timeout = "61s"
[redis]
URL = "redis://gitlab-redis-master.default.svc:6379"
Password = "xxx"
[object_storage]
provider = "Google"
# Google storage configuration.
[object_storage.google]
google_json_key_location = "/etc/secret-volume/key"
[image_resizer]
max_scaler_procs = 2
max_filesize = 250000
[[listeners]]
network = "tcp"
addr = "0.0.0.0:8181"
That's the expected config for object_storage and object_storage.google.
The testing scenario is working with this config
Checklist
See Definition of done.
For anything in this list which will not be completed, please provide a reason in the MR discussion.
Required
- Merge Request Title and Description are up to date, accurate, and descriptive
- MR targeting the appropriate branch
- MR has a green pipeline on GitLab.com
- When ready for review, MR is labeled "~workflow::ready for review" per the Distribution MR workflow
Expected (please provide an explanation if not completing)
- Test plan indicating conditions for success has been posted and passes
- Documentation created/updated
- Tests added
- Integration tests added to GitLab QA
- Equivalent MR/issue for omnibus-gitlab opened
- Validate potential values for new configuration settings. Formats such as integer 10, duration 10s, URI scheme://user:passwd@host:port may require quotation or other special handling when rendered in a template and written to a configuration file.