Backup container fails to initialize when persistence is enabled
Summary
Enabling task runner persistence causes the backup container started by the cron job to fail with the following error:
Events:
  Type     Reason              Age    From                                                Message
  ----     ------              ----   ----                                                -------
  Normal   Scheduled           2m11s  default-scheduler                                   Successfully assigned default/gitlab-task-runner-backup-1562515200-bsk22 to gke-prod-stage-default-pool-9a7ba391-4dhr
  Warning  FailedAttachVolume  2m11s  attachdetach-controller                             Multi-Attach error for volume "pvc-8855fbde-9ebe-11e9-8147-42010af00050" Volume is already used by pod(s) gitlab-task-runner-544cb695cc-6j5ln
  Warning  FailedMount         8s     kubelet, gke-prod-stage-default-pool-9a7ba391-4dhr  Unable to mount volumes for pod "gitlab-task-runner-backup-1562515200-bsk22_default(a8439528-a114-11e9-8147-42010af00050)": timeout expired waiting for volumes to attach or mount for pod "default"/"gitlab-task-runner-backup-1562515200-bsk22". list of unmounted volumes=[task-runner-tmp]. list of unattached volumes=[task-runner-config task-runner-tmp init-task-runner-secrets task-runner-secrets etc-ssl-certs default-token-gdvwh]
Disabling persistence lets the backup job start successfully, but we are then unable to create a backup. Sometimes the job is canceled abruptly (possibly a timeout, we are not sure) or evicted due to low resources.
Steps to reproduce
- enable task runner persistence
- enable task runner backup cronjob
- trigger backup job
Configuration used
This is the task runner configuration we are using (let me know if you need anything else):
gitlab:
  task-runner:
    backups:
      cron:
        enabled: true
        # schedule is in UTC
        schedule: "0 16 * * *"
      objectStorage:
        backend: gcs
        config:
          gcpProject: ...
          secret: gitlab-storage-config
          key: config
    persistence:
      enabled: true
      size: 50Gi
Current behavior
The backup container fails to initialize with a "Multi-Attach error", as shown in the events above.
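For context, here is a minimal sketch of the kind of claim that would lead to this error. It assumes the chart provisions the task runner volume as a ReadWriteOnce claim backed by a GCE persistent disk, which we have not verified against the chart templates. The claim name below is hypothetical and only the size is taken from our configuration.

```yaml
# Illustrative only: not copied from the chart's rendered manifests.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-task-runner-tmp   # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce               # assumption: volume can be attached to one node at a time
  resources:
    requests:
      storage: 50Gi               # matches persistence.size in our values
```

A ReadWriteOnce volume can be attached to only one node at a time. If the cron job pod shares this claim with gitlab-task-runner-544cb695cc-6j5ln and is scheduled on a different node, the attach would fail in the way the events show.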
Expected behavior
The backup container initializes, starts, and takes a backup successfully.
Versions
- Chart: v2.0.3
- Platform:
  - Cloud: GKE
- Kubernetes:
  - Client: v1.12.9-gke.7 (version.Info{Major:"1", Minor:"12+", GitVersion:"v1.12.9-gke.7", GitCommit:"b6001a5d99c235723fc19342d347eee4394f2005", GitTreeState:"clean", BuildDate:"2019-06-24T19:47:32Z", GoVersion:"go1.10.8b4", Compiler:"gc", Platform:"windows/amd64"})
  - Server: v1.11.10-gke.5 (version.Info{Major:"1", Minor:"11+", GitVersion:"v1.11.10-gke.5", GitCommit:"5aa3a95d828fe45aab3611dfc4ebdc0341fe1507", GitTreeState:"clean", BuildDate:"2019-05-29T17:25:39Z", GoVersion:"go1.10.8b4", Compiler:"gc", Platform:"linux/amd64"})
- Helm: (helm version)
  - Client: v2.13.1 (&version.Version{SemVer:"v2.13.1", GitCommit:"618447cbf203d147601b4b9bd7f8c37a5d39fbb4", GitTreeState:"clean"})
  - Server: v2.13.1 (&version.Version{SemVer:"v2.13.1", GitCommit:"618447cbf203d147601b4b9bd7f8c37a5d39fbb4", GitTreeState:"clean"})
Relevant logs
(Same as above)
Edited by Dominik Montada