Add better signal handling on both Helper and Build containers for the k8s executor in attach mode
What does this MR do?
dumb-init is available on all GitLab Runner Helper images for Linux-like systems. When the containers (Helper and Build) were created, the previous container.Command caused /bin/sh (on build) or /bin/bash (on helper) to run as PID 1, as shown below.
Mem: 4417224K used, 11973080K free, 56248K shrd, 121304K buff, 3351276K cached
CPU: 1% usr 1% sys 0% nic 97% idle 0% io 0% irq 0% sirq
Load average: 0.06 0.25 0.30 2/638 43
PID PPID USER STAT VSZ %VSZ CPU %CPU COMMAND
30 25 root S 1716 0% 0 0% /bin/sh /scripts-25452826-5437135578/step_script
37 0 root S 1688 0% 3 0% sh
25 24 root S 1648 0% 1 0% /bin/sh /scripts-25452826-5437135578/step_script
1 0 root S 1620 0% 1 0% /bin/sh
24 1 root S 1620 0% 0 0% sh -c (/scripts-25452826-5437135578/detect_shell_script /scripts-25452826-5437135578/step_script 2>&1 | tee -a /logs-25452826-5437135578/output.log) &
43 37 root R 1612 0% 0 0% top
36 30 root S 1608 0% 0 0% sleep 120
26 24 root S 1604 0% 0 0% tee -a /logs-25452826-5437135578/output.log
This prevents termination signals sent to PID 1 from being propagated to the child processes.
In this MR, dumb-init is copied into the scripts directory in attach mode and then used to launch the shell script when the containers are created. This makes dumb-init PID 1.
Mem: 4436728K used, 11953576K free, 56248K shrd, 121304K buff, 3352880K cached
CPU: 1% usr 1% sys 0% nic 96% idle 0% io 0% irq 0% sirq
Load average: 0.38 0.16 0.12 1/639 51
PID PPID USER STAT VSZ %VSZ CPU %CPU COMMAND
31 26 root S 1716 0% 2 0% /bin/sh /scripts-25452826-5437650637/step_script
45 0 root S 1688 0% 0 0% sh
26 25 root S 1648 0% 1 0% /bin/sh /scripts-25452826-5437650637/step_script
7 1 root S 1620 0% 2 0% /bin/sh
25 1 root S 1620 0% 0 0% sh -c (/scripts-25452826-5437650637/detect_shell_script /scripts-25452826-5437650637/step_script 2>&1 | tee -a /logs-25452826-5437650637/output.log) &
51 45 root R 1612 0% 2 0% top
37 31 root S 1608 0% 0 0% sleep 120
27 25 root S 1604 0% 2 0% tee -a /logs-25452826-5437650637/output.log
1 0 root S 220 0% 1 0% /scripts-25452826-5437650637/dumb-init -- sh -c if [ -x /usr/local/bin/bash ]; then exec /usr/local/bin/bash elif [ -x /usr/bin/bash ]; then exec /usr/bin/bash elif [ -x /bin/bash ]; then exec /bin/bash elif [ -x /usr/local/bin/sh ]; then exec /usr/local/bin/sh elif [ -x /usr/bin/sh ]; then exec /usr/bin/sh elif [ -x /bin/sh ]; then exec /bin/sh elif [ -x /busybox/sh ]; then exec /bus ...
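The PID 1 entry in the listing above has the shape `<scripts dir>/dumb-init -- <shell command>`. A minimal sketch of how such a command line could be assembled is shown below; `wrapWithDumbInit` is a hypothetical helper and the scripts directory and shell command are placeholder values, so this is not the code added by this MR.

```go
package main

import (
	"fmt"
	"path"
)

// wrapWithDumbInit (hypothetical helper) prefixes a container command with the
// dumb-init binary copied into scriptsDir, so that dumb-init becomes PID 1 and
// the original command runs as its child.
func wrapWithDumbInit(scriptsDir string, shellCmd []string) []string {
	// path (not path/filepath) is used because container paths are
	// slash-separated regardless of the host platform.
	wrapped := []string{path.Join(scriptsDir, "dumb-init"), "--"}
	return append(wrapped, shellCmd...)
}

func main() {
	// Placeholder values; the real directory name and shell detection
	// script are generated by the runner.
	cmd := wrapWithDumbInit("/scripts-0-0", []string{"sh", "-c", "exec /bin/sh"})
	fmt.Println(cmd)
}
```

The `--` separator marks the end of dumb-init's own options, so everything after it is treated as the command to run.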
A few tests were run with alpine and ubuntu images, and the jobs behave as expected.
The feature is currently hidden behind the feature flag FF_USE_DUMB_INIT_WITH_KUBERNETES_EXECUTOR
Left to do:
- More tests with a strict securityContext (Pod/container) ==> Need to generate GitLab Runner UBI Images for those tests
- Exec Mode support (?)
- Use of random images
Exec mode support will be added in a follow-up MR
Why was this MR needed?
Enable better handling of termination signals for the Kubernetes executor.
What's the best way to test this MR?
- k8s integration tests passing locally (see this comment #36827 (comment 1619252629) for exceptions)
- Any random job which used to pass should still pass