Stage-level play of manual jobs may randomly leave some downstream jobs in a skipped state
Summary
When manual jobs are used in a `needs`-based DAG flow, using the stage-level play button can cause improper state transitions and leave some downstream jobs in a skipped state.
Steps to reproduce
- Add the following pipeline definition as `.gitlab-ci.yml` in any CI/CD-enabled project:
image: bash:latest

stages:
  - first-auto
  - second-manual
  - third-auto
  - fourth-manual
  - fifth-manual

first-auto:
  stage: first-auto
  script: echo

second-manual-job-one:
  stage: second-manual
  script: echo
  needs:
    job: first-auto
  when: manual

second-manual-job-two:
  stage: second-manual
  script: echo
  needs:
    job: first-auto
  when: manual

third-auto-job-one:
  stage: third-auto
  script: echo
  needs:
    job: second-manual-job-one

third-auto-job-two:
  stage: third-auto
  script: echo
  needs:
    job: second-manual-job-two

fourth-manual-job-one:
  stage: fourth-manual
  script: echo
  needs:
    - job: second-manual-job-one
    - job: third-auto-job-one
  when: manual

fourth-manual-job-two:
  stage: fourth-manual
  script: echo
  needs:
    - job: second-manual-job-two
    - job: third-auto-job-two
  when: manual

fifth-manual:
  stage: fifth-manual
  script: echo
  needs:
    - job: first-auto
    - job: fourth-manual-job-one
    - job: fourth-manual-job-two
  when: manual
- Visit the created pipeline.
- Wait for the "first-auto" stage job to complete.
- Use the stage-level play icon (not the per-job one) to run both jobs of the "second-manual" stage.
- Observe (this may take 2-4 attempts) that, even after these jobs complete, some jobs in the "third-auto" and "fourth-manual" stages remain in a skipped state.
Note: Sometimes this occurs only after the stage-level play button is clicked for the "fourth-manual" stage, if the randomness has not affected any jobs in prior stages.
Also note: This behavior does not reproduce if you use the per-manual-job play buttons instead of the stage-level ones.
Example Project
Example pipeline where one of the two 'fourth-manual' stage jobs remained skipped and would not auto-transition: https://gitlab.com/gitlab-gold/hchouraria/sample-ci/-/pipelines/375727600
Example pipeline where one of the two 'third-auto' stage jobs remained skipped and would not auto-transition: https://gitlab.com/gitlab-gold/hchouraria/sample-ci/-/pipelines/375730159
What is the current bug behavior?
Some jobs in subsequent stages are randomly left in a skipped state after the stage-level play button is used.
What is the expected correct behavior?
The stage-level play button must behave the same as the job-level play buttons and transition the jobs of subsequent stages correctly every time.
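To make the expected outcome concrete, here is a minimal Python sketch that models the `needs` graph from the pipeline definition in the reproduction steps. It is a simplified model, not GitLab's actual implementation: the function name `expected_states` is hypothetical, every job that runs is assumed to succeed, and a job is treated as skipped whenever any job it needs has not succeeded yet.

```python
# Hypothetical model of the expected outcome; not GitLab's implementation.
# The graph mirrors the `needs` entries of the .gitlab-ci.yml in this report,
# listed in dependency order.
NEEDS = {
    "first-auto": [],
    "second-manual-job-one": ["first-auto"],
    "second-manual-job-two": ["first-auto"],
    "third-auto-job-one": ["second-manual-job-one"],
    "third-auto-job-two": ["second-manual-job-two"],
    "fourth-manual-job-one": ["second-manual-job-one", "third-auto-job-one"],
    "fourth-manual-job-two": ["second-manual-job-two", "third-auto-job-two"],
    "fifth-manual": ["first-auto", "fourth-manual-job-one", "fourth-manual-job-two"],
}

# Jobs declared with `when: manual`.
MANUAL = {
    "second-manual-job-one",
    "second-manual-job-two",
    "fourth-manual-job-one",
    "fourth-manual-job-two",
    "fifth-manual",
}


def expected_states(played):
    """Return the expected state of each job once the `played` manual jobs ran.

    Simplifying assumption: every job that runs succeeds.
    """
    states = {}
    # NEEDS is in dependency order, so a job's needs are always resolved first.
    for job, needs in NEEDS.items():
        if not all(states[n] == "success" for n in needs):
            states[job] = "skipped"  # blocked by an unplayed manual job upstream
        elif job in MANUAL and job not in played:
            states[job] = "manual"  # playable, waiting for its play button
        else:
            states[job] = "success"
    return states


if __name__ == "__main__":
    played = {"second-manual-job-one", "second-manual-job-two"}
    for job, state in expected_states(played).items():
        print(f"{job}: {state}")
```

Under this model, once both "second-manual" jobs have been played, both "third-auto" jobs run and both "fourth-manual" jobs become playable. The bug is that, with the stage-level play button, one of them may instead stay skipped.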
Relevant logs and/or screenshots
Unexpected skipped states that occur at any stage when the stage buttons are used:
Output of checks
This bug happens on GitLab.com.
It was also reported on GitLab 13.12 by a Premium customer via a support ticket: https://gitlab.zendesk.com/agent/tickets/236412 (internal link)
Possible fixes
TBD
Workarounds
- Avoid the stage-level play button and instead trigger each manual job directly with its per-job play button.
- If jobs appear stuck as in the screenshots above, retry (re-run) the corresponding job in the prior stage.
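For pipelines with many manual jobs, the first workaround could be scripted against the GitLab REST API, which offers a per-job play endpoint (`POST /projects/:id/jobs/:job_id/play`). The following is a rough sketch only: it assumes the API endpoint behaves like the per-job play button, which this report has not verified, and the helper names, the `token` parameter, and the GitLab.com base URL are illustrative assumptions.

```python
import json
import urllib.parse
import urllib.request

API = "https://gitlab.com/api/v4"  # assumption: GitLab.com; adjust for self-managed


def project_url(project):
    # A project path must be URL-encoded ("group/project" -> "group%2Fproject").
    return f"{API}/projects/{urllib.parse.quote(str(project), safe='')}"


def play_url(project, job_id):
    return f"{project_url(project)}/jobs/{job_id}/play"


def manual_jobs_in_stage(jobs, stage):
    """Pick the not-yet-played manual jobs of one stage from a jobs API response."""
    return [j for j in jobs if j["stage"] == stage and j["status"] == "manual"]


def play_stage_job_by_job(project, pipeline_id, stage, token):
    """Play each manual job of `stage` individually (requires network access)."""
    headers = {"PRIVATE-TOKEN": token}
    list_url = f"{project_url(project)}/pipelines/{pipeline_id}/jobs?per_page=100"
    request = urllib.request.Request(list_url, headers=headers)
    with urllib.request.urlopen(request) as response:
        jobs = json.load(response)
    for job in manual_jobs_in_stage(jobs, stage):
        play = urllib.request.Request(
            play_url(project, job["id"]), data=b"", headers=headers, method="POST"
        )
        urllib.request.urlopen(play).close()
```

A call might look like `play_stage_job_by_job("gitlab-gold/hchouraria/sample-ci", 375727600, "second-manual", token)`, where `token` is a personal access token with API scope.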