"409 Conflict" causes Runner to not run any jobs, and give up checking for new jobs for half an hour
Status update: 2022-12-04
@stanhu spent quite a bit of time analyzing this issue. As a result, we are proposing closing this issue in 14.6, pending any customer or community member feedback that they are still being negatively impacted by the 409 conflict error.
Here are our findings to date:
- 409 Conflict errors are normal and expected when multiple runners are configured and try to pick up the same job. In our testing to date, we have not been able to reproduce a case where a 409 conflict of this kind delays the execution of jobs in the queue or prevents them from being picked up.
- As stated in the detailed analysis below, the more concurrent runners that are configured and can request a job, the greater the probability of generating the 409 conflict error.
Workaround
- In the extreme case where you're ALWAYS getting 409s once your runner is up and running, this could be caused by a "bad" advertise_address value in the [session_server] section of config.toml. If you're experiencing this and you have [session_server] settings configured (other than the default session_timeout), please comment them out and see if that allows the runner to pick up jobs.
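To make the workaround easier to check, here is a minimal Python sketch that lists which [session_server] keys other than session_timeout are still active in a config.toml. It is a simplified line scanner, not a full TOML parser, and the function name extra_session_server_keys and the sample values are hypothetical, chosen only for illustration.

```python
import re

# Matches table headers such as [session_server] or [[runners]].
SECTION_RE = re.compile(r"^\s*\[+\s*([^\]]+?)\s*\]+\s*(?:#.*)?$")
# Matches the key of a simple "key = value" line.
KEY_RE = re.compile(r"^\s*([A-Za-z0-9_-]+)\s*=")


def extra_session_server_keys(config_text):
    """Return active [session_server] keys other than session_timeout.

    Hypothetical helper: it scans line by line and ignores commented-out
    lines, which is enough for a flat section like [session_server].
    """
    current_section = None
    found = []
    for line in config_text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # commented-out settings no longer apply
        header = SECTION_RE.match(line)
        if header:
            current_section = header.group(1)
            continue
        key = KEY_RE.match(line)
        if key and current_section == "session_server":
            if key.group(1) != "session_timeout":
                found.append(key.group(1))
    return sorted(found)


# Sample input with made-up addresses (assumption, not from the report).
sample = """
concurrent = 10
[session_server]
  session_timeout = 1800
  listen_address = "[::]:8093"
  advertise_address = "runner-host.example:8093"
[[runners]]
  name = "xxx"
"""
print(extra_session_server_keys(sample))
```

An empty list means only the default session_timeout (or nothing) is set, which is the state the workaround asks you to test with.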
What do I do if I am still being impacted by these errors?
- If you are noticing 409 conflict errors and the runner is not reporting new jobs received, this could be expected if there are no jobs pending in the queue for the runner.
- If you are noticing 409 conflict errors while there are pending jobs in the queue and the runner is not asking for them, then this is in fact a problem and will require further investigation.
For the second case, we will need data from your environment in order to debug your specific issue.
Original problem statement:
- A random 409 Conflict for no good reason. It just stopped working.
- After 4 tries, GitLab Runner totally gave up checking for new jobs for half an hour.
- During that "downtime", the runner was reported as active in GitLab UI - "Last contact" read 1/2 minutes all the time.
Jun 13 09:49:43 runner-2.xxx gitlab-runner[46627]: Checking for jobs... received job=528753 repo_url=https://xxx.com/yyy/zzz.git runner=d5abe3f5
Jun 13 09:49:43 runner-2.xxx gitlab-runner[46627]: Checking for jobs... received job=528753 repo_url=https://xxx.com/yyy/zzz.git runner=d5abe3f5
Jun 13 09:50:00 runner-2.xxx gitlab-runner[46627]: Job succeeded duration=17.459582283s job=528753 project=8 runner=d5abe3f5
Jun 13 09:50:00 runner-2.xxx gitlab-runner[46627]: Job succeeded duration=17.459582283s job=528753 project=8 runner=d5abe3f5
Jun 13 09:52:13 runner-2.xxx gitlab-runner[46627]: WARNING: Checking for jobs... failed runner=d5abe3f5 status=409 Conflict
Jun 13 09:52:13 runner-2.xxx gitlab-runner[46627]: WARNING: Checking for jobs... failed runner=d5abe3f5 status=409 Conflict
Jun 13 09:52:25 runner-2.xxx gitlab-runner[46627]: WARNING: Checking for jobs... failed runner=d5abe3f5 status=409 Conflict
Jun 13 09:52:25 runner-2.xxx gitlab-runner[46627]: WARNING: Checking for jobs... failed runner=d5abe3f5 status=409 Conflict
Jun 13 10:19:56 runner-2.xxx gitlab-runner[46627]: Checking for jobs... received job=528783 repo_url=https://xxx.com/aaa/bbb.git runner=d5abe3f5
Jun 13 10:19:56 runner-2.xxx gitlab-runner[46627]: Checking for jobs... received job=528783 repo_url=https://xxx.com/aaa/bbb.git runner=d5abe3f5
root@gitlab-ci-runner-bm-2:~# date
Thu Jun 13 10:20:31 PDT 2019
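The gap in the excerpt above can be measured mechanically. The following is a rough Python sketch, assuming syslog-style lines in the same format as the excerpt. The function name summarize_job_checks is hypothetical, and because syslog timestamps carry no year, the year is passed in as an assumption (2019 here, taken from the date output).

```python
import re
from datetime import datetime, timedelta

# Captures the syslog timestamp and the outcome of a "Checking for jobs..." entry.
CHECK_RE = re.compile(
    r"^(\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) .*"
    r"Checking for jobs\.\.\. (received|failed)(.*)$"
)


def summarize_job_checks(log_lines, year=2019):
    """Count 409 warnings and find the longest pause between job-check entries.

    Returns (conflict_count, longest_gap), where longest_gap is a timedelta.
    Lines that are not job-check entries (e.g. "Job succeeded") are skipped.
    """
    conflicts = 0
    longest_gap = timedelta(0)
    previous = None
    for line in log_lines:
        match = CHECK_RE.match(line.rstrip("\n"))
        if not match:
            continue
        stamp = datetime.strptime(
            f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S"
        )
        if match.group(2) == "failed" and "status=409" in match.group(3):
            conflicts += 1
        if previous is not None:
            longest_gap = max(longest_gap, stamp - previous)
        previous = stamp
    return conflicts, longest_gap
```

The function accepts any iterable of lines, so an open log file can be passed directly. This only measures silence in the logged output; it cannot tell whether jobs were actually pending during the gap.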
Environment description
config.toml contents
concurrent = 10
check_interval = 1
[session_server]
session_timeout = 1800
[[runners]]
name = "xxx"
url = "https://xxx"
token = "xxx"
executor = "docker"
[runners.docker]
tls_verify = false
image = "alpine:latest"
privileged = true
disable_entrypoint_overwrite = false
oom_kill_disable = false
disable_cache = false
volumes = ["/var/run/docker.sock:/var/run/docker.sock", "/cache"]
shm_size = 3221225472
[runners.cache]
Used GitLab Runner version
root@gitlab-ci-runner-bm-2:~# gitlab-runner --version
Version: 11.11.2
Git revision: ac2a293c
Git branch:
GO version: go1.8.7
Built: 2019-06-03T10:57:49+0000
OS/Arch: linux/amd64
root 46627 9.0 0.0 70056 35948 ? Ssl Jun03 1321:53 /usr/lib/gitlab-runner/gitlab-runner run --working-directory /home/gitlab-runner --config /etc/gitlab-runner/config.toml --service gitlab-runner --syslog --user gitlab-runner
I don't have the log output indicating this is the version currently running, but given the build date (2019-06-03T10:57:49+0000) and the process age (Jun03), we can conclude it's the version currently running.
Possible fixes
- Don't throw 409 Conflict at me.
- Don't give up for half an hour when faced with 409 Conflict.
See comment: #4360 (comment 199648120)