Fix Runners heartbeat that can result in Runner being considered offline
What does this MR do?
This MR introduces two changes.
Update ONLINE_CONTACT_TIMEOUT of Runner
This increases the timeout within which a Runner is considered online. Two factors affect how often we update the DB entry:
- The Runner request being terminated by Workhorse, which means the Runner waits on a queue notification
- The rate at which the DB column is updated
The timeout for considering a Runner online (by DB) needs to be aligned with these two timeouts; otherwise the Runner can be wrongly assumed to be offline.
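The alignment requirement can be sketched roughly as follows. This is a simplified model, not the code in this MR: all numeric values and the names LONG_POLL_MAX_WAIT, CONTACT_UPDATE_INTERVAL, worst_case_contact_gap and timeout_aligned? are hypothetical, and the value given to ONLINE_CONTACT_TIMEOUT is illustrative only.

```ruby
# Simplified model; every value below is illustrative, in seconds.
LONG_POLL_MAX_WAIT      = 50           # hypothetical: longest a job request may be held before it is terminated
CONTACT_UPDATE_INTERVAL = 55 * 60      # hypothetical: slowest rate at which the DB column is refreshed
ONLINE_CONTACT_TIMEOUT  = 2 * 60 * 60  # illustrative value, not taken from the MR

# Longest time the DB timestamp of a healthy Runner can lag behind,
# assuming the two delays simply add up.
def worst_case_contact_gap(long_poll_max_wait, contact_update_interval)
  long_poll_max_wait + contact_update_interval
end

# A Runner counts as online if its last recorded contact is recent enough.
def online?(contacted_at, now, timeout = ONLINE_CONTACT_TIMEOUT)
  !contacted_at.nil? && (now - contacted_at) < timeout
end

# The online timeout is aligned only if it exceeds the worst-case gap.
def timeout_aligned?(timeout, long_poll_max_wait, contact_update_interval)
  timeout > worst_case_contact_gap(long_poll_max_wait, contact_update_interval)
end
```

With a timeout shorter than the worst-case gap, a healthy Runner whose DB timestamp is simply stale would fail the online? check, which is the false "offline" state described above.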
Any Runner-originated request heartbeats the Runner
Until now, the Runner was heartbeat only when it called jobs/request.
However, in the case of a long-running job, the Runner might be
considered offline when in fact it is processing data.
We should heartbeat the Runner on every communication:
- requesting jobs
- updating status / trace / artifacts: this is introduced by this MR
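A minimal sketch of the idea, assuming a single authentication step shared by all Runner-facing endpoints. The classes SketchRunner and SketchRunnerApi, their methods, and the injected clock are hypothetical and do not mirror the actual GitLab API code.

```ruby
# Hypothetical stand-in for a Runner record; not GitLab's model.
class SketchRunner
  attr_reader :contacted_at

  def initialize
    @contacted_at = nil
  end

  # Record the time of the latest contact.
  def heartbeat(now)
    @contacted_at = now
  end
end

# Hypothetical API facade: every Runner-originated call passes through
# `authenticated`, so every call refreshes the heartbeat.
class SketchRunnerApi
  def initialize(runner, clock)
    @runner = runner
    @clock = clock # callable returning the current time, injected for testability
  end

  def request_job
    authenticated { :no_job }
  end

  def update_job_status(state)
    authenticated { state }
  end

  def append_trace(chunk)
    authenticated { chunk.bytesize }
  end

  def upload_artifacts(name)
    authenticated { name }
  end

  private

  # Heartbeat first, then run the endpoint body.
  def authenticated
    @runner.heartbeat(@clock.call)
    yield
  end
end
```

Placing the heartbeat in the shared step means a Runner that is busy with a long job, and therefore only sends status, trace or artifact updates, still refreshes its last-contact time.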
Does this MR meet the acceptance criteria?
Conformity
- Changelog entry
- Documentation (if required)
- Code review guidelines
- Merge request performance guidelines
- Style guides
- Database guides
- Separation of EE specific content
Links
- Resolves: gitlab-runner#3854 (closed)
- Related to: #19294