Allow sidekiq deployment to have differing timeouts
Summary
Currently, the shutdown timeout for a sidekiq process can only be configured globally, across all sidekiq deployments. The default value is 5 seconds.
GitLab.com would like to raise the sidekiq timeout above the default of 5 seconds for specific types of workloads. Use this issue to enable individual configuration of the SIDEKIQ_TIMEOUT on a per pod or per Deployment basis.
Details
Currently there is no protection if a user sets the timeout to a value higher than the terminationGracePeriodSeconds. (See also #2557 (closed).) As a result, jobs that continue to run beyond the grace period (30 seconds by default) are forcibly stopped. While GitLab uses the sidekiq reliable fetcher to protect against lost jobs, this makes for a poor user experience.
We also run into situations where jobs that typically take longer than the default of 5 seconds to run may be requeued, only to potentially be picked up by a Pod that is soon to meet the same fate, at which point the job may be requeued again. This again leads to a poor user experience.
Actionable
- Ensure timeout is applicable on a per pods entry basis. (see comment)
- Complete #2557 (closed), ensuring that a check confirms the applicable timeout value is lower than the terminationGracePeriodSeconds.
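The check described in the second item could look roughly like the sketch below. It is only an illustration of the intended rule in Python, not the chart's actual implementation: the `PodEntry` class, its field names, and the `validate_timeouts` helper are all hypothetical, and the defaults of 5 and 30 seconds are taken from the description above.

```python
from dataclasses import dataclass
from typing import List, Optional

# Defaults as described in this issue (assumed, not read from the chart).
DEFAULT_TIMEOUT = 5        # global sidekiq shutdown timeout, in seconds
DEFAULT_GRACE_PERIOD = 30  # terminationGracePeriodSeconds, in seconds


@dataclass
class PodEntry:
    """Hypothetical stand-in for one entry in the sidekiq `pods` list."""
    name: str
    timeout: Optional[int] = None                           # per-pod override
    termination_grace_period_seconds: Optional[int] = None  # per-pod override


def validate_timeouts(
    pods: List[PodEntry],
    global_timeout: int = DEFAULT_TIMEOUT,
    global_grace_period: int = DEFAULT_GRACE_PERIOD,
) -> List[str]:
    """Return one error message per pod whose timeout is not below its grace period."""
    errors = []
    for pod in pods:
        # A per-pod value wins over the global one when it is set.
        timeout = pod.timeout if pod.timeout is not None else global_timeout
        grace = (
            pod.termination_grace_period_seconds
            if pod.termination_grace_period_seconds is not None
            else global_grace_period
        )
        if timeout >= grace:
            errors.append(
                f"pod '{pod.name}': timeout ({timeout}s) must be lower than "
                f"terminationGracePeriodSeconds ({grace}s)"
            )
    return errors


if __name__ == "__main__":
    pods = [
        PodEntry("catchall"),                   # uses both defaults
        PodEntry("long-running", timeout=60),   # exceeds the default grace period
        PodEntry("export", timeout=60, termination_grace_period_seconds=90),
    ]
    for message in validate_timeouts(pods):
        print(message)
```

In this sketch only the `long-running` entry is reported, since its 60 second timeout is not below the 30 second grace period it inherits. A timeout equal to the grace period is treated as invalid here, on the assumption that sidekiq needs some margin to requeue work before the Pod is killed.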