Improve logs of the Sidekiq reliable fetcher
🔭 What does this MR do and why?
When sidekiq is shutdown (as it was explained to me here a few months ago, not sure if the situation is still the same), sometimes, we will try to re-enqueue jobs into their queues using this method.
As you can see the method will create a log entry. The problem is that, this entry contains not a lot of elements to identify the job. We do have the jid
and queue
but that doesn't help a lot.
The problem is that it's not exactly clear which job (class) has been re enqueued. Let's say that we want to run an analysis on these logs to know which job get interrupted the most of the time. It's impossible.
This MR proposes to add the class
field to the log entry, increasing the observability of this aspect.
🦄 Screenshots or screen recordings
Well, sidekiq, background jobs... Not really related to screens
⚙ How to set up and validate locally
(credits to this comment)
- Update
config/initializers/sidekiq.rb
to force a higher rate of jobs clean up
diff --git a/config/initializers/sidekiq.rb b/config/initializers/sidekiq.rb
index 8df12671f265..6a6dabf911ab 100644
--- a/config/initializers/sidekiq.rb
+++ b/config/initializers/sidekiq.rb
@@ -35,6 +35,8 @@ def enable_semi_reliable_fetch_mode?
Sidekiq.configure_server do |config|
config[:strict] = false
config[:queues] = Gitlab::SidekiqConfig.expand_queues(config[:queues])
+ config[:lease_interval] = 5
+ config[:cleanup_interval] = 10
-
Restart the sidekiq process
gdk restart rails-background-jobs
-
I navigated to http://localhost:3000/admin/sidekiq/cron and clicked
Disable All
. This will silence the logs. -
Run
gdk tail rails-background-jobs
to tail logs. -
Enqueue a job using
gdk rails c
, I enqueued a harmless long-running job
[1] pry(main)> Chaos::SleepWorker.perform_async(240)
=> "a382beb0da31c0e9e0e9db6b"
- Use
gdk redis-cli -n 1
and delete the heartbeat so that the cleanup process will think the job is dead.
redis /Users/sylvesterchin/work/gitlab-development-kit/redis/redis.socket[1]> keys *reliable-fetcher-heartbeat-*
1) "reliable-fetcher-heartbeat-SylvestersMBP2.localdomain-67195-4d63cd867499"
redis /Users/sylvesterchin/work/gitlab-development-kit/redis/redis.socket[1]> del reliable-fetcher-heartbeat-SylvestersMBP2.localdomain-67195-4d63cd867499
(integer) 1
- Observe log message:
2023-12-18_02:19:30.50192 rails-background-jobs : {"severity":"INFO","time":"2023-12-18T02:19:30.501Z","message":"Cleaning working queues"}
2023-12-18_02:19:30.50598 rails-background-jobs : {"severity":"INFO","time":"2023-12-18T02:19:30.505Z","message":"Pushed job 794082f4a1847cea9b253ef1 back to queue queue:default","jid":"794082f4a1847cea9b253ef1","class":"Chaos::SleepWorker","queue":"queue:default","retry":0}
🏎 MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
-
I have evaluated the MR acceptance checklist for this MR.