
Patch Sidekiq::Scheduled::Enq to poll both namespaces

Sylvester Chin requested to merge sc1-patch-scheduled-enq into master

What does this MR do and why?

This MR introduces Gitlab::Patch::SidekiqScheduledEnq to patch Sidekiq::Scheduled::Enq#enqueue_jobs. The patched method polls the scheduled sets via Gitlab::Redis::Queues in addition to Sidekiq.redis (which may be namespaced). This is part of a wider effort to deprecate the use of namespaces: gitlab-com/gl-infra&944 (closed).

By being able to poll both non-namespaced and namespaced sorted sets, we can:

  • reduce the need for script-based migration
  • gradually switch enqueues by deployment (us-east-1b, 1c, 1d, ...) without delaying scheduled jobs
  • roll back by switching enqueues back to the namespaced Redis without script-based migrations

See gitlab-com/gl-infra/scalability#2286 (closed)

The patch will be removed when the non-namespace feature is released.
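The dual-polling idea can be sketched with an in-memory stand-in for Redis. This is a minimal illustration, not the code in Gitlab::Patch::SidekiqScheduledEnq: `FakeStore`, `enqueue_due_jobs`, and their parameters are hypothetical names, and the only details taken from this MR are the `resque:gitlab` prefix, the `retry` and `schedule` sets, and the fact that due jobs land in the queues of the client's own Redis.

```ruby
require "json"

# Hypothetical in-memory stand-in for Redis; models only what the sketch needs.
class FakeStore
  attr_reader :zsets, :lists

  def initialize
    @zsets = Hash.new { |hash, key| hash[key] = {} } # key => { member => score }
    @lists = Hash.new { |hash, key| hash[key] = [] } # key => [payload, ...]
  end

  def zadd(key, score, member)
    @zsets[key][member] = score
  end

  # Members whose score (scheduled time) is at or before `now`.
  def due_members(key, now)
    @zsets[key].select { |_member, score| score <= now }.keys
  end

  # Returns true only if the member was still present, so a job moves once.
  def zrem(key, member)
    !@zsets[key].delete(member).nil?
  end

  def lpush(key, value)
    @lists[key].unshift(value)
  end
end

NAMESPACE = "resque:gitlab"
SORTED_SETS = %w[retry schedule].freeze

# Polls every prefix in `poll_prefixes` (nil means no namespace) and pushes
# due jobs to the queue under `target_prefix`.
def enqueue_due_jobs(store, now:, target_prefix:, poll_prefixes:)
  moved = 0
  poll_prefixes.each do |prefix|
    SORTED_SETS.each do |set|
      key = [prefix, set].compact.join(":")
      store.due_members(key, now).each do |payload|
        next unless store.zrem(key, payload)

        queue = JSON.parse(payload).fetch("queue")
        store.lpush([target_prefix, "queue", queue].compact.join(":"), payload)
        moved += 1
      end
    end
  end
  moved
end

store = FakeStore.new
store.zadd("resque:gitlab:schedule", 100.0, JSON.generate("queue" => "default", "args" => [11]))
store.zadd("schedule", 101.0, JSON.generate("queue" => "default", "args" => [12]))

# A dual poller whose own Redis is namespaced drains both schedule sets.
moved = enqueue_due_jobs(store, now: 200.0, target_prefix: NAMESPACE, poll_prefixes: [NAMESPACE, nil])
puts "moved #{moved} jobs into #{NAMESPACE}:queue:default"
```

Removing the member before pushing mirrors the usual guard against two pollers enqueueing the same job; the real patch may implement that step differently.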

How to set up and validate locally

Setup

Ensure env.runit in GDK does not have SIDEKIQ_POLL_NON_NAMESPACED or SIDEKIQ_ENQUEUE_NON_NAMESPACED env set.

Open 1 redis cli: gdk redis-cli -n 1 (db number may vary)

Open 3 gdk consoles

  1. gdk rails console: represents a namespaced client before migration
  2. SIDEKIQ_POLL_NON_NAMESPACED=true gdk rails console: represents a dual-poller during migration
  3. SIDEKIQ_ENQUEUE_NON_NAMESPACED=true SIDEKIQ_POLL_NON_NAMESPACED=true gdk rails console: represents a non-namespaced client post migration that uses a non-namespaced Sidekiq.redis

Run gdk stop rails to prevent rails-background-jobs from interfering with the validation

The three consoles represent the migration phases. We validate that the dual-namespace scheduled poller handles jobs in both namespaces regardless of how Sidekiq.redis is set up (with or without a namespace).
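As a rough mental model, the two environment variables can be read as selecting where a client enqueues and which prefixes its scheduled poller reads. The sketch below is an assumption inferred from the three console roles and the validation results in this description, not the actual implementation: `poller_plan` and `GITLAB_NAMESPACE` are hypothetical names, and the behaviour when only SIDEKIQ_ENQUEUE_NON_NAMESPACED is set is a guess.

```ruby
GITLAB_NAMESPACE = "resque:gitlab"

# Hypothetical helper: maps the env flags to the prefix a client enqueues to
# and the prefixes its scheduled poller reads (nil means no namespace).
def poller_plan(env)
  enqueue_plain = env["SIDEKIQ_ENQUEUE_NON_NAMESPACED"] == "true"
  poll_both = env["SIDEKIQ_POLL_NON_NAMESPACED"] == "true"

  own_prefix = enqueue_plain ? nil : GITLAB_NAMESPACE
  other_prefix = enqueue_plain ? GITLAB_NAMESPACE : nil

  {
    enqueue_prefix: own_prefix,
    poll_prefixes: poll_both ? [own_prefix, other_prefix] : [own_prefix]
  }
end

before = poller_plan({})
during = poller_plan("SIDEKIQ_POLL_NON_NAMESPACED" => "true")
after = poller_plan(
  "SIDEKIQ_ENQUEUE_NON_NAMESPACED" => "true",
  "SIDEKIQ_POLL_NON_NAMESPACED" => "true"
)

puts "before migration: #{before.inspect}"
puts "during migration: #{during.inspect}"
puts "after migration:  #{after.inspect}"
```

Under this reading, the namespaced client never sees jobs in the plain `schedule` set, while the dual poller and the non-namespaced client both drain either set into their own queues, which is what the validation steps check.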

Validation

  1. Enqueue using console 1 (namespaced client)
[5] pry(main)> Chaos::SleepWorker.perform_in(1, 11)

On redis

redis /Users/sylvesterchin/work/gitlab-development-kit/redis/redis.socket[1]> zrange schedule 0 -1  WITHSCORES
(empty array)
redis /Users/sylvesterchin/work/gitlab-development-kit/redis/redis.socket[1]> zrange resque:gitlab:schedule 0 -1  WITHSCORES
1) "{\"retry\":3,\"queue\":\"default\",\"backtrace\":true,\"version\":0,\"queue_namespace\":\"chaos\",\"class\":\"Chaos::SleepWorker\",\"args\":[11],\"jid\":\"ec0f9bceaec93d17398b28d3\",\"created_at\":1692241831.683188,\"correlation_id\":\"db77786f9ab8742ff561fd9ddd4ffc6b\",\"worker_data_consistency\":\"always\",\"size_limiter\":\"validated\",\"scheduled_at\":1692241832.6831412}"
2) "1692241832.6831412"
redis /Users/sylvesterchin/work/gitlab-development-kit/redis/redis.socket[1]> llen queue:default
(integer) 0
redis /Users/sylvesterchin/work/gitlab-development-kit/redis/redis.socket[1]> llen resque:gitlab:queue:default
(integer) 0
  2. Run the scheduled poller from console 2 (dual poller)
[1] pry(main)> Sidekiq::Scheduled::Enq.new.enqueue_jobs
=> ["retry", "schedule"]

Verify that the job is removed from the namespaced schedule set and pushed to the namespaced queue

redis /Users/sylvesterchin/work/gitlab-development-kit/redis/redis.socket[1]> zrange resque:gitlab:schedule 0 -1  WITHSCORES
(empty array)
redis /Users/sylvesterchin/work/gitlab-development-kit/redis/redis.socket[1]> llen resque:gitlab:queue:default
(integer) 1
redis /Users/sylvesterchin/work/gitlab-development-kit/redis/redis.socket[1]> llen queue:default
(integer) 0
  3. Schedule a job with console 3 (non-namespaced client)
[5] pry(main)> Chaos::SleepWorker.perform_in(1, 12)
redis /Users/sylvesterchin/work/gitlab-development-kit/redis/redis.socket[1]> zrange resque:gitlab:schedule 0 -1  WITHSCORES
(empty array)
redis /Users/sylvesterchin/work/gitlab-development-kit/redis/redis.socket[1]> zrange schedule 0 -1  WITHSCORES
1) "{\"retry\":3,\"queue\":\"default\",\"backtrace\":true,\"version\":0,\"queue_namespace\":\"chaos\",\"class\":\"Chaos::SleepWorker\",\"args\":[12],\"jid\":\"df2e79be203fcd7f335b7e66\",\"created_at\":1692241990.742708,\"correlation_id\":\"ed8588aa48809917cedfffbeff0423c8\",\"worker_data_consistency\":\"always\",\"size_limiter\":\"validated\",\"scheduled_at\":1692241991.74263}"
2) "1692241991.74263"
  4. Verify that console 1 does not schedule the delayed job
Sidekiq::Scheduled::Enq.new.enqueue_jobs
  5. Verify that console 2 (dual poller) polls the non-namespaced sorted sets and sends the job to the namespaced queue
[2] pry(main)> Sidekiq::Scheduled::Enq.new.enqueue_jobs
=> ["retry", "schedule"]
redis /Users/sylvesterchin/work/gitlab-development-kit/redis/redis.socket[1]> zrange schedule 0 -1  WITHSCORES
(empty array)
redis /Users/sylvesterchin/work/gitlab-development-kit/redis/redis.socket[1]> llen resque:gitlab:queue:default
(integer) 2
  6. Repeat step 1, then run the scheduled poller from console 3
[1] pry(main)> Sidekiq::Scheduled::Enq.new.enqueue_jobs
=> ["retry", "schedule"]

After running .enqueue_jobs, Redis should show that the scheduled poller picks up jobs from the namespaced sorted sets and enqueues them to the non-namespaced queues

redis /Users/sylvesterchin/work/gitlab-development-kit/redis/redis.socket[1]> zrange resque:gitlab:schedule 0 -1  WITHSCORES
(empty array)
redis /Users/sylvesterchin/work/gitlab-development-kit/redis/redis.socket[1]> llen queue:default
(integer) 1
  7. Repeat step 3, then run the scheduled poller from console 3
[1] pry(main)> Sidekiq::Scheduled::Enq.new.enqueue_jobs
=> ["retry", "schedule"]

After running .enqueue_jobs, Redis should show that the scheduled poller picks up jobs from the non-namespaced sorted sets and enqueues them to the non-namespaced queues

redis /Users/sylvesterchin/work/gitlab-development-kit/redis/redis.socket[1]> zrange schedule 0 -1  WITHSCORES
(empty array)
redis /Users/sylvesterchin/work/gitlab-development-kit/redis/redis.socket[1]> llen queue:default
(integer) 2
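When reading the zrange ... WITHSCORES output above, it can help to confirm that the sorted-set score is the job's scheduled_at and that the delay matches the perform_in argument. The snippet below is a small sanity check using the values captured for the first job; the payload is trimmed to the fields used here, and the variable names are illustrative.

```ruby
require "json"

# Member and score as returned by `zrange resque:gitlab:schedule 0 -1 WITHSCORES`
# (payload trimmed to the relevant fields).
member = '{"queue":"default","class":"Chaos::SleepWorker","args":[11],' \
         '"created_at":1692241831.683188,"scheduled_at":1692241832.6831412}'
score = "1692241832.6831412"

job = JSON.parse(member)

# Delay requested at enqueue time, in seconds.
delay = (job["scheduled_at"] - job["created_at"]).round(3)

# The sorted-set score should equal the scheduled_at timestamp.
score_matches = (score.to_f - job["scheduled_at"]).abs < 1e-6

puts "delay=#{delay}s score_matches=#{score_matches} queue=#{job['queue']}"
```

A delay of roughly one second is consistent with the first argument passed to perform_in in the steps above.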

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Sylvester Chin
