Fix ActiveJob routing in ScheduledEnq
What does this MR do and why?
This MR fixes ActiveJob routing in Gitlab::SidekiqSharding::ScheduledEnq. The sharded Sidekiq functionality is behind a feature flag.
More context on the problem: I noticed MailDeliveryJob being routed to the wrong Redis after the feature flag had been fully enabled. The problem lies in https://github.com/rails/rails/blob/v7.0.8.1/activejob/lib/active_job/queue_adapters/sidekiq_adapter.rb#L31 where "class" => JobWrapper works fine in a normal push but not in a scheduled push.
The scheduled push breaks when the scheduled enq pops that hash out of the sorted set, sees only the JobWrapper class, and treats the job as not routable when it actually should be routable.
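To illustrate the mismatch, here is a minimal sketch of how a routable class name could be derived from a Sidekiq job hash. It is not the code in this MR: the helper name routable_class_name and the SomeWorker class name are hypothetical, and the only grounded parts are the "class" and "wrapped" keys, which are visible in the scheduled payload in the validation steps.

```ruby
# Hypothetical helper, for illustration only; not the implementation in this MR.
JOB_WRAPPER = 'ActiveJob::QueueAdapters::SidekiqAdapter::JobWrapper'

# Returns the class name that routing should consider for a job hash.
# ActiveJob jobs carry the adapter's JobWrapper in "class" and the real
# job class in "wrapped"; plain Sidekiq workers only set "class".
def routable_class_name(job_hash)
  if job_hash['class'].to_s == JOB_WRAPPER && job_hash['wrapped']
    job_hash['wrapped'].to_s
  else
    job_hash['class'].to_s
  end
end

active_job = { 'class' => JOB_WRAPPER, 'wrapped' => 'ActionMailer::MailDeliveryJob' }
plain_job  = { 'class' => 'SomeWorker' } # hypothetical worker name

puts routable_class_name(active_job)
puts routable_class_name(plain_job)
```

Looking only at "class" would give the same JobWrapper name for every ActiveJob job, which is why a check based on that key alone cannot tell whether the underlying job is routable.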
Follow-up from gitlab-com/gl-infra/scalability#2817 (comment 1845599786)
MR acceptance checklist
Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
How to set up and validate locally
- Start a local docker for Redis
docker run -p 6378:6379 -d redis:6.2-alpine
- Update config files
➜ gitlab git:(sc1-maildelivery-shard-awareness) ✗ cat config/redis.yml
---
development:
queues_shard_catchall_a:
url: "redis://localhost:6378"
Also update config/gitlab.yml with:
## Sidekiq
sidekiq:
log_format: json # (default is also supported)
routing_rules:
- ["tags=needs_own_queue", null]
- ["*", "default", "queues_shard_catchall_a"]
- Enable feature flags using a rails console
Feature.enable(:enable_sidekiq_shard_router)
Feature.enable(:sidekiq_route_to_queues_shard_catchall_a)
- Stop Rails, as we want to observe where the jobs are routed to
gdk stop rails
It also helps to empty the queues and sorted sets.
gdk redis-cli -n 1 del queue:mailers
redis-cli -p 6378 del queue:mailers
redis-cli -p 6378 del schedule
- Open a rails console and schedule 1 mail
SIDEKIQ_SHARD_NAME=queues_shard_catchall_a gdk rails c
[1] pry(main)> AbuseReportMailer.notify(1).deliver_later(wait: 20.seconds)
=> #<ActionMailer::MailDeliveryJob:0x000000017037e968
@arguments=["AbuseReportMailer", "notify", "deliver_now", {:args=>[1]}],
@exception_executions={},
@executions=0,
@job_id="4936f02e-64a6-41ad-842a-864713362255",
@priority=nil,
@provider_job_id="843054650159808d54a6fc12",
@queue_name="mailers",
@scheduled_at=1713444314.733443,
@successfully_enqueued=true,
@timezone="UTC">
- Verify that it is in the shard's Redis
➜ gitlab git:(sc1-fix-routing-scheduled-activejob) redis-cli -p 6378 zrange schedule 0 -1
1) "{\"retry\":3,\"queue\":\"mailers\",\"backtrace\":true,\"store\":\"queues_shard_catchall_a\",\"class\":\"ActiveJob::QueueAdapters::SidekiqAdapter::JobWrapper\",\"wrapped\":\"ActionMailer::MailDeliveryJob\",\"args\":[{\"job_class\":\"ActionMailer::MailDeliveryJob\",\"job_id\":\"b88c861c-37b8-48ea-adc1-460146ef6944\",\"provider_job_id\":null,\"queue_name\":\"mailers\",\"priority\":null,\"arguments\":[\"AbuseReportMailer\",\"notify\",\"deliver_now\",{\"args\":[1],\"_aj_ruby2_keywords\":[\"args\"]}],\"executions\":0,\"exception_executions\":{},\"locale\":\"en\",\"timezone\":\"UTC\",\"enqueued_at\":\"2024-04-18T12:45:39Z\"}],\"jid\":\"746f4d1fbd14fe6b20cf1008\",\"created_at\":1713444339.679032,\"correlation_id\":\"c1948abf0a00bbe54061f99533695932\",\"worker_data_consistency\":\"delayed\",\"wal_locations\":{},\"wal_location_source\":\"primary\",\"size_limiter\":\"validated\",\"scheduled_at\":1713444359.678166}"
- Using the gdk rails console from step 5, run the scheduled enqueuer
[4] pry(main)> se = Gitlab::SidekiqSharding::ScheduledEnq.new(Sidekiq.default_configuration)
[4] pry(main)> se.enqueue_jobs
=> ["retry", "schedule"]
- The job is routed to queue:mailers on the same Redis
➜ gitlab git:(sc1-fix-routing-scheduled-activejob) redis-cli -p 6378 lrange queue:mailers 0 -1
1) "{\"retry\":3,\"queue\":\"mailers\",\"backtrace\":true,\"store\":\"queues_shard_catchall_a\",\"class\":\"ActiveJob::QueueAdapters::SidekiqAdapter::JobWrapper\",\"wrapped\":\"ActionMailer::MailDeliveryJob\",\"args\":[{\"job_class\":\"ActionMailer::MailDeliveryJob\",\"job_id\":\"b88c861c-37b8-48ea-adc1-460146ef6944\",\"provider_job_id\":null,\"queue_name\":\"mailers\",\"priority\":null,\"arguments\":[\"AbuseReportMailer\",\"notify\",\"deliver_now\",{\"args\":[1],\"_aj_ruby2_keywords\":[\"args\"]}],\"executions\":0,\"exception_executions\":{},\"locale\":\"en\",\"timezone\":\"UTC\",\"enqueued_at\":\"2024-04-18T12:45:39Z\"}],\"jid\":\"746f4d1fbd14fe6b20cf1008\",\"created_at\":1713444339.679032,\"correlation_id\":\"c1948abf0a00bbe54061f99533695932\",\"worker_data_consistency\":\"delayed\",\"wal_locations\":{},\"wal_location_source\":\"primary\",\"size_limiter\":\"validated\",\"scheduled_at\":1713444359.678166,\"idempotency_key\":\"resque:gitlab:duplicate:mailers:ef0f0775f1f7c4337c8ab568762510f154d9a7029d80ebb9aff14a07ba4b76dc\",\"enqueued_at\":1713444383.3223948}"
# empty for gdk's Redis
➜ gitlab git:(sc1-fix-routing-scheduled-activejob) gdk redis-cli -n 1 lrange queue:mailers 0 -1
(empty array)
- Repeating steps 5-7 on the master branch shows a different result for step 8: the job is sent to gdk's Redis instead.
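When comparing the payloads from steps 6 and 8, it can be easier to look only at the routing-relevant keys than to read the raw JSON. The following is a rough sketch for that, assuming you pass it the raw Redis value (the unescaped JSON string, not redis-cli's quoted display form). The helper name routing_fields is hypothetical and the sample is a shortened payload built only from keys that appear in the output above.

```ruby
require 'json'

# Hypothetical helper: picks the keys that matter for routing out of a raw
# Sidekiq payload, such as an entry of the schedule sorted set or of
# queue:mailers.
def routing_fields(raw_payload)
  job = JSON.parse(raw_payload)
  job.slice('queue', 'store', 'class', 'wrapped')
end

# Shortened sample payload; a real entry carries many more keys.
sample = '{"retry":3,"queue":"mailers","store":"queues_shard_catchall_a",' \
         '"class":"ActiveJob::QueueAdapters::SidekiqAdapter::JobWrapper",' \
         '"wrapped":"ActionMailer::MailDeliveryJob"}'

routing_fields(sample).each { |key, value| puts "#{key}: #{value}" }
```

With the fix applied, the entry found in the shard's queue:mailers should still carry the same "store" value as the scheduled entry.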