Fix ScheduledEnq to use job class's queue for push
What does this MR do and why?
Fix ScheduledEnq to use job class's queue for push
This addresses misrouted jobs when changing a worker attribute. During the deployment period where both new and old versions are running, jobs scheduled by the new processes could be picked up by an older process and routed with a wrong combination of store and queue.
See gitlab-com/gl-infra/scalability#3497 (comment 2007088595)
MR acceptance checklist
Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Screenshots or screen recordings
Screenshots are required for UI changes, and strongly recommended for all other merge requests.
Before | After |
---|---|
How to set up and validate locally
Set up
- GDK but with sidekiq disabled (
gdk stop rails-background-jobs
) - Another redis:
docker run -p 6378:6379 -d redis:6.2-alpine
and configureconfig/redis.yml
with:
➜ cat config/redis.yml
---
development:
queues_shard_catchall_a:
url: "redis://localhost:6378"
- Prepare
config/gitlab.yml
with the following sidekiq routing rules
production: &base
sidekiq:
log_format: json # (default is also supported)
routing_rules:
- ["tags=needs_own_queue", null]
- ["*", "default"]
For each branch (master and this):
- Start up a Sidekiq process with the
OLD
config
SIDEKIQ_SHARD_NAME=queues_shard_catchall_a bundle exec ruby bin/sidekiq-cluster default,chaos -e development --timeout 10
- Update
config/gitlab.yml
toNEW
with
production: &base
sidekiq:
log_format: json # (default is also supported)
routing_rules:
- ["tags=needs_own_queue", null]
- ["worker_name=Chaos::CpuSpinWorker", "chaos", "queues_shard_catchall_a"]
- ["*", "default"]
- Open a gdk console and schedule a
CpuSpinWorker
> gdk rails c
[1] pry(main)> Chaos::CpuSpinWorker.perform_in(1, 1)
=> "06c944d35e8382cffffb47da"
- Check
redis-cli
. Note: you have to rungdk stop rails-background-jobs
, or the jobs will be consumed.
On the master branch, the queue:chaos
will have 1 job enqueued since the "old" process pushes to the new queue and old store.
➜ gitlab git:(master) ✗ gdk redis-cli -n 1 lrange queue:chaos 0 -1
1) "{\"retry\":3,\"queue\":\"chaos\",\"backtrace\":true,\"backtrace\":true,\"version\":0,\"store\":\"queues_shard_catchall_a\",\"queue_namespace\":\"chaos\",\"class\":\"Chaos::CpuSpinWorker\",\"args\":[1],\"jid\":\"384c47e1ea3274d1427df170\",\"created_at\":1721798278.708093,\"trace_propagation_headers\":{\"sentry-trace\":\"ff8e1d78c8504f2eaa6ff627fef033ac-c816f994e58e424b\",\"baggage\":\"sentry-trace_id=ff8e1d78c8504f2eaa6ff627fef033ac,sentry-environment=,sentry-release=18266f1dccb\"},\"meta.sidekiq_destination_shard_redis\":\"queues_shard_catchall_a\",\"correlation_id\":\"727ef2bc380d45cd5090d26550ad6c1d\",\"worker_data_consistency\":\"always\",\"size_limiter\":\"validated\",\"scheduled_at\":1721798279.708069,\"idempotency_key\":\"resque:gitlab:duplicate:chaos:22a5e8e9d30eac0fa7a7ff37a2624dcb6b65842c19798591fa0cc57d26805043\",\"enqueued_at\":1721798281.151815}"
On the feature branch, the queue:default
will have 1 job enqueued since the "old" process pushes to the old queue and old store.
➜ gitlab git:(sc1-fix-sidekiq-routing-transition) ✗ gdk redis-cli -n 1 lrange queue:default 0 -1
1) "
{\"retry\":3,\"queue\":\"default\",\"backtrace\":true,\"backtrace\":true,\"version\":0,\"store\":\"queues_shard_catchall_a\",\"queue_namespace\":\"chaos\",\"class\":\"Chaos::CpuSpinWorker\",\"args\":[1],\"jid\":\"06c944d35e8382cffffb47da\",\"created_at\":1721798508.2754,\"trace_propagation_headers\":{\"sentry-trace\":\"df43a6f0ba6a4536b9a2b03e259ef662-ea4c0a62e2734b50\",\"baggage\":\"sentry-trace_id=df43a6f0ba6a4536b9a2b03e259ef662,sentry-environment=,sentry-release=cec6f423b76\"},\"meta.sidekiq_destination_shard_redis\":\"queues_shard_catchall_a\",\"correlation_id\":\"0522bc67336e917ff8917652d5e3064d\",\"worker_data_consistency\":\"always\",\"size_limiter\":\"validated\",\"scheduled_at\":1721798509.275357,\"idempotency_key\":\"resque:gitlab:duplicate:default:22a5e8e9d30eac0fa7a7ff37a2624dcb6b65842c19798591fa0cc57d26805043\",\"enqueued_at\":1721798511.479434}"