Defer Sidekiq jobs from worker type feature flags
What does this MR do and why?
Resolves gitlab-com/gl-infra/scalability#2346 (closed)
Check out gitlab-com/gl-infra&1004 (closed) for more context
The new server middleware checks the worker-type feature flag defer_sidekiq_jobs:<worker_name> to determine whether the job should be deferred.
Changelog: added
Note: This MR is based off !120590 (merged)
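A minimal sketch of the deferral idea, assuming the middleware follows Sidekiq's server middleware shape (`call(worker, job, queue)` with a block that runs the job). `FlagStoreSketch`, `DeferJobsSketch`, `SleepWorkerSketch`, and the scheduler lambda are hypothetical stand-ins so the example runs without Rails or Sidekiq. They are not the classes added by this MR, which would consult `Feature.enabled?` and reschedule through Sidekiq instead.

```ruby
# Hypothetical in-memory flag store standing in for the real feature flag backend.
class FlagStoreSketch
  def initialize
    @enabled = {}
  end

  def enable(name)
    @enabled[name] = true
  end

  def disable(name)
    @enabled.delete(name)
  end

  def enabled?(name)
    @enabled.key?(name)
  end
end

# Hypothetical worker class; only its name matters for the flag lookup.
class SleepWorkerSketch; end

# Sketch of a server middleware that defers a job when its worker flag is on.
class DeferJobsSketch
  DELAY = 5 * 60 # seconds; the description mentions a 5 minute wait

  # scheduler is any callable taking (worker_class, delay, args).
  def initialize(flags, scheduler)
    @flags = flags
    @scheduler = scheduler
  end

  def call(worker, job, _queue)
    if @flags.enabled?("defer_sidekiq_jobs:#{worker.class.name}")
      # Do not run the job now; hand it to the scheduler for a later attempt.
      @scheduler.call(worker.class, DELAY, job['args'])
      return
    end

    yield # flag is off: run the job as usual
  end
end

flags = FlagStoreSketch.new
scheduled = []
middleware = DeferJobsSketch.new(flags, ->(klass, delay, args) { scheduled << [klass.name, delay, args] })

flags.enable('defer_sidekiq_jobs:SleepWorkerSketch')
middleware.call(SleepWorkerSketch.new, { 'args' => [1] }, 'default') { puts 'performed' }
puts "scheduled: #{scheduled.inspect}"
```

Because the deferred job is handed to the scheduler rather than run, the check happens again each time the rescheduled job is picked up, which is why turning the flag off lets the job run on its next attempt.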
How to set up and validate locally
- Change the DELAY constant in lib/gitlab/sidekiq_middleware/defer_jobs.rb to 1 minute (or shorter) so we don't have to wait for 5 minutes.
- Restart Sidekiq:
  $ gdk restart rails-background-jobs
  ok: down: /Users/gregoriusmarco/Documents/workspace/gdk-10-22/services/rails-background-jobs: 0s
  ok: run: /Users/gregoriusmarco/Documents/workspace/gdk-10-22/services/rails-background-jobs: (pid 73634) 1s, normally down
- In the Rails console, start a job:
  [6] pry(main)> Chaos::SleepWorker.perform_async(1)
  => "51b6c3e837b643eaddb442a7"
  [7] pry(main)> Chaos::SleepWorker.queue
  => "default"
- Check in Redis that the job was performed immediately:
  redis /Users/gregoriusmarco/Documents/workspace/gdk-10-22/redis/redis.socket[1]> llen resque:gitlab:queue:default
  (integer) 0
- Turn on the feature flag defer_sidekiq_jobs:Chaos::SleepWorker via the API (which requires setting up a PAT), or by inserting directly into the feature_gates table in the DB:
  $ curl -H "PRIVATE-TOKEN: $PERSONAL_ACCESS_TOKEN" http://gdk.test:3000/api/v4/features/defer_sidekiq_jobs:Chaos::SleepWorker --data "value=true" | jq
  {
    "name": "defer_sidekiq_jobs:Chaos::SleepWorker",
    "state": "on",
    "gates": [
      {
        "key": "boolean",
        "value": true
      }
    ],
    "definition": null
  }
- Try running the job again:
  # Clear the current ScheduledSet jobs
  [11] pry(main)> Sidekiq::ScheduledSet.new.clear
  => true
  [9] pry(main)> Chaos::SleepWorker.perform_async(1)
  => "d8142280ffcb4cb0ba55ed62"
- Check for the scheduled job in Redis. Check that scheduled_at is 1 minute in the future. (The jid will be different in this case, as the middleware is effectively enqueueing a new job.) After another minute, scheduled_at will be updated to the subsequent minute.
  redis /Users/gregoriusmarco/Documents/workspace/gdk-10-22/redis/redis.socket[1]> zrevrange resque:gitlab:schedule 0 -1
  1) "{\"retry\":3,\"queue\":\"default\",\"backtrace\":true,\"version\":0,\"queue_namespace\":\"chaos\",\"class\":\"Chaos::SleepWorker\",\"args\":[1],\"jid\":\"d40984ff09e83eae15ebdd53\",\"created_at\":1684149507.791332,\"correlation_id\":\"a722a2114851f18f7d501e83eab9742a\",\"meta.caller_id\":\"Chaos::SleepWorker\",\"meta.feature_category\":\"not_owned\",\"meta.root_caller_id\":\"Chaos::SleepWorker\",\"worker_data_consistency\":\"always\",\"size_limiter\":\"validated\",\"scheduled_at\":1684149567.791281}"
- Turn off the feature flag:
  $ curl -H "PRIVATE-TOKEN: $PERSONAL_ACCESS_TOKEN" http://gdk.test:3000/api/v4/features/defer_sidekiq_jobs:Chaos::SleepWorker --data "value=false" | jq
  {
    "name": "defer_sidekiq_jobs:Chaos::SleepWorker",
    "state": "off",
    "gates": [
      {
        "key": "boolean",
        "value": false
      }
    ],
    "definition": null
  }
- Wait for the feature flag's thread-local cache to expire (should be within a minute), then check the ScheduledSet in Redis again, which should be empty:
  redis /Users/gregoriusmarco/Documents/workspace/gdk-10-22/redis/redis.socket[1]> zrevrange resque:gitlab:schedule 0 -1
  (empty array)
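The scheduled_at check in the steps above can also be done programmatically. The snippet below is a small sketch that parses a subset of the payload shown in the Redis output (values copied from it) and computes the gap between created_at and scheduled_at. The helper name `deferral_delay_seconds` is made up for illustration.

```ruby
require 'json'

# Subset of the scheduled job payload from the Redis output above.
PAYLOAD = <<~JSON
  {"class":"Chaos::SleepWorker","args":[1],"jid":"d40984ff09e83eae15ebdd53",
   "created_at":1684149507.791332,"scheduled_at":1684149567.791281}
JSON

# Returns the gap in seconds between job creation and its scheduled run time.
def deferral_delay_seconds(payload_json)
  job = JSON.parse(payload_json)
  job.fetch('scheduled_at') - job.fetch('created_at')
end

delay = deferral_delay_seconds(PAYLOAD)
puts format('deferred by roughly %d seconds', delay.round)
```

With DELAY lowered to 1 minute, the rounded gap should match the 60 seconds expected in the validation step.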
To test percentage of time:
- Turn on the feature flag with an integer value:
  $ curl -H "PRIVATE-TOKEN: $PERSONAL_ACCESS_TOKEN" http://gdk.test:3000/api/v4/features/defer_sidekiq_jobs:Chaos::SleepWorker --data "value=10" | jq
  {
    "name": "defer_sidekiq_jobs:Chaos::SleepWorker",
    "state": "conditional",
    "gates": [
      {
        "key": "boolean",
        "value": false
      },
      {
        "key": "percentage_of_time",
        "value": 10
      }
    ],
    "definition": null
  }
- Check using Feature.enabled?:
  [15] pry(main)> Feature.enabled?(:"defer_sidekiq_jobs:Chaos::SleepWorker", type: :worker, default_enabled_if_undefined: false)
  => false
  [16] pry(main)> Feature.enabled?(:"defer_sidekiq_jobs:Chaos::SleepWorker", type: :worker, default_enabled_if_undefined: false)
  => false
  [17] pry(main)> Feature.enabled?(:"defer_sidekiq_jobs:Chaos::SleepWorker", type: :worker, default_enabled_if_undefined: false)
  => false
  [18] pry(main)> Feature.enabled?(:"defer_sidekiq_jobs:Chaos::SleepWorker", type: :worker, default_enabled_if_undefined: false)
  => true
  [19] pry(main)> Feature.enabled?(:"defer_sidekiq_jobs:Chaos::SleepWorker", type: :worker, default_enabled_if_undefined: false)
  => false
  [20] pry(main)> Feature.enabled?(:"defer_sidekiq_jobs:Chaos::SleepWorker", type: :worker, default_enabled_if_undefined: false)
  => true
  [21] pry(main)> Feature.enabled?(:"defer_sidekiq_jobs:Chaos::SleepWorker", type: :worker, default_enabled_if_undefined: false)
  => false
  [22] pry(main)> Feature.enabled?(:"defer_sidekiq_jobs:Chaos::SleepWorker", type: :worker, default_enabled_if_undefined: false)
  => false
  [23] pry(main)> Feature.enabled?(:"defer_sidekiq_jobs:Chaos::SleepWorker", type: :worker, default_enabled_if_undefined: false)
  => false
  [24] pry(main)> Feature.enabled?(:"defer_sidekiq_jobs:Chaos::SleepWorker", type: :worker, default_enabled_if_undefined: false)
  => false
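Ten console calls are a small sample, so seeing two `true` results at a 10% setting is within normal variation. The sketch below simulates a percentage-of-time gate to show the fraction settling near the configured value over many checks. It assumes the gate opens when a uniform random draw falls below the percentage, which is an assumption about the gate's semantics rather than code from this MR; both helper names are hypothetical.

```ruby
# Assumed gate semantics: open when a uniform draw in [0, 100) is below the percentage.
def percentage_of_time_open?(percentage, rng = Random.new)
  rng.rand * 100 < percentage
end

# Fraction of checks that came back enabled over the given number of samples.
def deferred_fraction(percentage, samples, rng = Random.new)
  hits = samples.times.count { percentage_of_time_open?(percentage, rng) }
  hits.to_f / samples
end

# A seeded generator keeps the run repeatable.
puts deferred_fraction(10, 10_000, Random.new(42))
```

Under this assumption, each job pickup is an independent draw, so a deferred job has the same chance of being deferred again on its next attempt.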
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.