Mark duplicate jobs in Sidekiq
What does this MR do?
This marks a job as a duplicate when there is already a job for the same worker in the same queue with the same arguments.
We do this by storing a key in Redis based on the arguments, worker and queue when a job gets scheduled. If another job gets scheduled and the key already exists, we mark that job as a duplicate.
When a job starts, we delete its key from Redis.
Later we can use this to drop duplicate jobs from Redis if they are idempotent.
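The key lifecycle described above can be sketched roughly as follows. This is a minimal illustration, not the code in this MR: `FakeRedis`, `duplicate_key`, `mark_on_schedule` and `clear_on_start` are hypothetical names, the key format is an assumption, and an in-memory set stands in for Redis so the sketch runs without a server.

```ruby
require 'digest'
require 'json'
require 'set'

# Hypothetical in-memory stand-in for Redis (assumption, for illustration only).
class FakeRedis
  def initialize
    @keys = Set.new
  end

  # Stores the key and returns true only if it was not present before.
  def set_if_absent(key)
    !@keys.add?(key).nil?
  end

  # Removes the key if it exists.
  def delete(key)
    @keys.delete(key)
  end
end

# Hypothetical key format: derived from the queue, worker class and arguments.
def duplicate_key(job)
  payload = JSON.generate([job['class'], job['args']])
  "duplicate:#{job['queue']}:#{Digest::SHA256.hexdigest(payload)}"
end

# When a job is scheduled: store its key, and mark the job as a duplicate
# if the key already existed.
def mark_on_schedule(redis, job)
  job['duplicate'] = !redis.set_if_absent(duplicate_key(job))
  job
end

# When a job starts: delete its key, so a job scheduled afterwards with the
# same arguments is no longer considered a duplicate.
def clear_on_start(redis, job)
  redis.delete(duplicate_key(job))
end

redis = FakeRedis.new
template = { 'class' => 'ProjectImportScheduleWorker',
             'queue' => 'project_import_schedule',
             'args' => [3] }

first  = mark_on_schedule(redis, template.dup)
second = mark_on_schedule(redis, template.dup)
clear_on_start(redis, first)
third = mark_on_schedule(redis, template.dup)

puts [first, second, third].map { |job| job['duplicate'] }.inspect
```

Only the second job is marked, because the first job's key was still present when it was scheduled and had been deleted by the time the third one arrived.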
This makes the logs look like this; notice the duplicate field in there :smile::
{"severity":"INFO","time":"2020-02-18T15:50:41.965Z","class":"ProjectImportScheduleWorker","args":[3],"retry":false,"queue":"project_import_schedule","backtrace":true,"jid":"ff74c37e2c277f964a7c0a93","created_at":"2020-02-18T15:45:23.752Z","enqueued_at":"2020-02-18T15:45:23.791Z","meta.project":"gnuwget/wget2","meta.root_namespace":"gnuwget","meta.subscription_plan":"default","correlation_id":"802df09dd7b7f3aba73bc5b5ac406b6e","duplicate":"false","pid":6633,"message":"ProjectImportScheduleWorker JID-ff74c37e2c277f964a7c0a93: done: 4.911097 sec","job_status":"done","scheduling_latency_s":313.263012,"duration":4.911097,"cpu_s":0.609397,"completed_at":"2020-02-18T15:50:41.965Z","db_duration":2.305000089108944,"db_duration_s":0.002305000089108944}
{"severity":"INFO","time":"2020-02-18T15:50:41.973Z","class":"ProjectImportScheduleWorker","args":[3],"retry":false,"queue":"project_import_schedule","backtrace":true,"jid":"7df42b5ef1dc22fdd08f41de","created_at":"2020-02-18T15:45:36.956Z","enqueued_at":"2020-02-18T15:45:36.963Z","meta.project":"gnuwget/wget2","meta.root_namespace":"gnuwget","meta.subscription_plan":"default","correlation_id":"27bff2365404c7ce3f4db2fc52e61270","duplicate":"true","pid":6633,"message":"ProjectImportScheduleWorker JID-7df42b5ef1dc22fdd08f41de: start","job_status":"start","scheduling_latency_s":305.010123}
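As a rough illustration of how the new field could be consumed, the sketch below picks the jobs marked as duplicate out of structured log lines. The two entries are abbreviated copies of the lines above (most fields trimmed), `duplicate_jids` is a hypothetical helper, and the comparison assumes the field is logged as the string "true", as it appears in the sample.

```ruby
require 'json'

# Abbreviated versions of the two sample log lines (fields trimmed for brevity).
LOG_LINES = [
  '{"class":"ProjectImportScheduleWorker","jid":"ff74c37e2c277f964a7c0a93","duplicate":"false","job_status":"done"}',
  '{"class":"ProjectImportScheduleWorker","jid":"7df42b5ef1dc22fdd08f41de","duplicate":"true","job_status":"start"}'
].freeze

# Hypothetical helper: returns the JIDs of entries whose duplicate field
# is the string "true".
def duplicate_jids(lines)
  lines.map { |line| JSON.parse(line) }
       .select { |entry| entry['duplicate'] == 'true' }
       .map { |entry| entry['jid'] }
end

puts duplicate_jids(LOG_LINES)
```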
gitlab-com/gl-infra/scalability#165 (closed)
Does this MR meet the acceptance criteria?
Conformity
- [-] Changelog entry
- [-] Documentation (if required)
- Code review guidelines
- Merge request performance guidelines
- Style guides
- [-] Database guides
- Separation of EE specific content
Availability and Testing
- Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process.
- [-] Tested in all supported browsers
- [-] Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Edited by Bob Van Landuyt