Reconsider `urgency` of `Llm::CompletionWorker`
We're seeing consistent `SidekiqServiceWorkerExecutionApdexSLOViolation` alerts for `Llm::CompletionWorker`. Its Apdex score is indeed missing the target, which means the Sidekiq jobs take longer than expected. However, completion requests are expected to take longer than other types of Sidekiq jobs, because LLMs need some time to reply. Despite that, we have tagged the worker as having high urgency (https://gitlab.com/gitlab-org/gitlab/-/blob/b30e39eb822ab4331fc143439647b18552b30c03/ee/app/workers/llm/completion_worker.rb#L10), which sets its execution target at 10 seconds (https://docs.gitlab.com/ee/development/sidekiq/worker_attributes.html#job-urgency).

We should do the following:
- Establish from the metrics what the average execution time of completion jobs is
- Determine whether we need to lower the `urgency` of the worker (while following our documentation for that change)
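
As a rough illustration of the first step, the sketch below compares a set of job durations against the execution target of each urgency level. The sample durations are made up for illustration and the helper names (`average`, `within_target_ratio`, `summarize`) are hypothetical; real numbers would have to come from our metrics. The 10-second target for high urgency is the one mentioned above, and the 5-minute target for low urgency is assumed from the job urgency documentation. The ratio is a simplified stand-in for the Apdex calculation, not the exact formula used by the alert.

```ruby
# Hypothetical job durations in seconds; replace with values taken from metrics.
SAMPLE_DURATIONS = [3.2, 8.5, 12.0, 25.4, 9.9, 41.0, 6.1, 15.3].freeze

# Execution targets in seconds per urgency level.
# high: 10 seconds (as stated in this issue); low: 5 minutes (assumed from the docs).
EXECUTION_TARGETS = { high: 10.0, low: 300.0 }.freeze

# Mean execution time of the given durations.
def average(durations)
  durations.sum / durations.size
end

# Share of jobs that finished within the target (simplified Apdex-style ratio).
def within_target_ratio(durations, target)
  durations.count { |duration| duration <= target }.fdiv(durations.size)
end

# Ratio of jobs within target for every urgency level.
def summarize(durations)
  EXECUTION_TARGETS.to_h do |urgency, target|
    [urgency, within_target_ratio(durations, target)]
  end
end

puts format("average execution time: %.2fs", average(SAMPLE_DURATIONS))
summarize(SAMPLE_DURATIONS).each do |urgency, ratio|
  puts format("urgency %s: %.0f%% of jobs within target", urgency, ratio * 100)
end
```

If most real jobs land well above 10 seconds but comfortably under the low-urgency target, that would support lowering the `urgency`.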
See also: https://docs.gitlab.com/ee/development/sidekiq/worker_attributes.html#latency-sensitive-jobs
Related MR: !150276 (merged)
/cc @oregand