Gracefully track errors raised by sending CI minutes notifications
What does this MR do and why?
Related to #331785 (closed)
Also related to https://gitlab.com/gitlab-org/gitlab/-/issues/335885 (but this represents a quick fix for now)
While testing the fix for #331785 (closed) I noticed that a Sidekiq job was failing due to a missing project
record in Ci::Minutes::EmailNotificationService
, causing the job to be retried numerous times with backoff. Since the error was consistent, the backoff + the number of retries caused the job to keep retrying after 12 hours (which is the TTL used for idempotency). This means that after 12 hours UpdateProjectAndNamespaceUsageService
loses the idempotency.
Given that UpdateProjectAndNamespaceUsageService
is required to be fully idempotent, we want it to be as lean as possible. As a quick fix I've wrapped the method responsible for calculating and sending the email notification to the user with a rescue block and tracking the exception in Sentry so we don't lose visibility.
As this is also related to https://gitlab.com/gitlab-org/gitlab/-/issues/335885#note_696254385, I've left a TODO linked to that issue so we can extract the email notification into a dedicated worker, which doesn't impact the idempotency of UpdateProjectAndNamespaceUsageService
.
Screenshots or screen recordings
These are strongly recommended to assist reviewers and reduce the time to merge your change.
How to set up and validate locally
Numbered steps to set up and validate the change are strongly suggested.
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
-
I have evaluated the MR acceptance checklist for this MR.