Handle race condition in creating alerts
What does this MR do and why?
When multiple requests are POSTed to an alert integration around the same time, there's a chance for a race condition to occur with writing to the database.
This MR adds handling for the two possible errors from that race condition:
- Writing to the database fails due to the uniqueness constraint in PostgreSQL
- Writing to the database fails due to the uniqueness validation on the model
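To make the two failure modes concrete, here is a minimal plain-Ruby sketch of the general pattern, not the code in this MR. Everything in it is hypothetical: `AlertStore`, `Alert`, `process_alert`, and `bump_existing` are illustrative names, and `DuplicateKeyError` is a stand-in for the `ActiveRecord::RecordNotUnique` exception Rails raises on a unique-index violation. It assumes that losing the race should be treated as a repeat event on the existing alert.

```ruby
# Hypothetical stand-in for ActiveRecord::RecordNotUnique.
class DuplicateKeyError < StandardError; end

# Hypothetical alert record: a unique fingerprint plus an event counter.
Alert = Struct.new(:fingerprint, :events)

# Hypothetical in-memory store standing in for the alerts table.
class AlertStore
  def initialize
    @rows = {}
  end

  def find(fingerprint)
    @rows[fingerprint]
  end

  # Mimics `save`: returns false when the model-level uniqueness check
  # fails, and raises when only the unique index catches the duplicate.
  def save(alert, validate: true)
    duplicate = @rows.key?(alert.fingerprint)
    return false if validate && duplicate
    raise DuplicateKeyError, "duplicate #{alert.fingerprint}" if duplicate

    @rows[alert.fingerprint] = alert
    true
  end
end

# Counts the current request as another event on the existing alert.
def bump_existing(store, fingerprint)
  existing = store.find(fingerprint)
  existing.events += 1
  existing
end

# The optional block runs between the lookup and the save, simulating a
# competing request that wins the race.
def process_alert(store, fingerprint, validate: true)
  return bump_existing(store, fingerprint) if store.find(fingerprint)

  yield if block_given?

  alert = Alert.new(fingerprint, 1)
  return alert if store.save(alert, validate: validate)

  # Failure mode 2: the uniqueness validation rejected the save.
  bump_existing(store, fingerprint)
rescue DuplicateKeyError
  # Failure mode 1: the database constraint rejected the insert.
  bump_existing(store, fingerprint)
end

store = AlertStore.new
raced = process_alert(store, 'fp-1') { store.save(Alert.new('fp-1', 1)) }
puts raced.events
```

In both branches the losing request ends up as a second event on the alert that won the race, rather than surfacing an error to the caller.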
How to set up and validate locally
Unfortunately, neither of these errors can be easily triggered by modifying inputs, so the simplest way to see the behavior locally is to hack it in with pry.
- Add a debugger in app/services/concerns/alert_management/alert_processing.rb:

     def process_new_alert
       return if resolving_alert?
    +  binding.pry
       if alert.save
- Trigger alert processing in the rails console:

    payload = { 'annotations' => { 'title' => 'TITLE' }, 'startsAt' => '2022-08-04T11:22:40Z' }
    project = Project.first
    AlertManagement::ProcessPrometheusAlertService.new(project, payload).execute
- When the debugger pops up, save a duplicate of the alert so that the pending save fails validation, then resume execution:

    > alert.dup.save
    > continue
- Navigate to Monitor > Alerts in the UI to see the most recent alert with 2 events.
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.
Related to https://gitlab.com/gitlab-org/gitlab/-/issues/348676 (lightly)