Simplify master broken nudger by removing async delay
What does this MR do and why?
Address my feedback in gitlab-org/quality/engineering-productivity/team#215 (comment 1510176105)
Simplify the master broken nudger to ensure it pings the person closing the incident right away if the issue is missing root cause labels.
This automation has been broken, but I delayed fixing it because Engineering Productivity has been the main triage DRI and we have generally been doing well with root cause analysis. Now that more groups are involved in triaging master broken incidents, we are seeing a lot of incidents getting closed without a root cause label. I think we can simplify it by removing the async delay, especially knowing that most team members close the incident without using the /label and /close quick actions, so the chance of a race condition is relatively low.
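As a rough illustration of the simplified flow, the check can run synchronously when the close event arrives, with no delayed job. This is a minimal sketch, not the triage-ops implementation: the `IncidentCloseEvent` type, the `nudge_on_close` function, and the `root-cause::` label prefix are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical label prefix; the actual root cause label names are not stated in this MR.
ROOT_CAUSE_PREFIX = "root-cause::"


@dataclass
class IncidentCloseEvent:
    """Hypothetical stand-in for the event received when an incident is closed."""
    closed_by: str                                   # username of the person closing the incident
    labels: List[str] = field(default_factory=list)  # labels on the incident at close time


def has_root_cause_label(labels: List[str]) -> bool:
    """True if at least one label carries the (assumed) root cause prefix."""
    return any(label.startswith(ROOT_CAUSE_PREFIX) for label in labels)


def nudge_on_close(event: IncidentCloseEvent) -> Optional[str]:
    """Return the comment to post immediately, or None when no nudge is needed.

    There is no async delay: the decision uses the labels present on the
    incident at the moment it is closed.
    """
    if has_root_cause_label(event.labels):
        return None
    return (
        f"@{event.closed_by} this incident was closed without a root cause label. "
        "Please add one."
    )
```

The trade-off is the one described above: if someone applies the label and closes the incident in the same note, an immediate check could see the incident before the label lands, but that case is expected to be rare.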
Expected impact & dry-runs
These are strongly recommended to assist reviewers and reduce the time to merge your change.
See https://gitlab.com/gitlab-org/quality/triage-ops/-/tree/master/doc/scheduled#testing-policies-with-a-dry-run on how to perform dry-runs for new policies.
See https://gitlab.com/gitlab-org/quality/triage-ops/-/blob/master/doc/reactive/best_practices.md#use-the-sandbox-to-test-new-processors on how to make sure a new processor can be tested.
Action items
- If adding environment variables for reactive processors, update config/triage-web.yaml and .gitlab/ci/triage-web.yml
- (If applicable) Add documentation to the handbook pages for Triage Operations =>
- (If applicable) Identify the affected groups and how to communicate to them:
  - /cc @ person_or_group =>
  - Relevant Slack channels =>
  - Engineering week-in-review