Accept monitoring event as a new object kind, update specs
What does this MR do and why?
This is to address one of the corrective actions from #1352 (closed), specifically:
- Add holistic health checks/monitoring in GCP to ensure that we know 100% when triage-ops is working or not
- log user agent or have some way of uniquely identifying the uptime checks requests from regular requests
This MR accepts monitoring as a new object kind, in addition to issue, incident, MR, and pipeline.
It also adds a processor that handles uptime check events as monitoring objects and writes a log entry to indicate that triage-ops is fully functional.
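For reviewers who want a mental model of the flow, here is a minimal sketch in Python. It is not the code in this MR: the `object_kind` payload field, the `handle_event` function, the return strings, and the log message are all assumptions made for illustration, and the kind labels are copied from the list above rather than from real payload values.

```python
import logging

logger = logging.getLogger("triage_ops_sketch")

# Object kinds as named in this MR description; "monitoring" is the new one.
# The real payload values may be spelled differently (assumption).
ACCEPTED_OBJECT_KINDS = {"issue", "incident", "MR", "pipeline", "monitoring"}


def handle_event(payload: dict) -> str:
    """Return a short status string for an incoming event payload."""
    kind = payload.get("object_kind")  # hypothetical field name
    if kind not in ACCEPTED_OBJECT_KINDS:
        return "rejected"
    if kind == "monitoring":
        # Hypothetical log message; the actual wording is not given in the MR.
        logger.info("uptime check received: triage-ops is fully functional")
        return "monitoring_ok"
    # All other accepted kinds would go to their existing processors.
    return "dispatched"
```

The point of the sketch is that an uptime check only produces the log entry if the request made it through the same acceptance path as regular events, which is what makes the log line a useful health signal.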
This must be merged before https://gitlab.com/gitlab-org/quality/engineering-productivity-infrastructure/-/merge_requests/391
related: #1352 (closed).
Expected impact & dry-runs
These are strongly recommended to assist reviewers and reduce the time to merge your change.
See https://gitlab.com/gitlab-org/quality/triage-ops/-/tree/master/doc/scheduled#testing-policies-with-a-dry-run on how to perform dry-runs for new policies.
See https://gitlab.com/gitlab-org/quality/triage-ops/-/blob/master/doc/reactive/best_practices.md#use-the-sandbox-to-test-new-processors on how to make sure a new processor can be tested.
Action items
- If adding environment variables for reactive processors, update config/triage-web.yaml and .gitlab/ci/triage-web.yml
- (If applicable) Add documentation to the handbook pages for Triage Operations =>
- (If applicable) Identify the affected groups and how to communicate to them:
  - /cc @ person_or_group =>
  - Relevant Slack channels =>
  - Engineering week-in-review