Consider vulnerability added_at field for SLOs
Context
Closes https://gitlab.com/gitlab-org/quality/triage-ops/-/issues/1402
What does this MR do and why?
When computing the breach_date, we take the datetime at which a Vulnerability::Vendor PackageFix Available or Vulnerability::Vendor Base ContainerFix Available label was added, instead of the datetime at which the priority label was added.
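The date selection described above can be sketched roughly as follows. This is not the code from lib/slo_breach_helper.rb: the `LabelEvent` struct and the `slo_start_date` and `breach_date` helpers are hypothetical names used for illustration, the SLO values are the pre-test ones visible in the diff below, and picking the most recent fix-available label when several exist is an assumption.

```ruby
require "date"

# Hypothetical stand-in for a label event; only the name and timestamp matter here.
LabelEvent = Struct.new(:name, :added_at)

# Label names as written in this description.
FIX_AVAILABLE_LABELS = [
  "Vulnerability::Vendor PackageFix Available",
  "Vulnerability::Vendor Base ContainerFix Available"
].freeze

# SLO targets in days, as shown in the slo_targets.rb hunk before the test override.
SLO_DAYS = {
  "severity::1" => 30,
  "severity::2" => 30,
  "severity::3" => 90,
  "severity::4" => 180
}.freeze

# Date from which the SLO clock starts counting.
def slo_start_date(events)
  fix_events = events.select { |e| FIX_AVAILABLE_LABELS.include?(e.name) }
  # Assumption: with several fix-available labels, use the most recent one.
  return fix_events.map(&:added_at).max.to_date unless fix_events.empty?

  # No fix-available label: fall back to the severity label's date.
  severity_event = events.find { |e| SLO_DAYS.key?(e.name) }
  severity_event&.added_at&.to_date
end

def breach_date(events)
  severity = events.map(&:name).find { |n| SLO_DAYS.key?(n) }
  start = slo_start_date(events)
  return nil unless severity && start

  start + SLO_DAYS[severity]
end

events = [
  LabelEvent.new("severity::3", Time.utc(2023, 7, 1)),
  LabelEvent.new("Vulnerability::Vendor PackageFix Available", Time.utc(2023, 10, 17))
]
puts breach_date(events)
```

In this sketch the clock starts on the day the fix-available label was added, not on the day the severity label was added.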
Proof of work
Setup
I had to test this one locally, as I didn't want to wait for all conditions to be met in a test project (i.e. add a severity label, then wait exactly 60 days to see the results). I applied the following diff locally to speed things up:
diff --git a/lib/constants/slo_targets.rb b/lib/constants/slo_targets.rb
index b81d9509..f5733713 100644
--- a/lib/constants/slo_targets.rb
+++ b/lib/constants/slo_targets.rb
@@ -11,7 +11,7 @@ module SloTargets
VULNERABILITY_SEVERITY_SLO_TARGETS = {
"severity::1" => 30,
"severity::2" => 30,
- "severity::3" => 90,
+ "severity::3" => 10, # Time we have to fix a `severity::3` issue
"severity::4" => 180
}.freeze
end
diff --git a/lib/slo_breach_helper.rb b/lib/slo_breach_helper.rb
index 18cfb779..ba9854d7 100644
--- a/lib/slo_breach_helper.rb
+++ b/lib/slo_breach_helper.rb
@@ -160,7 +160,7 @@ module SloBreachHelper
def severity_label_added_date
return nil unless severity_label_with_details&.added_at
- severity_label_with_details.added_at.to_date
+ (severity_label_with_details.added_at.to_date - 7) # To do as if we added the severity label 7 days ago
end
# In case we have many vulnerability_fix_available labels on the resource,
diff --git a/policies/groups/gitlab-org/hygiene/comment-vulnerability-issue-slo.yml b/policies/groups/gitlab-org/hygiene/comment-vulnerability-issue-slo.yml
index ba1911cc..8497d3cb 100644
--- a/policies/groups/gitlab-org/hygiene/comment-vulnerability-issue-slo.yml
+++ b/policies/groups/gitlab-org/hygiene/comment-vulnerability-issue-slo.yml
@@ -35,7 +35,7 @@ resource_rules:
- SLA::Near Breach
- SLA::Breached
ruby: |
- days_til_breach == 14
+ days_til_breach < 14 # To avoid doing math: I want to see any issue that will breach within 14 days
actions:
labels:
- SLA::Near Breach
And I'm using this issue as the testing issue.
Here's the command I ran locally:
cd ~/src/triage-ops
export TRIAGE_POLICY_FILE=policies/groups/gitlab-org/hygiene/comment-vulnerability-issue-slo.yml
export GITLAB_COM_API_TOKEN="${GITLAB_API_PRIVATE_TOKEN}"
export TRIAGE_SOURCE_TYPE=project
export TRIAGE_SOURCE_PATH=36072369 # https://gitlab.com/gitlab-org/quality/engineering-productivity/triage-ops-playground
export EXTRA_FLAGS="--dry-run"
gitlab-triage -r ./plugins/all --debug -f $TRIAGE_POLICY_FILE --token $GITLAB_COM_API_TOKEN --source $TRIAGE_SOURCE_TYPE --source-id $TRIAGE_SOURCE_PATH $EXTRA_FLAGS
Results
severity3 + Vulnerability::Vendor PackageFix Unavailable label
Scenario 1:~"" Engineering Manager ~"" Product Manager
This ~"FedRAMP::Vulnerability" ~"severity::3" issue is approaching its remediation SLO.
Consider taking action before this gets labeled as ~"SLA::Breached" in 3 days (2023-10-20).
/label ~"SLA::Near Breach" ~"SLO::Near Miss"
Since the severity3 label was added 7 days ago (thanks to the testing setup), it's expected that we get 10 - 7 = 3 days to fix it.
severity3 + Vulnerability::Vendor PackageFix Available label
Scenario 2:~"" Engineering Manager ~"" Product Manager
This ~"FedRAMP::Vulnerability" ~"severity::3" issue is approaching its remediation SLO.
Consider taking action before this gets labeled as ~"SLA::Breached" in 10 days (2023-10-27).
/label ~"SLA::Near Breach" ~"SLO::Near Miss"
Since we just added the Vulnerability::Vendor PackageFix Available label to the issue, it's expected that we get 10 days to fix it (the new default for severity3 in the testing setup).
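The arithmetic behind both scenarios can be checked with a few lines of Ruby. The run date of 2023-10-17 is inferred from the comments above (2023-10-20 being "in 3 days"), not stated explicitly, and the variable names are illustrative only.

```ruby
require "date"

run_date = Date.new(2023, 10, 17) # inferred from "in 3 days (2023-10-20)"
slo_days = 10                     # testing-only override for severity::3
shift    = 7                      # testing-only shift applied to the severity label date

# Scenario 1: no fix-available label, so the (shifted) severity label date is used.
scenario_1_breach = (run_date - shift) + slo_days

# Scenario 2: the fix-available label was added on the run date and is not shifted.
scenario_2_breach = run_date + slo_days

puts "Scenario 1: #{scenario_1_breach} (in #{(scenario_1_breach - run_date).to_i} days)"
puts "Scenario 2: #{scenario_2_breach} (in #{(scenario_2_breach - run_date).to_i} days)"
```

Both computed dates match the ones in the generated comments (2023-10-20 and 2023-10-27).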
Action items
- If adding environment variables for reactive processors, update config/triage-web.yaml and .gitlab/ci/triage-web.yml
- (If applicable) Add documentation to the handbook pages for Triage Operations
- (If applicable) Identify the affected groups and how to communicate to them:
  - /cc @ person_or_group
  - Relevant Slack channels
  - Engineering week-in-review