chore: Limit MarkDroppedAsResolved lookup to primary_ids only
What does this MR do and why?
Previously, when primary_identifiers
were provided (only in the case of SAST),
we would resolve any identifier types that were resolved on
the default branch regardless of whether they were present within the
report itself.
This isn't an issue if a pipeline contains all scanners that return no primary identifiers.
This is also not an issue if a pipeline contains all scanners that return primary identifiers.
However, we could accidentally auto-resolve vulnerabilities which now co-exist in merged reports
which then contain a subset of the total identifiers
after merging multiple together.
See discovery and description of scenario in #368284 (comment 1220388069)
database changes
Vulnerability.by_primary_identifier_ids
Refactor join on This join is only used within MarkDroppedAsResolvedWorker
and joins unnecessarily across all identifiers when all we care about is the primary identifier.
Updated query plan https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/14174/commands/49741
Reindex common_finder_query with resolved_on_default_branch
Drop previously added index with !100044 (merged) replacing with union of fields from index_vuln_reads_common_query_on_resolved_on_default_branch
.
❯ be rake db:migrate
main: == 20230104220137 ReindexVulnReadsOnDefaultBranchWithCommonQuery: migrating ===
main: -- transaction_open?()
main: -> 0.0001s
main: -- view_exists?(:postgres_partitions)
main: -> 0.1246s
main: -- index_exists?(:vulnerability_reads, [:project_id, :state, :report_type, :vulnerability_id], {:name=>"index_vuln_reads_common_query_on_resolved_on_default_branch", :where=>"resolved_on_default_branch IS TRUE", :order=>{:vulnerability_id=>:desc}, :algorithm=>:concurrently})
main: -> 0.0071s
main: -- execute("SET statement_timeout TO 0")
main: -> 0.0002s
main: -- add_index(:vulnerability_reads, [:project_id, :state, :report_type, :vulnerability_id], {:name=>"index_vuln_reads_common_query_on_resolved_on_default_branch", :where=>"resolved_on_default_branch IS TRUE", :order=>{:vulnerability_id=>:desc}, :algorithm=>:concurrently})
main: -> 0.0234s
main: -- execute("RESET statement_timeout")
main: -> 0.0004s
main: == 20230104220137 ReindexVulnReadsOnDefaultBranchWithCommonQuery: migrated (0.1652s)
main: == 20230104224020 DropVulnReadsOnDefaultBranchIndex: migrating ================
main: -- transaction_open?()
main: -> 0.0000s
main: -- view_exists?(:postgres_partitions)
main: -> 0.0006s
main: -- indexes(:vulnerability_reads)
main: -> 0.0054s
main: -- remove_index(:vulnerability_reads, {:algorithm=>:concurrently, :name=>"index_vuln_reads_on_resolved_on_default_branch"})
main: -> 0.0027s
main: == 20230104224020 DropVulnReadsOnDefaultBranchIndex: migrated (0.0135s) =======
Index creation
CREATE INDEX index_vuln_reads_common_query_on_resolved_on_default_branch ON vulnerability_reads USING btree (project_id, state, report_type, vulnerability_id DESC) WHERE (resolved_on_default_branch IS TRUE);
Response:
The query has been executed. Duration: 3.862 min
https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/14246/commands/50001
Query utilizing new index
https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/14246/commands/50006
How to set up and validate locally
Feature.enabled(:sec_mark_dropped_findings_as_resolved)
- Run a pipeline containing non-overlapping vulnerabilities identifiable by both
semgrep
and another scanner (i.e.spotbugs
) - Remove the semgrep rule (but ensure semgrep's ruleset is not empty or auto-resolution will not occur)
- Run a new pipeline
- Ensure the
spotbugs
result is not auto-resolved
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
-
I have evaluated the MR acceptance checklist for this MR.