Agent vulnerability resolution doesn't account for multiple workloads / agents

Summary

Agent vulnerability resolution works by submitting a list of Vulnerability UUIDs which were identified in a scan, and resolving all the active cluster image scanning vulnerabilities which have UUIDs that do not appear in the list. This does not behave as expected when there are multiple workloads, because each workload is scanned in a separate goroutine, and each goroutine submits UUIDs to internal/kubernetes/modules/starboard_vulnerability/scan_result individually. When we scan more than one workload, we encounter this conflict:

1. Workload A is scanned
2. Vulnerabilities for workload A are created in GitLab
3. Vulnerabilities not present in the scan for workload A are resolved
4. Workload B is scanned
5. Vulnerabilities for workload B are created in GitLab
6. Vulnerabilities not present in the scan for workload B are resolved
   1. This resolves all the vulnerabilities from the scan of Workload A, even if they are still present.

Attempting to use scanning with multiple agents in one project will result in a similar problem. When scanning with two agents A and B, agent B will mark all of the vulnerabilities detected by agent A as resolved upon completion.

Steps to reproduce

Ensure your GDK installation runs KAS from master:

# from GDK root
echo master > gitlab/GITLAB_KAS_VERSION
make gitlab-k8s-agent-update-run
gdk restart gitlab-k8s-agent

Create a new local project.
Connect an Agent in an existing cluster.
Tunnel your local KAS to make it reachable from within the cluster (I used ngrok, for example). Patch the deployment's --kas-address to point to the tunneled KAS.

Create two deployments, e.g.:

kubectl create deployment ubuntu --image ubuntu:18.04
kubectl create deployment nginx --image nginx:1.20.0

Navigate to the project's "Operational vulnerabilities" tab in the Security Report. Filter for "Resolved" with the status dropdown and find that all vulnerabilities have been resolved:

Example Project

What is the current bug behavior?

What is the expected correct behavior?

Relevant logs and/or screenshots

Possible fixes

Change how agentk does vulnerability resolution:
1. As scans are running, create a consolidated list of vulnerability UUIDs from all the different scan goroutines
2. Once all the scans have been completed, submit a scan_result request with the UUIDs from all workloads

Another option would be to do a composite query on the Rails end for undetected vulnerabilities, where we also search for the location UUID. However, this would not resolve vulnerabilities for workloads which no longer exist, so I believe the consolidated-list approach above is better.

In order to support multiple agents, we also need to update StarboardVulnerabilityResolveService so that it queries for undetected findings by agent_id.

Implementation plan

This MR (Fix resolving cluster image scanning vulnerabil... (!91121 - merged)) was created to verify this implementation plan; you can use it to check whether the plan works in our case.

backend: modify the Transmit function in starboard_vulnerability/agent/reporter.go (https://gitlab.com/gitlab-org/cluster-integration/gitlab-agent/-/blob/master/internal/module/starboard_vulnerability/agent/reporter.go#L36) to additionally return the UUIDs, and make the resolveVulnerabilities function public.
backend: modify the scan function in starboard_vulnerability/agent/scanner.go (https://gitlab.com/gitlab-org/cluster-integration/gitlab-agent/-/blob/master/internal/module/starboard_vulnerability/agent/scanner.go#L102) to collect the UUIDs from all reports and then send them to the API using the newly public ResolveVulnerabilities function.
backend: add the scope :with_findings_for_agent_id to ee/app/models/ee/vulnerability.rb.
backend: extend the undetected method in ee/app/services/vulnerabilities/starboard_vulnerability_resolve_service.rb to include only vulnerabilities with_findings_for_agent_id.

Edited Jul 15, 2022 by Alan (Maciej) Paruszewski