WIP: Add grype scanner
Implements gitlab-org/gitlab#326279 (closed)
Overview
The existing analyzer code uses a single scanner — Trivy. The objective of this MR is to add a second scanner — for Grype — and to introduce a preliminary scanner-generalization strategy to the analyzer code such that scanners can be selected at runtime, and new scanners can be fitted into this analyzer with minimal effort.
The scanner (grype)
The Grype core team has added new functionality to Grype in order to facilitate a better integration with GitLab's container scanning analyzer:
- Pulling images directly from a container registry (rather than rely on the Docker daemon)
- Processing user-provided templates to control the output data format (to enable creating GitLab-specific Grype template(s))
These features are now released in Grype v0.10.2.
The analyzer (container-scanning)
Some context: I'm new to Ruby — please critique without restraint!
What this MR does so far:
- Creates a new
Dockerfile-grype
(let's find a better name!) for building container images that include the Grype binary and its database file (suitable for offline environments) - Adds a new
--scanner
switch togtcs
, so the caller can select "trivy" or "grype". "Trivy" is selected by default (note: this can be changed from a CLI option to an env var if needed) - Adds a new Grype template to generate the output required by GitLab
- Removes a few references to "Trivy" where we'll now operate on a generalized/abstracted "scanner" (such as in the logger formatter)
What's left to do:
- Add tests for the newly added Grype code paths
- Map analyzer to user-facing "container scanning" feature (e.g. config variables) and document how to select Grype instead of Trivy — I could use a brief primer on this part of the architecture
😃 - Add process for building and pushing Grype-specific analyzer container image to GitLab's CI workflow — we discussed a few variants to this, but we should revisit (and this setup will need help from the GitLab team)
- Learn how the "remediations" feature works and ensure Grype plays well with it
- Final vetting to ensure readiness for customers
Questions and notes I've accumulated so far:
-
I had some trouble with the documentation for running integration tests:
- For running within a container, I found that
bundle exec rake integration
doesn't work, because it tries to execute adocker ...
command without access to the Docker client. I also found that I had to runbundle install
prior to being able to run the project within the container. - The command
docker run ...
wasn't executing correctly because there were new lines without continuations (i.e.\
). I've fixed this in my branch.
- For running within a container, I found that
-
I noticed that in addition to specifying the GitLab data format via a template that Trivy processes directly, there is also logic downstream in the analyzer (such as in
converter.rb
andvulnerability.rb
) that further transforms the data. What are the guiding architectural principles here?- One thought I had was that the scanner itself (Trivy and Grype) should be responsible for preparing data in as "final" a state as possible, to minimize scanner-specific logic in the analyzer needed to finalize the report data.
- Some data appears to be added to Trivy's output via the template, where it might be best inserted downstream in the analyzer code, such as the scan
start_time
,end_time
, andstatus
. Right now, it looks like the template has the scanner adding these as empty fields in its output, and then the field values are modified downstream in the analyzer code.
-
I had some questions specifically on the scanner output template and the resulting data...
- It looks like the scanner (
... "id": "trivy" ...
) is identified both at the root of the report JSON document as well as within each individualvulnerabilities
array item. I'm guessing this is done intentionally, so I was just curious what the motivation is. - Vulnerabilities have a
cve
field. There are occasional cases where a vulnerability won't have a referenced CVE. How should we handle this? The two paths I can think of are: a) use the vulnerability ID value anyway, and b) discard the vulnerability altogether. I also see there is aidentifiers
field further down — perhaps this is a better source of truth?
- It looks like the scanner (
I'm looking forward to your feedback and more discussion!