Ingest source package name from Trivy SBOM component properties
Proposal
As discussed here, when looking up advisories for a package, trivy first uses the source package, if available, and falls back to the package name. For example, the package libperl5.38 has Source: perl listed in the dpkg manifest:
$ docker run -it --rm registry.gitlab.com/gitlab-org/security-products/analyzers/gemnasium/tmp/python:5db727fd3df8c65d8d85ed470ee79624d728217c bash
root@547bb47b6a06:/# grep -A 9 'Package: libperl5.38' /var/lib/dpkg/status
Package: libperl5.38
Status: install ok installed
Priority: optional
Section: libs
Installed-Size: 29325
Maintainer: Niko Tyni <ntyni@debian.org>
Architecture: amd64
Multi-Arch: same
Source: perl <--------------------------------------- SOURCE PACKAGE IS `perl`
Version: 5.38.0-2
As such, the trivy-db that we use as the source of advisories does not contain vulnerability information for the package libperl5.38, but instead contains advisory information for the source package perl.
When trivy scans an image, if a source package has a vulnerability, trivy considers all packages that have the same source package as being vulnerable.
For example, if perl <= 5.38.0-2 is vulnerable to a particular CVE, then the following packages are also vulnerable, because they all list perl as the source package:
+------------+--------------+---------------------------+-------------------+------------------------------------------------------------------------+
| Unapproved | High | libperl5.38 | 5.38.0-2 | CPAN.pm before 2.35 does not verify TLS certificates when downloading |
| | | | | distributions over HTTPS. |
+------------+--------------+---------------------------+-------------------+------------------------------------------------------------------------+
| Unapproved | High | perl | 5.38.0-2 | CPAN.pm before 2.35 does not verify TLS certificates when downloading |
| | | | | distributions over HTTPS. |
+------------+--------------+---------------------------+-------------------+------------------------------------------------------------------------+
| Unapproved | High | perl-base | 5.38.0-2 | CPAN.pm before 2.35 does not verify TLS certificates when downloading |
| | | | | distributions over HTTPS. |
+------------+--------------+---------------------------+-------------------+------------------------------------------------------------------------+
| Unapproved | High | perl-modules-5.38 | 5.38.0-2 | CPAN.pm before 2.35 does not verify TLS certificates when downloading |
| | | | | distributions over HTTPS. |
+------------+--------------+---------------------------+-------------------+------------------------------------------------------------------------+
(see this job for details)
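The lookup rule described above (source package first, package name as fallback) can be sketched in a few lines of Ruby. This is only an illustration and not trivy's implementation: the `Package` struct, the `advisory_lookup_name` helper, and the `ADVISORIES` index with its `EXAMPLE-ADVISORY` identifier are all hypothetical, and version range checks are left out.

```ruby
# Minimal stand-in for an installed package. `source_name` is nil when the
# package manifest lists no separate source package.
Package = Struct.new(:name, :source_name, :version, keyword_init: true)

# Name used for the advisory lookup: the source package if available,
# otherwise the package name itself.
def advisory_lookup_name(package)
  package.source_name || package.name
end

# Hypothetical advisory index keyed by source package name.
# Version comparison is deliberately omitted from this sketch.
ADVISORIES = { 'perl' => ['EXAMPLE-ADVISORY'] }.freeze

PACKAGES = [
  Package.new(name: 'libperl5.38', source_name: 'perl', version: '5.38.0-2'),
  Package.new(name: 'perl', source_name: nil, version: '5.38.0-2'),
  Package.new(name: 'perl-base', source_name: 'perl', version: '5.38.0-2'),
  Package.new(name: 'perl-modules-5.38', source_name: 'perl', version: '5.38.0-2'),
  Package.new(name: 'bash', source_name: nil, version: '5.2.15-2')
].freeze

# Every package whose lookup name has an advisory is reported as affected.
AFFECTED = PACKAGES.select { |pkg| ADVISORIES.key?(advisory_lookup_name(pkg)) }.map(&:name)

puts AFFECTED.inspect
```

All four perl-derived packages are selected because they resolve to the same lookup name, while the unrelated package is not.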
When we ingest an SBOM for Container Scanning, we currently only store the following fields:
- type
- name
- purl
- version

For example, for the package libperl5.38, we have the following fields and values:

| Field | Value |
|---|---|
| type | library |
| name | libperl5.38 |
| purl | pkg:deb/debian/libperl5.38@5.38.0-2?distro=debian-12.1 |
| version | 5.38.0-2 |
This presents a problem because, as stated earlier, the trivy-db does not contain affected package information for libperl5.38, but instead for the source package perl. We currently have no way of correlating the libperl5.38 package to the source package perl from the above details alone.
However, the source SBOM does contain this information in the properties field; we just don't currently ingest it.
For example, trivy produces an SBOM with the source package perl in the aquasecurity:trivy:SrcName property:
{
  "components": [
    {
      "bom-ref": "pkg:deb/debian/libperl5.38@5.38.0-2?distro=debian-12.1",
      "type": "library",
      "name": "libperl5.38",
      "version": "5.38.0-2",
      "purl": "pkg:deb/debian/libperl5.38@5.38.0-2?distro=debian-12.1",
      "properties": [
        {
          "name": "aquasecurity:trivy:SrcName",
          "value": "perl"
        }
      ]
    }
  ]
}
Similarly, syft produces an SBOM with the source package perl in the syft:metadata:source property:
{
  "components": [
    {
      "bom-ref": "pkg:deb/debian/libperl5.38@5.38.0-2?arch=amd64&upstream=perl&distro=debian-12&package-id=c2dfca7103136fcb",
      "type": "library",
      "publisher": "Niko Tyni <ntyni@debian.org>",
      "name": "libperl5.38",
      "version": "5.38.0-2",
      "cpe": "cpe:2.3:a:libperl5.38:libperl5.38:5.38.0-2:*:*:*:*:*:*:*",
      "purl": "pkg:deb/debian/libperl5.38@5.38.0-2?arch=amd64&upstream=perl&distro=debian-12",
      "properties": [
        {
          "name": "syft:metadata:source",
          "value": "perl"
        }
      ]
    }
  ]
}
In order to properly match packages such as libperl5.38 against advisories in the trivy-db for the source package perl, we need to update the SBOM ingestion code in the Rails monolith to also store the source package from the component.properties, for trivy-produced SBOMs only. That is the purpose of this issue.
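As a rough illustration of what reading the source package from component.properties involves, here is a standalone Ruby sketch. It is not the GitLab parser: the `source_package_name` helper and the `TRIVY_SOURCE_NAME_PROPERTY` constant are hypothetical names, and the input is a trimmed copy of the trivy excerpt above plus one made-up component without properties.

```ruby
require 'json'

# Property name taken from the trivy-produced SBOM excerpt.
TRIVY_SOURCE_NAME_PROPERTY = 'aquasecurity:trivy:SrcName'

# Hypothetical helper: returns the source package name for one CycloneDX
# component hash, or nil when the component has no such property.
def source_package_name(component)
  properties = component['properties'] || []
  property = properties.find { |prop| prop['name'] == TRIVY_SOURCE_NAME_PROPERTY }
  property && property['value']
end

SBOM = JSON.parse(<<~SBOM_JSON)
  {
    "components": [
      {
        "type": "library",
        "name": "libperl5.38",
        "version": "5.38.0-2",
        "purl": "pkg:deb/debian/libperl5.38@5.38.0-2?distro=debian-12.1",
        "properties": [
          { "name": "aquasecurity:trivy:SrcName", "value": "perl" }
        ]
      },
      { "type": "library", "name": "component-without-properties", "version": "1.0" }
    ]
  }
SBOM_JSON

# Map each component name to its source package name (nil when missing).
SOURCE_NAMES = SBOM['components'].to_h do |component|
  [component['name'], source_package_name(component)]
end

puts SOURCE_NAMES.inspect
```

Components without a properties array yield nil, which is why the ingestion side has to tolerate a missing source package name.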
Proposals
Previous implementation plan
- Add a new source_package_name field to Gitlab::Ci::Reports::Sbom::Component.
- Add a new source_package_name field to the Sbom::ComponentVersion model:
  - Create a migration to add source_package_name to the sbom_component_versions table.
  - Add a new index to the sbom_component_versions table:

    Note: the previous implementation plan was about adding a field to sbom_components; please see this thread.

    Original index suggestion, which doesn't work:

        "index_sbom_components_on_component_type_source_package_name_and_purl_type" UNIQUE, btree (source_package_name, purl_type, component_type)

    Note: there's a problem with this index due to the UNIQUE keyword, as explained here. Because of this, we'll need to remove the UNIQUE keyword, as shown in the revised index below.

    Revised index (as discussed here):

        "index_sbom_components_on_component_type_source_package_name_and_purl_type" btree (source_package_name, purl_type, component_type)

- Update Gitlab::Ci::Parsers::Sbom::Cyclonedx#parse_components to ingest the components[].properties[].aquasecurity:trivy:SrcName value and store it in sbom_components.source_package_name.
- Add unit tests.
Implementation Plan
- Add a new sbom_source_packages table:
  - Add sbom_source_packages table (!140539 - merged) • Adam Cohen • 16.8
  - Add timestamps for the sbom_source_packages table to enable the ingestion process. The ingestion framework requires the table to have timestamps. Add timestamp for sbom_source_packages (!142006 - merged) • Tetiana Chupryna • 16.9
- Add a source_package_name method to the Sbom::SourceHelper module. It returns the value of data['SrcName'].
- Delegate the source_package_name method to the properties and allow nil (components may not have any properties).

      delegate :source_package_name, to: :properties, allow_nil: true

- Update the Sbom::Ingestion::OccurrenceMap class so that it includes a source_package_id accessor. Update the #to_h method so that it outputs source_package_id: source_package_id in the resulting hash. Delegate #source_package_name to the :report_component.
- Add a new task to the Sbom::Ingestion::Tasks namespace. This task will include the Gitlab::Ingestion::BulkInsertableTask module.
  - Name the task IngestSourcePackageNames.
  - Set self.model to Sbom::SourcePackage.
  - Set self.uses to %i[name purl_type id].freeze. The :id will be used to set the source_package_id column, and the :name and :purl_type are used as the key for the :id value in a @maps_grouped_by_uniq_attrs hash map.
  - Set self.unique_by to %i[name purl_type].freeze.
  - Add an #attributes method that returns an array of hashes like so:

        occurrence_maps.filter(&:source_package_name).map do |occurrence_map|
          { name: occurrence_map.source_package_name, purl_type: occurrence_map.purl_type }
        end

  - Add an after_ingest method that sets the returned id value as the source_package_id using the values from @maps_grouped_by_uniq_attrs. See Sbom::Ingestion::Tasks::IngestComponents for an example implementation.
- Update the IngestReportSliceService::TASKS array. Add the newly created IngestSourcePackageNames before the IngestOccurrences task.
- Update the Sbom::Ingestion::Tasks::IngestOccurrences attributes so that it includes source_package_id: occurrence_map.source_package_id in the hash output.
- Ensure that the related specs are updated. The following files in ee/spec/services/sbom/ingestion/ will be affected:
  - occurrence_map_spec.rb: test that the source_package_id is assigned in "when ids are assigned" and that it delegates the source_package_name correctly.
  - tasks/ingest_occurrences_spec.rb: ensure that the #attributes method sets the source_package_id attribute correctly when it's nil and when it's not nil.
  - tasks/ingest_source_packages_spec.rb: ensure that it is idempotent, that the unique_by constraints are utilized, that the correct attributes are used (nil source package names are removed), and that the expected attributes are set after ingest.
    - For example, you could verify that the perl and perl-base components both have the same source_package_id set because they both belong to the perl source package.
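The data flow of the new ingestion task can be simulated outside Rails. The sketch below is an assumption-laden simplification, not the real task: `OccurrenceMap` is reduced to a struct, `FakeSourcePackageTable` is a hypothetical in-memory stand-in for the sbom_source_packages table, and `ingest_source_package_names` folds the #attributes and after_ingest steps into one method instead of relying on Gitlab::Ingestion::BulkInsertableTask.

```ruby
# Reduced stand-in for Sbom::Ingestion::OccurrenceMap (only the fields used here).
OccurrenceMap = Struct.new(:name, :purl_type, :source_package_name, :source_package_id,
                           keyword_init: true)

# Hypothetical in-memory table that is unique on [name, purl_type].
class FakeSourcePackageTable
  def initialize
    @ids = {}
  end

  # Mimics an upsert returning [name, purl_type, id] per row, matching the
  # order of the `uses` attributes in the plan.
  def upsert_all(rows)
    rows.map do |row|
      key = [row[:name], row[:purl_type]]
      @ids[key] ||= @ids.size + 1
      [*key, @ids[key]]
    end
  end

  def count
    @ids.size
  end
end

def ingest_source_package_names(occurrence_maps, table)
  # Equivalent of #attributes: drop maps without a source package name.
  with_source = occurrence_maps.select(&:source_package_name)
  attributes = with_source.map do |map|
    { name: map.source_package_name, purl_type: map.purl_type }
  end.uniq

  # Equivalent of after_ingest: copy each returned id back onto its maps.
  grouped = with_source.group_by { |map| [map.source_package_name, map.purl_type] }
  table.upsert_all(attributes).each do |name, purl_type, id|
    grouped.fetch([name, purl_type]).each { |map| map.source_package_id = id }
  end
end

TABLE = FakeSourcePackageTable.new
MAPS = [
  OccurrenceMap.new(name: 'perl', purl_type: 'deb', source_package_name: 'perl'),
  OccurrenceMap.new(name: 'perl-base', purl_type: 'deb', source_package_name: 'perl'),
  OccurrenceMap.new(name: 'no-source', purl_type: 'deb', source_package_name: nil)
].freeze

# Running the ingestion twice should not create duplicate rows.
2.times { ingest_source_package_names(MAPS, TABLE) }
puts MAPS.map(&:source_package_id).inspect
```

This mirrors the checks suggested for tasks/ingest_source_packages_spec.rb: perl and perl-base share one id, the component without a source package keeps a nil id, and a second run leaves the row count unchanged.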
Validation testing
- Validate Update PossiblyAffectedOccurrencesFinder to wor... (#428681 - closed).
- Create a project with the following content in .gitlab-ci.yml:

      variables:
        CS_IMAGE: 'golang:1.20-alpine'
      include:
        - template: Jobs/Container-Scanning.gitlab-ci.yml

- Run a pipeline and make sure that the container_scanning:cyclonedx report is created.
GDK
In the Rails console, run:
Sbom::ComponentVersion.where(component: Sbom::Component.find_by(name: 'alpine-baselayout-data'))
Check that the source_package_name field is equal to alpine-baselayout.
GitLab.com
After deployment, validate that no new errors are logged and that there is no regression in the Group Dependency List.
/cc @gonzoyumo @smeadzinger @fcatteau