Normalize SBoM component names on ingestion
What does this MR do and why?
Addresses: #375765 (closed)
Components of certain package types, most notably Python (pypi), require name normalization because different spellings can point to the same package. For example, specifying flask-sqlalchemy or Flask_SQLAlchemy in requirements.txt downloads the same package, https://pypi.org/project/flask-sqlalchemy/. This MR normalizes SBoM component names when storing them, so that creating a pypi component named flask-sqlalchemy and a pypi component named Flask_SQLAlchemy does not produce separate entries in the DB. To do this, we use the PackageUrl normalizer implemented in !103406 (merged). Note that purl_type comes from (an instance of ::Sbom::PackageUrl).type.
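As a rough illustration of the kind of normalization involved, the sketch below lowercases pypi names and replaces underscores with dashes, which is the rule the purl specification gives for the pypi type. The method normalize_component_name is a hypothetical stand-in, not the normalizer from !103406, and other package types are passed through unchanged only to keep the example short.

```ruby
# Hypothetical helper for illustration only; the real logic lives in the
# PackageUrl normalizer added in !103406 and may differ in detail.
def normalize_component_name(purl_type, name)
  case purl_type
  when 'pypi'
    # purl spec rule for pypi: lowercase, and treat "_" the same as "-".
    name.downcase.tr('_', '-')
  else
    # Assumption: other types are left untouched in this sketch.
    name
  end
end

names = ['flask-sqlalchemy', 'Flask_SQLAlchemy']

# Both spellings collapse to a single normalized name.
normalized = names.map { |name| normalize_component_name('pypi', name) }.uniq
p normalized # => ["flask-sqlalchemy"]
```

With both spellings reduced to the same string before storage, a lookup by name and purl_type finds the existing component instead of creating a second one.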
How to set up and validate locally
- Enable the cyclonedx_sbom_ingestion feature flag using the rails console: Feature.enable(:cyclonedx_sbom_ingestion)
- Create a new project from a template; use the NodeJS/Express template.
- Create a .gitlab-ci.yml file with this configuration:
      include:
        - template: Security/Dependency-Scanning.gitlab-ci.yml
- Add a requirements.txt file that contains Django. A new pipeline is triggered.
- Verify that the gemnasium-dependency_scanning job outputs a gl-sbom-npm-npm.cdx.json artifact.
- Connect to the DB with gdk psql and run this query, replacing YOUR_PIPELINE_ID with the ID of the triggered pipeline:
      select name, version, purl_type
      from sbom_components
      inner join sbom_component_versions on sbom_components.id = sbom_component_versions.component_id
      inner join sbom_occurrences on sbom_component_versions.id = sbom_occurrences.component_version_id
      where pipeline_id = YOUR_PIPELINE_ID;
- Ensure that:
  - It contains a record where name is cookie-parser and PURL type is npm (6).
  - It contains a record where name is django and PURL type is pypi (8).
  - It does NOT contain a record where name is Django and PURL type is pypi (8).
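The three checks in the last step can also be expressed as a small Ruby sketch over the query result. The rows below are placeholder data in the shape [name, version, purl_type], with made-up versions; includes_component? is a hypothetical helper, and the numeric types 6 (npm) and 8 (pypi) are the values quoted in the step.

```ruby
# Placeholder rows standing in for the query result: [name, version, purl_type].
# The versions are invented for illustration and carry no meaning.
rows = [
  ['cookie-parser', '0.0.0', 6],
  ['django', '0.0.0', 8]
]

npm_type = 6
pypi_type = 8

# Hypothetical helper: true when a row with this exact name and type exists.
def includes_component?(rows, name, purl_type)
  rows.any? { |row_name, _version, row_type| row_name == name && row_type == purl_type }
end

checks = {
  'npm cookie-parser is present' => includes_component?(rows, 'cookie-parser', npm_type),
  'pypi django is present' => includes_component?(rows, 'django', pypi_type),
  'pypi Django is absent' => !includes_component?(rows, 'Django', pypi_type)
}

checks.each { |label, ok| puts "#{ok ? 'PASS' : 'FAIL'}: #{label}" }
```

The name comparison is deliberately case-sensitive, since the point of the last check is that only the normalized spelling is stored.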
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.