
Normalize SBoM component names on ingestion

Brian Williams requested to merge bwill/normalize-component-names into master

What does this MR do and why?


Addresses: #375765 (closed)

Certain package types, most notably Python (PyPI), require name normalization because differently spelled names can refer to the same package.

For example, specifying either flask-sqlalchemy or Flask_SQLAlchemy in requirements.txt downloads the same package, https://pypi.org/project/flask-sqlalchemy/. This MR normalizes SBoM component names when storing them, so that creating a pypi component named flask-sqlalchemy and a pypi component named Flask_SQLAlchemy does not produce separate entries in the DB. Normalization uses the PackageUrl normalizer implemented in !103406 (merged). Note that purl_type comes from the type of a ::Sbom::PackageUrl instance.
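As a rough illustration of the idea, here is a minimal Ruby sketch. It is not the normalizer from !103406. The helper name normalize_component_name is hypothetical, and the rule it applies (lowercase the name and replace underscores with dashes) is an assumption based on the purl specification's rule for the pypi type. Other purl types are left untouched here purely to keep the sketch short.

```ruby
# Hypothetical helper for illustration only; the real normalization lives in
# the PackageUrl normalizer from !103406 and may differ in detail.
def normalize_component_name(purl_type, name)
  # Assumption: only pypi names are normalized in this sketch.
  return name unless purl_type == 'pypi'

  # Assumed pypi rule: lowercase, then replace underscores with dashes.
  name.downcase.tr('_', '-')
end

names = ['flask-sqlalchemy', 'Flask_SQLAlchemy']

# Both spellings collapse to one normalized name, so only one
# component record would be needed.
normalized = names.map { |n| normalize_component_name('pypi', n) }.uniq
puts normalized.inspect
```

Because both spellings map to the same string, a lookup keyed on the normalized name and purl_type would find the existing component instead of creating a second one.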

How to set up and validate locally


  1. Ensure that you have an EE license

  2. Enable the cyclonedx_sbom_ingestion feature flag using the rails console: Feature.enable(:cyclonedx_sbom_ingestion)

  3. Set up GitLab Runner.

  4. Create a new project from the NodeJS/Express template.

  5. Create a .gitlab-ci.yml file with this configuration:

    include:
      - template: Security/Dependency-Scanning.gitlab-ci.yml

  6. Add a requirements.txt file that contains Django. This triggers a new pipeline.

  7. Verify that the gemnasium-dependency_scanning job outputs a gl-sbom-npm-npm.cdx.json artifact.

  8. Connect to the DB with gdk psql. Run this query:

    select
      name, version, purl_type
    from
      sbom_components
    inner join sbom_component_versions
      on sbom_components.id = sbom_component_versions.component_id
    inner join sbom_occurrences
      on sbom_component_versions.id = sbom_occurrences.component_version_id
    where pipeline_id = YOUR_PIPELINE_ID;

  9. Ensure that the result:

    • Contains a record whose name is cookie-parser and whose PURL type is npm (6).
    • Contains a record whose name is django and whose PURL type is pypi (8).
    • Does NOT contain a record whose name is Django and whose PURL type is pypi (8).
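The last two checks can also be expressed as a small script. The Ruby sketch below assumes the query rows have been copied into an array of hashes. The helper unnormalized_pypi_names and the sample rows are hypothetical, and the pypi rule (lowercase, underscores to dashes) is an assumption rather than the exact behavior of the normalizer used in this MR.

```ruby
# purl_type value for pypi, as listed in the checks above.
PYPI_PURL_TYPE = 8

# Hypothetical helper: returns the names of pypi rows that are not already
# in the assumed normalized form (lowercase, underscores replaced by dashes).
def unnormalized_pypi_names(rows)
  rows
    .select { |row| row[:purl_type] == PYPI_PURL_TYPE }
    .reject { |row| row[:name] == row[:name].downcase.tr('_', '-') }
    .map { |row| row[:name] }
end

# Illustrative sample rows, not real query output.
rows = [
  { name: 'cookie-parser', purl_type: 6 },
  { name: 'django', purl_type: 8 }
]

offenders = unnormalized_pypi_names(rows)
puts offenders.empty? ? 'all pypi names are normalized' : "not normalized: #{offenders.inspect}"
```

An empty result corresponds to the expected outcome: django is present and no Django record exists.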

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Brian Williams
