Allow filtering the dependency list by unknown license
What does this MR do and why?
Allow for unknown licenses to be searched for on the dependency list via
the licenses[]=unknown
query string parameter. A dependency's license is "unknown"
if it does not have any licenses associated with it. Licenses are currently stored
in a jsonb column sbom_occurrences.licenses
which consists of an array of objects.
When the user wants to see unknown licenses, we search for sbom_occurrences
where
the licenses column is an empty array.
Query
The existing query is known to perform poorly. See !132304 (comment 1579176970) for the discussion where it was introduced. We plan to address this via multiple improvements in &12371 (closed).
Before:
WITH our_occurrences AS MATERIALIZED (
SELECT
DISTINCT ON ( sbom_occurrences.component_version_id )
sbom_occurrences.*,
count( sbom_occurrences.id ) OVER ( PARTITION BY sbom_occurrences.component_version_id ) AS occurrence_count,
dense_rank() OVER (
PARTITION BY sbom_occurrences.component_version_id
ORDER BY project_id
) AS project_count
FROM
sbom_occurrences
WHERE
sbom_occurrences.project_id IN (
SELECT
projects.id
FROM
projects
WHERE
projects.namespace_id IN (
SELECT
namespaces.id
FROM
unnest( COALESCE( ( SELECT ids FROM ( SELECT namespace_descendants.self_and_descendant_group_ids AS ids FROM namespace_descendants WHERE namespace_descendants.outdated_at IS NULL AND namespace_descendants.namespace_id = 9970 ) AS cached_query ), ( SELECT ids FROM ( SELECT array_agg( namespaces.id ) AS ids FROM ( SELECT namespaces.traversal_ids[array_length( namespaces.traversal_ids, 1 )] AS id FROM namespaces WHERE namespaces.type = 'Group' AND traversal_ids @> '{9970}' ) AS namespaces ) AS consistent_query ) ) ) AS namespaces ( id )
) AND
projects.marked_for_deletion_at IS NULL AND
projects.pending_delete = false
) AND
sbom_occurrences.component_version_id IS NOT NULL
ORDER BY
sbom_occurrences.component_version_id DESC,
project_count DESC,
sbom_occurrences.id DESC
)
SELECT
id,
created_at,
updated_at,
component_version_id,
project_id,
pipeline_id,
source_id,
commit_sha,
component_id,
uuid,
package_manager,
component_name,
input_file_path,
licenses,
highest_severity,
vulnerability_count,
source_package_id,
occurrence_count,
project_count
FROM
our_occurrences AS sbom_occurrences
WHERE
( licenses #> '{0,spdx_identifier}' ) ?| ARRAY[ 'Apache-2.0' ] OR
( licenses #> '{1,spdx_identifier}' ) ?| ARRAY[ 'Apache-2.0' ]
ORDER BY
id ASC;
https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/25787/commands/81283
After:
WITH our_occurrences AS MATERIALIZED (
SELECT
DISTINCT ON ( sbom_occurrences.component_version_id )
sbom_occurrences.*,
count( sbom_occurrences.id ) OVER ( PARTITION BY sbom_occurrences.component_version_id ) AS occurrence_count,
dense_rank() OVER (
PARTITION BY sbom_occurrences.component_version_id
ORDER BY project_id
) AS project_count
FROM
sbom_occurrences
WHERE
sbom_occurrences.project_id IN (
SELECT
projects.id
FROM
projects
WHERE
projects.namespace_id IN (
SELECT
namespaces.id
FROM
unnest( COALESCE( ( SELECT ids FROM ( SELECT namespace_descendants.self_and_descendant_group_ids AS ids FROM namespace_descendants WHERE namespace_descendants.outdated_at IS NULL AND namespace_descendants.namespace_id = 9970 ) AS cached_query ), ( SELECT ids FROM ( SELECT array_agg( namespaces.id ) AS ids FROM ( SELECT namespaces.traversal_ids[array_length( namespaces.traversal_ids, 1 )] AS id FROM namespaces WHERE namespaces.type = 'Group' AND traversal_ids @> '{9970}' ) AS namespaces ) AS consistent_query ) ) ) AS namespaces ( id )
) AND
projects.marked_for_deletion_at IS NULL AND
projects.pending_delete = false
) AND
sbom_occurrences.component_version_id IS NOT NULL
ORDER BY
sbom_occurrences.component_version_id DESC,
project_count DESC,
sbom_occurrences.id DESC
)
SELECT
id,
created_at,
updated_at,
component_version_id,
project_id,
pipeline_id,
source_id,
commit_sha,
component_id,
uuid,
package_manager,
component_name,
input_file_path,
licenses,
highest_severity,
vulnerability_count,
source_package_id,
occurrence_count,
project_count
FROM
our_occurrences AS sbom_occurrences
WHERE
( licenses #> '{0,spdx_identifier}' ) ?| ARRAY[ 'Apache-2.0' ] OR
( licenses #> '{1,spdx_identifier}' ) ?| ARRAY[ 'Apache-2.0' ] OR
licenses = '[]'
ORDER BY
id ASC;
https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/25787/commands/81284
MR acceptance checklist
Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Screenshots or screen recordings
Screenshots are required for UI changes, and strongly recommended for all other merge requests.
Before | After |
---|---|
How to set up and validate locally
Numbered steps to set up and validate the change are strongly suggested.
- Find or create a group with unknown licenses in the dependency list
- Go to
<gdk_url>/groups/<group_path>/-/dependencies.json?licenses[]=unknown
- Licenses should appear in the list