Rewrite group sbom_occurrences query
What does this MR do and why?
Do aggregations for only 25 occurrences at a time for improved performance. This prevents us from needing to scan all sbom_occurrences in the group.
Relates to: #437651 (closed)
💾 Database
https://console.postgres.ai/gitlab/gitlab-production-main/sessions/26864/commands/83792
WITH "namespaces" AS MATERIALIZED (
SELECT
"namespaces"."traversal_ids"
FROM
"namespaces"
WHERE
"namespaces"."type" = 'Group'
AND (traversal_ids >= ('{9970}'))
AND (traversal_ids < ('{9971}'))
)
SELECT
MIN(outer_occurrences.id) AS id,
"outer_occurrences"."component_id",
"outer_occurrences"."component_version_id",
SUM(counts.occurrence_count) AS occurrence_count,
SUM(counts.project_count) AS project_count
FROM ( SELECT DISTINCT ON ("inner_occurrences"."component_id", "inner_occurrences"."component_version_id")
"inner_occurrences"."id",
"inner_occurrences"."created_at",
"inner_occurrences"."updated_at",
"inner_occurrences"."component_version_id",
"inner_occurrences"."project_id",
"inner_occurrences"."pipeline_id",
"inner_occurrences"."source_id",
"inner_occurrences"."commit_sha",
"inner_occurrences"."component_id",
"inner_occurrences"."uuid",
"inner_occurrences"."package_manager",
"inner_occurrences"."component_name",
"inner_occurrences"."input_file_path",
"inner_occurrences"."licenses",
"inner_occurrences"."highest_severity",
"inner_occurrences"."vulnerability_count",
"inner_occurrences"."source_package_id",
"inner_occurrences"."archived",
"inner_occurrences"."traversal_ids",
"inner_occurrences"."ancestors"
FROM
namespaces,
LATERAL ( SELECT DISTINCT ON ("sbom_occurrences"."component_id", "sbom_occurrences"."component_version_id")
"sbom_occurrences"."id",
"sbom_occurrences"."created_at",
"sbom_occurrences"."updated_at",
"sbom_occurrences"."component_version_id",
"sbom_occurrences"."project_id",
"sbom_occurrences"."pipeline_id",
"sbom_occurrences"."source_id",
"sbom_occurrences"."commit_sha",
"sbom_occurrences"."component_id",
"sbom_occurrences"."uuid",
"sbom_occurrences"."package_manager",
"sbom_occurrences"."component_name",
"sbom_occurrences"."input_file_path",
"sbom_occurrences"."licenses",
"sbom_occurrences"."highest_severity",
"sbom_occurrences"."vulnerability_count",
"sbom_occurrences"."source_package_id",
"sbom_occurrences"."archived",
"sbom_occurrences"."traversal_ids",
"sbom_occurrences"."ancestors"
FROM
"sbom_occurrences"
WHERE (sbom_occurrences.traversal_ids = namespaces.traversal_ids::bigint[])
ORDER BY
"sbom_occurrences"."component_id" ASC,
"sbom_occurrences"."component_version_id" ASC
LIMIT 25) inner_occurrences
ORDER BY
inner_occurrences.component_id ASC,
inner_occurrences.component_version_id ASC
LIMIT 25) outer_occurrences,
LATERAL (
SELECT
COUNT(project_id) AS occurrence_count,
COUNT(DISTINCT project_id) project_count
FROM
"sbom_occurrences"
WHERE (traversal_ids >= ('{9970}'))
AND (traversal_ids < ('{9971}'))
AND (sbom_occurrences.component_version_id = outer_occurrences.component_version_id)) counts
GROUP BY
"outer_occurrences"."component_id",
"outer_occurrences"."component_version_id"
ORDER BY
MIN(outer_occurrences.component_id) ASC,
MIN(outer_occurrences.component_version_id) ASC;
Sort (cost=127.19..127.26 rows=25 width=104) (actual time=569.300..569.307 rows=25 loops=1)
Sort Key: (min(sbom_occurrences.component_id)), (min(sbom_occurrences.component_version_id))
Sort Method: quicksort Memory: 28kB
Buffers: shared hit=4146 read=1232
I/O Timings: read=542.983 write=0.000
CTE namespaces
-> Index Only Scan using index_namespaces_on_traversal_ids_for_groups_btree on public.namespaces namespaces_1 (cost=0.56..3.58 rows=1 width=28) (actual time=13.221..396.816 rows=841 loops=1)
Index Cond: ((namespaces_1.traversal_ids >= '{9970}'::integer[]) AND (namespaces_1.traversal_ids < '{9971}'::integer[]))
Heap Fetches: 122
Buffers: shared hit=517 read=148
I/O Timings: read=393.504 write=0.000
-> Aggregate (cost=34.88..123.04 rows=25 width=104) (actual time=565.239..569.211 rows=25 loops=1)
Group Key: sbom_occurrences.component_id, sbom_occurrences.component_version_id
Buffers: shared hit=4143 read=1232
I/O Timings: read=542.983 write=0.000
-> Nested Loop (cost=34.88..122.22 rows=25 width=40) (actual time=564.938..569.159 rows=25 loops=1)
Buffers: shared hit=4143 read=1232
I/O Timings: read=542.983 write=0.000
-> Limit (cost=31.29..31.47 rows=25 width=319) (actual time=564.560..564.591 rows=25 loops=1)
Buffers: shared hit=4024 read=1171
I/O Timings: read=539.142 write=0.000
-> Unique (cost=31.29..31.47 rows=25 width=319) (actual time=564.557..564.583 rows=25 loops=1)
Buffers: shared hit=4024 read=1171
I/O Timings: read=539.142 write=0.000
-> Sort (cost=31.29..31.35 rows=25 width=319) (actual time=564.555..564.566 rows=44 loops=1)
Sort Key: sbom_occurrences.component_id, sbom_occurrences.component_version_id
Sort Method: quicksort Memory: 93kB
Buffers: shared hit=4024 read=1171
I/O Timings: read=539.142 write=0.000
-> Nested Loop (cost=0.59..30.71 rows=25 width=319) (actual time=25.832..563.489 rows=878 loops=1)
Buffers: shared hit=4024 read=1171
I/O Timings: read=539.142 write=0.000
-> CTE Scan on namespaces (cost=0.00..0.02 rows=1 width=32) (actual time=13.226..397.849 rows=841 loops=1)
Buffers: shared hit=517 read=148
I/O Timings: read=393.504 write=0.000
-> Limit (cost=0.59..30.19 rows=25 width=319) (actual time=0.110..0.195 rows=1 loops=841)
Buffers: shared hit=3507 read=1023
I/O Timings: read=145.638 write=0.000
-> Unique (cost=0.59..6164.94 rows=5207 width=319) (actual time=0.109..0.195 rows=1 loops=841)
Buffers: shared hit=3507 read=1023
I/O Timings: read=145.638 write=0.000
-> Index Scan using index_sbom_occurrences_for_aggregations on public.sbom_occurrences (cost=0.59..6138.88 rows=5212 width=319) (actual time=0.105..0.189 rows=1 loops=841)
Index Cond: (sbom_occurrences.traversal_ids = (namespaces.traversal_ids)::bigint[])
Buffers: shared hit=3507 read=1023
I/O Timings: read=145.638 write=0.000
-> Aggregate (cost=3.59..3.60 rows=1 width=16) (actual time=0.181..0.181 rows=1 loops=25)
Buffers: shared hit=119 read=61
I/O Timings: read=3.841 write=0.000
-> Index Scan using idx_sbom_occurrences_on_component_version_id_and_traversal_ids on public.sbom_occurrences sbom_occurrences_1 (cost=0.56..3.58 rows=1 width=8) (actual time=0.169..0.175 rows=3 loops=25)
Index Cond: ((sbom_occurrences_1.component_version_id = sbom_occurrences.component_version_id) AND (sbom_occurrences_1.traversal_ids >= '{9970}'::bigint[]) AND (sbom_occurrences_1.traversal_ids < '{9971}'::bigint[]))
Buffers: shared hit=119 read=61
I/O Timings: read=3.841 write=0.000
Time: 1.255 s
- planning: 683.960 ms
- execution: 570.939 ms
- I/O read: 542.983 ms
- I/O write: 0.000 ms
Shared buffers:
- hits: 4146 (~32.40 MiB) from the buffer pool
- reads: 1232 (~9.60 MiB) from the OS file cache, including disk I/O
- dirtied: 0
- writes: 0
MR acceptance checklist
Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Screenshots or screen recordings
Screenshots are required for UI changes, and strongly recommended for all other merge requests.
Before | After |
---|---|
How to set up and validate locally
Numbered steps to set up and validate the change are strongly suggested.
Edited by Brian Williams