Add partition number to security findings query
What does this MR do and why?
Add partition number to security findings query
All records for a particular pipeline are guaranteed to be on the same
partition in the security_findings
table so we can add the partition
number as a parameter when querying the table for a pipeline.
# EE::Ci::Pipeline
# We want all the `security_findings` records for a particular pipeline to be stored in
# the same partition, therefore, we check if the pipeline already has a `security_scan`.
#
# - If it has, then we use the partition number of the existing security_scan to make sure
# that the new `security_findings` will be stored in the same partition with the existing ones.
# - If it does not have a security_scan yet, then we can basically use the latest partition
# of the `security_findings` table.
def security_findings_partition_number
@security_findings_partition_number ||= security_scans.first&.findings_partition_number || Security::Finding.active_partition_number
end
This change adds:
... FROM security_findings WHERE security_findings.partition_number = n
to the query in Security::FindingsFinder
.
As a result of this the query only needs to visit one partition. Without this clause all active partitions are scanner - 12 at the time of writing.
Note: This change does not drastically change the buffer read or
time performance of the query. But it does dramatically reduce the number of locks taken. I have not listed them here,
they can be seen through the postgres.ai links, but the number of AccessShareLock
s on the query
drops from 168 before to 60 after.
Plan before
explain SELECT
"security_findings".*
FROM
"security_scans",
unnest('{1,2,4,5,6,7}'::smallint[]) AS "severities" ("severity"),
LATERAL (
SELECT
"security_findings".*
FROM
"security_findings"
WHERE ("security_findings"."scan_id" = "security_scans"."id")
AND ("security_findings"."severity" = "severities"."severity")
AND "security_findings"."deduplicated" = TRUE
AND "security_findings"."confidence" IN (1, 2, 3, 4, 5, 6, 7)
AND (NOT EXISTS (
SELECT
1
FROM
"vulnerabilities"
INNER JOIN "vulnerability_occurrences" ON "vulnerability_occurrences"."vulnerability_id" = "vulnerabilities"."id"
WHERE
"vulnerabilities"."state" = 2
AND (vulnerability_occurrences.uuid::uuid = security_findings.uuid::uuid)))
ORDER BY
"security_findings"."severity" DESC,
"security_findings"."id" ASC
LIMIT 20) AS "security_findings"
WHERE
"security_scans"."pipeline_id" = 967613589
AND "security_scans"."latest" = TRUE
AND "security_scans"."status" = 1
AND "security_scans"."scan_type" IN (1, 2, 3, 4, 5, 6, 7)
ORDER BY
"security_findings"."severity" DESC,
"security_findings"."id" ASC
LIMIT 20 OFFSET 0
Limit (cost=1498.31..1498.36 rows=20 width=1153) (actual time=26.516..26.523 rows=20 loops=1)
Buffers: shared hit=9160
I/O Timings: read=0.000 write=0.000
-> Sort (cost=1498.31..1498.61 rows=120 width=1153) (actual time=26.515..26.520 rows=20 loops=1)
Sort Key: security_findings.severity DESC, security_findings.id
Sort Method: top-N heapsort Memory: 66kB
Buffers: shared hit=9160
I/O Timings: read=0.000 write=0.000
-> Nested Loop (cost=9.32..1495.12 rows=120 width=1153) (actual time=22.880..26.373 rows=80 loops=1)
Buffers: shared hit=9157
I/O Timings: read=0.000 write=0.000
-> Index Scan using index_security_scans_on_pipeline_id_and_scan_type on public.security_scans (cost=0.57..7.63 rows=1 width=8) (actual time=0.026..0.028 rows=1 loops=1)
Index Cond: (security_scans.pipeline_id = 967613589)
Filter: (security_scans.latest AND (security_scans.status = 1) AND (security_scans.scan_type = ANY ('{1,2,3,4,5,6,7}'::integer[])))
Rows Removed by Filter: 0
Buffers: shared hit=8
I/O Timings: read=0.000 write=0.000
-> Nested Loop (cost=8.75..1486.28 rows=120 width=1153) (actual time=22.851..26.323 rows=80 loops=1)
Buffers: shared hit=9149
I/O Timings: read=0.000 write=0.000
-> Function Scan on unnest severities (cost=0.00..0.06 rows=6 width=2) (actual time=0.009..0.013 rows=6 loops=1)
I/O Timings: read=0.000 write=0.000
-> Limit (cost=8.75..247.30 rows=20 width=1153) (actual time=4.194..4.381 rows=13 loops=6)
Buffers: shared hit=9149
I/O Timings: read=0.000 write=0.000
-> Nested Loop Anti Join (cost=8.75..26822.08 rows=2248 width=1153) (actual time=4.193..4.378 rows=13 loops=6)
Buffers: shared hit=9149
I/O Timings: read=0.000 write=0.000
-> Merge Append (cost=7.62..15525.03 rows=2427 width=1153) (actual time=4.167..4.175 rows=13 loops=6)
Sort Key: security_findings.id
Buffers: shared hit=8399
I/O Timings: read=0.000 write=0.000
-> Index Scan using security_findings_69_scan_id_id_idx on gitlab_partitions_dynamic.security_findings_69 security_findings_1 (cost=0.57..1344.60 rows=212 width=1144) (actual time=0.007..0.007 rows=0 loops=6)
Index Cond: (security_findings_1.scan_id = security_scans.id)
Filter: (security_findings_1.deduplicated AND (security_findings_1.severity = severities.severity) AND (security_findings_1.confidence = ANY ('{1,2,3,4,5,6,7}'::integer[])))
Rows Removed by Filter: 0
Buffers: shared hit=24
I/O Timings: read=0.000 write=0.000
-> Index Scan using security_findings_70_scan_id_id_idx on gitlab_partitions_dynamic.security_findings_70 security_findings_2 (cost=0.57..1366.13 rows=217 width=1116) (actual time=0.005..0.005 rows=0 loops=6)
Index Cond: (security_findings_2.scan_id = security_scans.id)
Filter: (security_findings_2.deduplicated AND (security_findings_2.severity = severities.severity) AND (security_findings_2.confidence = ANY ('{1,2,3,4,5,6,7}'::integer[])))
Rows Removed by Filter: 0
Buffers: shared hit=24
I/O Timings: read=0.000 write=0.000
-> Index Scan using security_findings_71_scan_id_id_idx on gitlab_partitions_dynamic.security_findings_71 security_findings_3 (cost=0.57..1247.04 rows=198 width=1106) (actual time=0.007..0.007 rows=0 loops=6)
Index Cond: (security_findings_3.scan_id = security_scans.id)
Filter: (security_findings_3.deduplicated AND (security_findings_3.severity = severities.severity) AND (security_findings_3.confidence = ANY ('{1,2,3,4,5,6,7}'::integer[])))
Rows Removed by Filter: 0
Buffers: shared hit=24
I/O Timings: read=0.000 write=0.000
-> Index Scan using security_findings_72_scan_id_id_idx on gitlab_partitions_dynamic.security_findings_72 security_findings_4 (cost=0.57..1218.64 rows=192 width=1115) (actual time=4.088..4.094 rows=13 loops=6)
Index Cond: (security_findings_4.scan_id = security_scans.id)
Filter: (security_findings_4.deduplicated AND (security_findings_4.severity = severities.severity) AND (security_findings_4.confidence = ANY ('{1,2,3,4,5,6,7}'::integer[])))
Rows Removed by Filter: 9756
Buffers: shared hit=8111
I/O Timings: read=0.000 write=0.000
-> Index Scan using security_findings_73_scan_id_id_idx on gitlab_partitions_dynamic.security_findings_73 security_findings_5 (cost=0.57..1155.55 rows=182 width=1127) (actual time=0.007..0.007 rows=0 loops=6)
Index Cond: (security_findings_5.scan_id = security_scans.id)
Filter: (security_findings_5.deduplicated AND (security_findings_5.severity = severities.severity) AND (security_findings_5.confidence = ANY ('{1,2,3,4,5,6,7}'::integer[])))
Rows Removed by Filter: 0
Buffers: shared hit=24
I/O Timings: read=0.000 write=0.000
-> Index Scan using security_findings_74_scan_id_id_idx on gitlab_partitions_dynamic.security_findings_74 security_findings_6 (cost=0.57..1251.21 rows=196 width=1154) (actual time=0.005..0.005 rows=0 loops=6)
Index Cond: (security_findings_6.scan_id = security_scans.id)
Filter: (security_findings_6.deduplicated AND (security_findings_6.severity = severities.severity) AND (security_findings_6.confidence = ANY ('{1,2,3,4,5,6,7}'::integer[])))
Rows Removed by Filter: 0
Buffers: shared hit=24
I/O Timings: read=0.000 write=0.000
-> Index Scan using security_findings_75_scan_id_id_idx on gitlab_partitions_dynamic.security_findings_75 security_findings_7 (cost=0.57..1238.34 rows=193 width=1175) (actual time=0.005..0.005 rows=0 loops=6)
Index Cond: (security_findings_7.scan_id = security_scans.id)
Filter: (security_findings_7.deduplicated AND (security_findings_7.severity = severities.severity) AND (security_findings_7.confidence = ANY ('{1,2,3,4,5,6,7}'::integer[])))
Rows Removed by Filter: 0
Buffers: shared hit=24
I/O Timings: read=0.000 write=0.000
-> Index Scan using security_findings_76_scan_id_id_idx on gitlab_partitions_dynamic.security_findings_76 security_findings_8 (cost=0.57..1294.98 rows=202 width=1174) (actual time=0.007..0.007 rows=0 loops=6)
Index Cond: (security_findings_8.scan_id = security_scans.id)
Filter: (security_findings_8.deduplicated AND (security_findings_8.severity = severities.severity) AND (security_findings_8.confidence = ANY ('{1,2,3,4,5,6,7}'::integer[])))
Rows Removed by Filter: 0
Buffers: shared hit=24
I/O Timings: read=0.000 write=0.000
-> Index Scan using security_findings_77_scan_id_id_idx on gitlab_partitions_dynamic.security_findings_77 security_findings_9 (cost=0.57..1230.44 rows=191 width=1190) (actual time=0.006..0.006 rows=0 loops=6)
Index Cond: (security_findings_9.scan_id = security_scans.id)
Filter: (security_findings_9.deduplicated AND (security_findings_9.severity = severities.severity) AND (security_findings_9.confidence = ANY ('{1,2,3,4,5,6,7}'::integer[])))
Rows Removed by Filter: 0
Buffers: shared hit=24
I/O Timings: read=0.000 write=0.000
-> Index Scan using security_findings_78_scan_id_id_idx on gitlab_partitions_dynamic.security_findings_78 security_findings_10 (cost=0.57..1246.73 rows=194 width=1196) (actual time=0.006..0.006 rows=0 loops=6)
Index Cond: (security_findings_10.scan_id = security_scans.id)
Filter: (security_findings_10.deduplicated AND (security_findings_10.severity = severities.severity) AND (security_findings_10.confidence = ANY ('{1,2,3,4,5,6,7}'::integer[])))
Rows Removed by Filter: 0
Buffers: shared hit=24
I/O Timings: read=0.000 write=0.000
-> Index Scan using security_findings_79_scan_id_id_idx on gitlab_partitions_dynamic.security_findings_79 security_findings_11 (cost=0.57..1154.09 rows=179 width=1209) (actual time=0.004..0.004 rows=0 loops=6)
Index Cond: (security_findings_11.scan_id = security_scans.id)
Filter: (security_findings_11.deduplicated AND (security_findings_11.severity = severities.severity) AND (security_findings_11.confidence = ANY ('{1,2,3,4,5,6,7}'::integer[])))
Rows Removed by Filter: 0
Buffers: shared hit=24
I/O Timings: read=0.000 write=0.000
-> Index Scan using security_findings_80_scan_id_id_idx on gitlab_partitions_dynamic.security_findings_80 security_findings_12 (cost=0.57..1013.73 rows=161 width=1133) (actual time=0.006..0.006 rows=0 loops=6)
Index Cond: (security_findings_12.scan_id = security_scans.id)
Filter: (security_findings_12.deduplicated AND (security_findings_12.severity = severities.severity) AND (security_findings_12.confidence = ANY ('{1,2,3,4,5,6,7}'::integer[])))
Rows Removed by Filter: 0
Buffers: shared hit=24
I/O Timings: read=0.000 write=0.000
-> Index Scan using security_findings_81_scan_id_id_idx on gitlab_partitions_dynamic.security_findings_81 security_findings_13 (cost=0.56..706.29 rows=110 width=1154) (actual time=0.005..0.005 rows=0 loops=6)
Index Cond: (security_findings_13.scan_id = security_scans.id)
Filter: (security_findings_13.deduplicated AND (security_findings_13.severity = severities.severity) AND (security_findings_13.confidence = ANY ('{1,2,3,4,5,6,7}'::integer[])))
Rows Removed by Filter: 0
Buffers: shared hit=24
I/O Timings: read=0.000 write=0.000
-> Nested Loop (cost=1.14..4.64 rows=1 width=16) (actual time=0.015..0.015 rows=0 loops=80)
Buffers: shared hit=750
I/O Timings: read=0.000 write=0.000
-> Index Only Scan using index_vuln_findings_on_uuid_including_vuln_id_1 on public.vulnerability_occurrences (cost=0.57..2.11 rows=1 width=24) (actual time=0.009..0.009 rows=1 loops=80)
Index Cond: (vulnerability_occurrences.uuid = security_findings.uuid)
Heap Fetches: 2
Buffers: shared hit=350
I/O Timings: read=0.000 write=0.000
-> Index Scan using vulnerabilities_pkey on public.vulnerabilities (cost=0.57..2.53 rows=1 width=8) (actual time=0.005..0.005 rows=0 loops=80)
Index Cond: (vulnerabilities.id = vulnerability_occurrences.vulnerability_id)
Filter: (vulnerabilities.state = 2)
Rows Removed by Filter: 1
Buffers: shared hit=400
I/O Timings: read=0.000 write=0.000
Time: 60.193 ms
- planning: 33.267 ms
- execution: 26.926 ms
- I/O read: 0.000 ms
- I/O write: 0.000 ms
Shared buffers:
- hits: 9160 (~71.60 MiB) from the buffer pool
- reads: 0 from the OS file cache, including disk I/O
- dirtied: 0
- writes: 0
https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/23297/commands/74901
Plan after
SELECT
"security_findings".*
FROM
"security_scans",
unnest('{1,2,4,5,6,7}'::smallint[]) AS "severities" ("severity"),
LATERAL (
SELECT
"security_findings".*
FROM
"security_findings"
WHERE ("security_findings"."scan_id" = "security_scans"."id")
AND ("security_findings"."severity" = "severities"."severity")
AND "security_findings"."deduplicated" = TRUE
AND "security_findings"."partition_number" = 72
AND "security_findings"."confidence" IN (1, 2, 3, 4, 5, 6, 7)
AND (NOT EXISTS (
SELECT
1
FROM
"vulnerabilities"
INNER JOIN "vulnerability_occurrences" ON "vulnerability_occurrences"."vulnerability_id" = "vulnerabilities"."id"
WHERE
"vulnerabilities"."state" = 2
AND (vulnerability_occurrences.uuid::uuid = security_findings.uuid::uuid)))
ORDER BY
"security_findings"."severity" DESC,
"security_findings"."id" ASC
LIMIT 20) AS "security_findings"
WHERE
"security_scans"."pipeline_id" = 967613589
AND "security_scans"."latest" = TRUE
AND "security_scans"."status" = 1
AND "security_scans"."scan_type" IN (1, 2, 3, 4, 5, 6, 7)
ORDER BY
"security_findings"."severity" DESC,
"security_findings"."id" ASC
LIMIT 20 OFFSET 0
Limit (cost=1204.18..1204.23 rows=20 width=1115) (actual time=32.620..32.626 rows=20 loops=1)
Buffers: shared hit=8875
I/O Timings: read=0.000 write=0.000
-> Sort (cost=1204.18..1204.48 rows=120 width=1115) (actual time=32.618..32.622 rows=20 loops=1)
Sort Key: security_findings.severity DESC, security_findings.id
Sort Method: top-N heapsort Memory: 66kB
Buffers: shared hit=8875
I/O Timings: read=0.000 write=0.000
-> Nested Loop (cost=2.27..1200.99 rows=120 width=1115) (actual time=28.406..32.442 rows=80 loops=1)
Buffers: shared hit=8869
I/O Timings: read=0.000 write=0.000
-> Index Scan using index_security_scans_on_pipeline_id_and_scan_type on public.security_scans (cost=0.57..7.63 rows=1 width=8) (actual time=0.036..0.038 rows=1 loops=1)
Index Cond: (security_scans.pipeline_id = 967613589)
Filter: (security_scans.latest AND (security_scans.status = 1) AND (security_scans.scan_type = ANY ('{1,2,3,4,5,6,7}'::integer[])))
Rows Removed by Filter: 0
Buffers: shared hit=8
I/O Timings: read=0.000 write=0.000
-> Nested Loop (cost=1.71..1192.15 rows=120 width=1115) (actual time=28.366..32.379 rows=80 loops=1)
Buffers: shared hit=8861
I/O Timings: read=0.000 write=0.000
-> Function Scan on unnest severities (cost=0.00..0.06 rows=6 width=2) (actual time=0.029..0.036 rows=6 loops=1)
I/O Timings: read=0.000 write=0.000
-> Limit (cost=1.70..198.28 rows=20 width=1115) (actual time=5.182..5.384 rows=13 loops=6)
Buffers: shared hit=8861
I/O Timings: read=0.000 write=0.000
-> Nested Loop Anti Join (cost=1.70..1751.26 rows=178 width=1115) (actual time=5.181..5.381 rows=13 loops=6)
Buffers: shared hit=8861
I/O Timings: read=0.000 write=0.000
-> Index Scan using security_findings_72_scan_id_id_idx on gitlab_partitions_dynamic.security_findings_72 security_findings (cost=0.57..1221.60 rows=192 width=1115) (actual time=5.147..5.154 rows=13 loops=6)
Index Cond: (security_findings.scan_id = security_scans.id)
Filter: (security_findings.deduplicated AND (security_findings.severity = severities.severity) AND (security_findings.partition_number = 72) AND (security_findings.confidence = ANY ('{1,2,3,4,5,6,7}'::integer[])))
Rows Removed by Filter: 9756
Buffers: shared hit=8111
I/O Timings: read=0.000 write=0.000
-> Nested Loop (cost=1.14..2.75 rows=1 width=16) (actual time=0.016..0.016 rows=0 loops=80)
Buffers: shared hit=750
I/O Timings: read=0.000 write=0.000
-> Index Only Scan using index_vuln_findings_on_uuid_including_vuln_id_1 on public.vulnerability_occurrences (cost=0.57..2.12 rows=1 width=24) (actual time=0.010..0.010 rows=1 loops=80)
Index Cond: (vulnerability_occurrences.uuid = security_findings.uuid)
Heap Fetches: 2
Buffers: shared hit=350
I/O Timings: read=0.000 write=0.000
-> Index Scan using vulnerabilities_pkey on public.vulnerabilities (cost=0.57..0.63 rows=1 width=8) (actual time=0.006..0.006 rows=0 loops=80)
Index Cond: (vulnerabilities.id = vulnerability_occurrences.vulnerability_id)
Filter: (vulnerabilities.state = 2)
Rows Removed by Filter: 1
Buffers: shared hit=400
I/O Timings: read=0.000 write=0.000
Time: 46.054 ms
- planning: 13.200 ms
- execution: 32.854 ms
- I/O read: 0.000 ms
- I/O write: 0.000 ms
Shared buffers:
- hits: 8875 (~69.30 MiB) from the buffer pool
- reads: 0 from the OS file cache, including disk I/O
- dirtied: 0
- writes: 0
https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/23297/commands/74902
Related to #381302