Improve delete_software_license_policies query
What does this MR do and why?
This MR improves the performance of delete query to include the scan_result_policy_reads
of the concerned project only to reduce the number of IDs filtered. This MR also changes the query to sub-query instead of IN
query.
This is a corrective action of Incident: gitlab-com/gl-infra/production#17168 (closed)
The security_orchestration_policy_configuration_id
that caused the slow query is scoped to a namespace that has 54706 project. And each project has only 1 rows of scan_result_policies
but since we do policy_configuration. scan_result_policy_reads.pluck(:id)
it selects 54701
IDs.
More context on the data: gitlab-com/gl-infra/production#17168 (comment 1663027970)
Database
Before
SELECT
"scan_result_policies"."id"
FROM
"scan_result_policies"
WHERE
"scan_result_policies"."security_orchestration_policy_configuration_id" = 1029627
SELECT
"software_license_policies".*
FROM
"software_license_policies"
WHERE
"software_license_policies"."project_id" = 24583235
AND "software_license_policies"."scan_result_policy_id" IN (10324607, 10326279, 10324491, 10326623, 10324468, 10326230, 10324484, 10326628, 10324495, 10326710, 10324552, 10326753, 10324563, 10326804, 10324572, 10326875, 10324586, 10326412, 10324744, 10326386, 10324627, 10326702, 10324675, 10326810, 10324550, 10327045, 10324597, 10326937, 10324913, 10327059, 10324493, 10326617, 10324669, 10327301, 10324615, 10326282, 10324490, 10326621, 10324472, 10326246, 10324469, 10326235, 10324467, 10326553, 10324588, 10326624, 10324473, 10326227, 10324482, 10326228, 10324537, 10327189, 10324505, 10326631, 10324576, 10327187, 10324585, 10326312, 10324489, 10326604, 10324511, 10326644)
After
SELECT
"software_license_policies".*
FROM
"software_license_policies"
WHERE
"software_license_policies"."project_id" = 43110842
AND "software_license_policies"."scan_result_policy_id" IN (
SELECT
"scan_result_policies"."id"
FROM
"scan_result_policies"
WHERE
"scan_result_policies"."security_orchestration_policy_configuration_id" = 4261
AND "scan_result_policies"."project_id" = 43110842
Query Plan : https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/24184/commands/77181
Nested Loop (cost=0.85..18.21 rows=1 width=40) (actual time=16.599..266.054 rows=456 loops=1)
Buffers: shared hit=149 read=309 dirtied=185
I/O Timings: read=257.544 write=0.000
-> Index Scan using index_scan_result_policies_on_position_in_configuration on public.scan_result_policies (cost=0.43..3.45 rows=1 width=8) (actual time=8.047..8.049 rows=1 loops=1)
Index Cond: ((scan_result_policies.security_orchestration_policy_configuration_id = 4261) AND (scan_result_policies.project_id = 43110842))
Buffers: shared hit=4 read=3
I/O Timings: read=7.941 write=0.000
-> Index Scan using idx_software_license_policies_unique_on_project_and_scan_policy on public.software_license_policies (cost=0.42..14.76 rows=1 width=40) (actual time=8.544..257.659 rows=456 loops=1)
Index Cond: ((software_license_policies.project_id = 43110842) AND (software_license_policies.scan_result_policy_id = scan_result_policies.id))
Buffers: shared hit=145 read=306 dirtied=185
I/O Timings: read=249.603 write=0.000
Time: 268.676 ms
- planning: 2.321 ms
- execution: 266.355 ms
- I/O read: 257.544 ms
- I/O write: 0.000 ms
Shared buffers:
- hits: 149 (~1.20 MiB) from the buffer pool
- reads: 309 (~2.40 MiB) from the OS file cache, including disk I/O
- dirtied: 185 (~1.40 MiB)
- writes: 0
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
-
I have evaluated the MR acceptance checklist for this MR.