Add compound index for `vulnerabilities` table on `project_id` and `id`
What does this MR do and why?
Related to #350915 (closed), and !78960 (merged).
Database review
This MR introduces a new compound index for the vulnerabilities
table on project_id
and id
columns to speed up the SQL queries generated by the EachBatch
module.
rake db:migrate:up
== 20220125230538 AddCompoundIndexOnProjectIdAndIdForVulnerabilities: migrating
-- transaction_open?()
-> 0.0000s
-- index_exists?(:vulnerabilities, [:project_id, :id], {:name=>"index_vulnerabilities_on_project_id_and_id", :algorithm=>:concurrently})
-> 0.0057s
-- add_index(:vulnerabilities, [:project_id, :id], {:name=>"index_vulnerabilities_on_project_id_and_id", :algorithm=>:concurrently})
-> 0.0047s
== 20220125230538 AddCompoundIndexOnProjectIdAndIdForVulnerabilities: migrated (0.0116s)
rake db:migrate:down
== 20220125230538 AddCompoundIndexOnProjectIdAndIdForVulnerabilities: reverting
-- transaction_open?()
-> 0.0000s
-- index_exists?(:vulnerabilities, [:project_id, :id], {:name=>"index_vulnerabilities_on_project_id_and_id", :algorithm=>:concurrently})
-> 0.0064s
-- execute("SET statement_timeout TO 0")
-> 0.0005s
-- remove_index(:vulnerabilities, {:name=>"index_vulnerabilities_on_project_id_and_id", :algorithm=>:concurrently, :column=>[:project_id, :id]})
-> 0.0080s
-- execute("RESET statement_timeout")
-> 0.0005s
== 20220125230538 AddCompoundIndexOnProjectIdAndIdForVulnerabilities: reverted (0.0188s)
Example queries used by the EachBatch module
Get the first vulnerability ID
SELECT
"vulnerabilities"."id"
FROM
"vulnerabilities"
WHERE
"vulnerabilities"."project_id" = 278964
ORDER BY
"vulnerabilities"."id" ASC
LIMIT 1
Before the index;
https://gitlab.slack.com/archives/CLJMDRD8C/p1643153060294400
Limit (cost=0.44..29.35 rows=1 width=8) (actual time=4.948..4.950 rows=1 loops=1)
Buffers: shared read=4
I/O Timings: read=4.866 write=0.000
-> Index Scan using vulnerabilities_pkey on public.vulnerabilities (cost=0.44..2381387.85 rows=82370 width=8) (actual time=4.945..4.946 rows=1 loops=1)
Filter: (vulnerabilities.project_id = 278964)
Rows Removed by Filter: 0
Buffers: shared read=4
I/O Timings: read=4.866 write=0.000
After the index;
https://gitlab.slack.com/archives/CLJMDRD8C/p1643153457300500
Limit (cost=0.56..0.67 rows=1 width=8) (actual time=0.188..0.189 rows=1 loops=1)
Buffers: shared hit=4 read=4
I/O Timings: read=0.071 write=0.000
-> Index Only Scan using index_vulnerabilities_on_project_id_and_id on public.vulnerabilities (cost=0.56..8741.89 rows=82009 width=8) (actual time=0.187..0.187 rows=1 loops=1)
Index Cond: (vulnerabilities.project_id = 278964)
Heap Fetches: 0
Buffers: shared hit=4 read=4
I/O Timings: read=0.071 write=0.000
Get the IDs of the next 1000 vulnerabilities
SELECT
"vulnerabilities"."id"
FROM
"vulnerabilities"
WHERE
"vulnerabilities"."project_id" = 278964
AND "vulnerabilities"."id" >= 1
ORDER BY
"vulnerabilities"."id" ASC
LIMIT 1 OFFSET 1000
Before the index;
https://gitlab.slack.com/archives/CLJMDRD8C/p1643153157296900
Limit (cost=28288.53..28315.81 rows=1 width=8) (actual time=104755.558..105109.059 rows=1 loops=1)
Buffers: shared hit=1729611 read=379726 dirtied=15139 written=385
I/O Timings: read=259374.607 write=18.595
-> Gather Merge (cost=1000.46..2248718.36 rows=82370 width=8) (actual time=25.852..105108.855 rows=1001 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=1729611 read=379726 dirtied=15139 written=385
I/O Timings: read=259374.607 write=18.595
-> Parallel Index Scan using vulnerabilities_pkey on public.vulnerabilities (cost=0.44..2238210.80 rows=34321 width=8) (actual time=7.356..88541.727 rows=336 loops=3)
Index Cond: (vulnerabilities.id >= 1)
Filter: (vulnerabilities.project_id = 278964)
Rows Removed by Filter: 980598
Buffers: shared hit=1729611 read=379726 dirtied=15139 written=385
I/O Timings: read=259374.607 write=18.595
After the index;
https://gitlab.slack.com/archives/CLJMDRD8C/p1643153483303000
Limit (cost=109.65..109.76 rows=1 width=8) (actual time=0.787..0.788 rows=1 loops=1)
Buffers: shared hit=205 read=8
I/O Timings: read=0.188 write=0.000
-> Index Only Scan using index_vulnerabilities_on_project_id_and_id on public.vulnerabilities (cost=0.56..8946.91 rows=82009 width=8) (actual time=0.063..0.732 rows=1001 loops=1)
Index Cond: ((vulnerabilities.project_id = 278964) AND (vulnerabilities.id >= 1))
Heap Fetches: 32
Buffers: shared hit=205 read=8
I/O Timings: read=0.188 write=0.000
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
-
I have evaluated the MR acceptance checklist for this MR.
Edited by Mehmet Emin INAC