Skip to content

Add present on default branch to project_id, id index

What does this MR do and why?

Describe in detail what your merge request does and why.

This Merge Request adds the present_on_default_branch attribute to the project_id, id index on the vulnerabilities table.

The present_on_default_branch field will be used on most queries, and the vulnerabilities indices should be rearranged.

More context can be found on this comment and this issue

This Merge Request is related to the issue #368323 (closed)

The original index index_vulnerabilities_on_project_id_and_id was added on MR !79171 (merged) to speed up the SQL queries generated by the EachBatch module.

Migration Up

main: == 20220614185644 UpdateIndexVulnerabilitiesProjectIdId: migrating ============
main: -- transaction_open?()
main:    -> 0.0000s
main: -- index_exists?(:vulnerabilities, [:project_id, :id], {:where=>"present_on_default_branch IS TRUE", :name=>"index_vulnerabilities_project_id_and_id_on_default_branch", :algorithm=>:concurrently})
main:    -> 0.0152s
main: -- execute("SET statement_timeout TO 0")
main:    -> 0.0004s
main: -- add_index(:vulnerabilities, [:project_id, :id], {:where=>"present_on_default_branch IS TRUE", :name=>"index_vulnerabilities_project_id_and_id_on_default_branch", :algorithm=>:concurrently})
main:    -> 0.0047s
main: -- execute("RESET statement_timeout")
main:    -> 0.0003s
main: -- transaction_open?()
main:    -> 0.0000s
main: -- indexes(:vulnerabilities)
main:    -> 0.0112s
main: -- remove_index(:vulnerabilities, {:algorithm=>:concurrently, :name=>"index_vulnerabilities_on_project_id_and_id"})
main:    -> 0.0037s
main: == 20220614185644 UpdateIndexVulnerabilitiesProjectIdId: migrated (0.0486s) ===

Migration Down

main: == 20220614185644 UpdateIndexVulnerabilitiesProjectIdId: reverting ============
main: -- transaction_open?()
main:    -> 0.0000s
main: -- index_exists?(:vulnerabilities, [:project_id, :id], {:name=>"index_vulnerabilities_on_project_id_and_id", :algorithm=>:concurrently})
main:    -> 0.0149s
main: -- execute("SET statement_timeout TO 0")
main:    -> 0.0004s
main: -- add_index(:vulnerabilities, [:project_id, :id], {:name=>"index_vulnerabilities_on_project_id_and_id", :algorithm=>:concurrently})
main:    -> 0.0033s
main: -- execute("RESET statement_timeout")
main:    -> 0.0003s
main: -- transaction_open?()
main:    -> 0.0000s
main: -- indexes(:vulnerabilities)
main:    -> 0.0110s
main: -- remove_index(:vulnerabilities, {:algorithm=>:concurrently, :name=>"index_vulnerabilities_project_id_and_id_on_default_branch"})
main:    -> 0.0024s
main: == 20220614185644 UpdateIndexVulnerabilitiesProjectIdId: reverted (0.0419s) ===

Example queries used by the EachBatch module

Get the first vulnerability ID
SELECT
    "vulnerabilities"."id"
FROM
    "vulnerabilities"
WHERE
    "vulnerabilities"."project_id" = 278964
    AND "present_on_default_branch" = TRUE
ORDER BY
    "vulnerabilities"."id" ASC
LIMIT 1

Before the index;

https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/10590/commands/38068

 Limit  (cost=0.56..1.99 rows=1 width=8) (actual time=8.415..8.417 rows=1 loops=1)
   Buffers: shared hit=3 read=5
   I/O Timings: read=8.294 write=0.000
   ->  Index Scan using index_vulnerabilities_on_project_id_and_id on public.vulnerabilities  (cost=0.56..122769.31 rows=85891 width=8) (actual time=8.412..8.413 rows=1 loops=1)
         Index Cond: (vulnerabilities.project_id = 278964)
         Filter: vulnerabilities.present_on_default_branch
         Rows Removed by Filter: 0
         Buffers: shared hit=3 read=5
         I/O Timings: read=8.294 write=0.000

After the index;

Warm cache

https://explain.depesz.com/s/kuVY

  Limit  (cost=0.56..1.99 rows=1 width=8) (actual time=0.062..0.063 rows=1 loops=1)
   Buffers: shared hit=8
   I/O Timings: read=0.000 write=0.000
   ->  Index Scan using index_vulnerabilities_on_project_id_and_id on public.vulnerabilities  (cost=0.56..135706.96 rows=95371 width=8) (actual time=0.060..0.061 rows=1 loops=1)
         Index Cond: (vulnerabilities.project_id = 278964)
         Filter: vulnerabilities.present_on_default_branch
         Rows Removed by Filter: 0
         Buffers: shared hit=8
         I/O Timings: read=0.000 write=0.000
Time: 0.474 ms  
  - planning: 0.388 ms  
  - execution: 0.086 ms  
    - I/O read: 0.000 ms  
    - I/O write: 0.000 ms  
  
Shared buffers:  
  - hits: 8 (~64.00 KiB) from the buffer pool  
  - reads: 0 from the OS file cache, including disk I/O  
  - dirtied: 0  
  - writes: 0  
Get the IDs of the next 1000 vulnerabilities
SELECT
    "vulnerabilities"."id"
FROM
    "vulnerabilities"
WHERE
    "vulnerabilities"."project_id" = 278964
    AND "vulnerabilities"."id" >= 1
    AND "present_on_default_branch" = TRUE
ORDER BY
    "vulnerabilities"."id" ASC
LIMIT 1 OFFSET 1000

Before the index;

https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/11321/commands/40473

 Limit  (cost=1432.42..1433.85 rows=1 width=8) (actual time=242.930..242.932 rows=1 loops=1)
   Buffers: shared hit=708 read=289 dirtied=13
   I/O Timings: read=235.818 write=0.000
   ->  Index Scan using index_vulnerabilities_on_project_id_and_id on public.vulnerabilities  (cost=0.56..122984.04 rows=85891 width=8) (actual time=0.068..242.777 rows=1001 loops=1)
         Index Cond: ((vulnerabilities.project_id = 278964) AND (vulnerabilities.id >= 1))
         Filter: vulnerabilities.present_on_default_branch
         Rows Removed by Filter: 0
         Buffers: shared hit=708 read=289 dirtied=13
         I/O Timings: read=235.818 write=0.000

After the index;

Cold cache

https://explain.depesz.com/s/Dbn3

Limit  (cost=1426.00..1427.42 rows=1 width=8) (actual time=177.755..177.757 rows=1 loops=1)
   Buffers: shared hit=403 read=474 dirtied=62
   I/O Timings: read=171.751 write=0.000
   ->  Index Scan using index_vulnerabilities_on_project_id_and_id on public.vulnerabilities  (cost=0.56..135945.39 rows=95371 width=8) (actual time=98.822..177.630 rows=1001 loops=1)
         Index Cond: ((vulnerabilities.project_id = 278964) AND (vulnerabilities.id >= 1))
         Filter: vulnerabilities.present_on_default_branch
         Rows Removed by Filter: 0
         Buffers: shared hit=403 read=474 dirtied=62
         I/O Timings: read=171.751 write=0.000
  Time: 180.587 ms  
  - planning: 2.786 ms  
  - execution: 177.801 ms  
    - I/O read: 171.751 ms  
    - I/O write: 0.000 ms  
  
Shared buffers:  
  - hits: 403 (~3.10 MiB) from the buffer pool  
  - reads: 474 (~3.70 MiB) from the OS file cache, including disk I/O  
  - dirtied: 62 (~496.00 KiB)  
  - writes: 0  
  

Warm cache

https://explain.depesz.com/s/7ChZ

 Limit  (cost=1426.00..1427.42 rows=1 width=8) (actual time=1.455..1.456 rows=1 loops=1)
   Buffers: shared hit=870
   I/O Timings: read=0.000 write=0.000
   ->  Index Scan using index_vulnerabilities_on_project_id_and_id on public.vulnerabilities  (cost=0.56..135945.39 rows=95371 width=8) (actual time=0.057..1.383 rows=1001 loops=1)
         Index Cond: ((vulnerabilities.project_id = 278964) AND (vulnerabilities.id >= 1))
         Filter: vulnerabilities.present_on_default_branch
         Rows Removed by Filter: 0
         Buffers: shared hit=870
         I/O Timings: read=0.000 write=0.000
Time: 1.931 ms  
  - planning: 0.450 ms  
  - execution: 1.481 ms  
    - I/O read: 0.000 ms  
    - I/O write: 0.000 ms  
  
Shared buffers:  
  - hits: 870 (~6.80 MiB) from the buffer pool  
  - reads: 0 from the OS file cache, including disk I/O  
  - dirtied: 0  
  - writes: 0  

Screenshots or screen recordings

These are strongly recommended to assist reviewers and reduce the time to merge your change.

How to set up and validate locally

Numbered steps to set up and validate the change are strongly suggested.

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Marcos Rocha

Merge request reports

Loading