Skip to content

Improve notes query performance in keep-around orphan raketask

James Fargher requested to merge improve_keep_around_queries into master

What does this MR do and why?

#475246

The notes query had been timing out. So here we modify the query in order to better make use of the indexes available on this table.

root@console-01-sv-gstg.c.gitlab-staging-1.internal:/tmp# gitlab-rake gitlab:keep_around:orphaned FILENAME=/tmp/orphaned.csv PROJECT_PATH=gitlab-org/gitlab
I, [2024-09-05T13:33:19.831976 #3092351]  INFO -- : Finding keep-around references...
I, [2024-09-05T13:33:24.206348 #3092351]  INFO -- : Found 131640 keep-around references
I, [2024-09-05T13:33:24.206475 #3092351]  INFO -- : Checking pipeline shas...
I, [2024-09-05T13:33:54.699204 #3092351]  INFO -- : Checking merge request shas...
I, [2024-09-05T13:33:54.929971 #3092351]  INFO -- : Checking merge request diff shas...
I, [2024-09-05T13:34:02.066463 #3092351]  INFO -- : Checking diff note shas...
rake aborted!
ActiveRecord::QueryCanceled: PG::QueryCanceled: ERROR:  canceling statement due to statement timeout
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:111:in `public_send'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:111:in `block in read_using_load_balancer'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:65:in `read'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:110:in `read_using_load_balancer'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:48:in `select_all'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:127:in `public_send'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:127:in `block in write_using_load_balancer'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:141:in `block in read_write'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:228:in `retry_with_backoff'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:130:in `read_write'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:126:in `write_using_load_balancer'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:96:in `method_missing'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:127:in `public_send'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:127:in `block in write_using_load_balancer'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:141:in `block in read_write'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:228:in `retry_with_backoff'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:130:in `read_write'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:126:in `write_using_load_balancer'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:96:in `method_missing'
/opt/gitlab/embedded/service/gitlab-rails/gems/activerecord-gitlab/lib/active_record/gitlab_patches/rescue_from.rb:31:in `exec_queries'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/keep_around.rake:86:in `add_diff_note_shas'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/keep_around.rake:34:in `block (4 levels) in <top (required)>'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/keep_around.rake:130:in `block in create_csv'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/keep_around.rake:129:in `open'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/keep_around.rake:129:in `create_csv'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/keep_around.rake:16:in `block (3 levels) in <top (required)>'
/opt/gitlab/embedded/bin/bundle:25:in `load'
/opt/gitlab/embedded/bin/bundle:25:in `<main>'

Caused by:
PG::QueryCanceled: ERROR:  canceling statement due to statement timeout
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:111:in `public_send'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:111:in `block in read_using_load_balancer'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:65:in `read'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:110:in `read_using_load_balancer'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:48:in `select_all'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:127:in `public_send'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:127:in `block in write_using_load_balancer'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:141:in `block in read_write'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:228:in `retry_with_backoff'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:130:in `read_write'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:126:in `write_using_load_balancer'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:96:in `method_missing'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:127:in `public_send'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:127:in `block in write_using_load_balancer'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:141:in `block in read_write'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:228:in `retry_with_backoff'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:130:in `read_write'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:126:in `write_using_load_balancer'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:96:in `method_missing'
/opt/gitlab/embedded/service/gitlab-rails/gems/activerecord-gitlab/lib/active_record/gitlab_patches/rescue_from.rb:31:in `exec_queries'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/keep_around.rake:86:in `add_diff_note_shas'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/keep_around.rake:34:in `block (4 levels) in <top (required)>'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/keep_around.rake:130:in `block in create_csv'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/keep_around.rake:129:in `open'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/keep_around.rake:129:in `create_csv'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/keep_around.rake:16:in `block (3 levels) in <top (required)>'
/opt/gitlab/embedded/bin/bundle:25:in `load'
/opt/gitlab/embedded/bin/bundle:25:in `<main>'
Tasks: TOP => gitlab:keep_around:orphaned
(See full trace by running task with --trace)

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

How to set up and validate locally

  1. Run the raketask (updating the project path to match a project in your GDK):
    $ bundle exec rake gitlab:keep_around:orphaned FILENAME=orphaned.csv PROJECT_PATH=root/cool-project
    I, [2024-09-09T10:49:31.940291 #968726]  INFO -- : Finding keep-around references...
    I, [2024-09-09T10:49:31.975329 #968726]  INFO -- : Found 7 keep-around references
    I, [2024-09-09T10:49:31.975384 #968726]  INFO -- : Checking pipeline shas...
    I, [2024-09-09T10:49:32.031617 #968726]  INFO -- : Checking merge request shas...
    I, [2024-09-09T10:49:32.051109 #968726]  INFO -- : Checking merge request diff shas...
    I, [2024-09-09T10:49:32.080371 #968726]  INFO -- : Checking note shas...
    W, [2024-09-09T10:49:32.080438 #968726]  WARN -- : System notes will not be included.
    I, [2024-09-09T10:49:32.127244 #968726]  INFO -- : Checking sent notification shas...
    I, [2024-09-09T10:49:32.135469 #968726]  INFO -- : Checking todo shas...
    I, [2024-09-09T10:49:32.153606 #968726]  INFO -- : Summary:
    I, [2024-09-09T10:49:32.153662 #968726]  INFO -- : 	Keep-around references: 7
    I, [2024-09-09T10:49:32.153693 #968726]  INFO -- : 	Potentially orphaned: 2
    I, [2024-09-09T10:49:32.153715 #968726]  INFO -- : Writing CSV...
    I, [2024-09-09T10:49:32.153797 #968726]  INFO -- : Keep-around orphan report complete
  2. Check the contents of the report CSV (the first column is the reference name, the second column is how many DB records point to it):
    $ cat orphaned.csv 
    keep_around,count
    refs/keep-around/02c1656bfc38a697a89d76bbfe29f3eb88b87ffc,1
    refs/keep-around/236fec6774788de068fb63e62e98971aa511a8ab,2
    refs/keep-around/5293981aeeea286cc1d0acee6e472172845ca543,5
    refs/keep-around/875bae757a1823ad9668b9952791f55fb9448f6b,0
    refs/keep-around/aeec65ae9a7eb256b4998d90126e0e03af7b77fe,0
    refs/keep-around/d2f609ed3f07d0efc8fc557044795743ab334527,2
    refs/keep-around/f80ec752b6bc8db25bb71d77395a7129469872c3,1
  3. Do something that would create a new keep-around. E.g. create a new merge request by updating something in the UI.
  4. Re-run the report and note that there are now differences 🎉
Edited by James Fargher

Merge request reports

Loading