Improve notes query performance in keep-around orphan raketask
What does this MR do and why?
The notes query had been timing out. So here we modify the query in order to better make use of the indexes available on this table.
root@console-01-sv-gstg.c.gitlab-staging-1.internal:/tmp# gitlab-rake gitlab:keep_around:orphaned FILENAME=/tmp/orphaned.csv PROJECT_PATH=gitlab-org/gitlab
I, [2024-09-05T13:33:19.831976 #3092351] INFO -- : Finding keep-around references...
I, [2024-09-05T13:33:24.206348 #3092351] INFO -- : Found 131640 keep-around references
I, [2024-09-05T13:33:24.206475 #3092351] INFO -- : Checking pipeline shas...
I, [2024-09-05T13:33:54.699204 #3092351] INFO -- : Checking merge request shas...
I, [2024-09-05T13:33:54.929971 #3092351] INFO -- : Checking merge request diff shas...
I, [2024-09-05T13:34:02.066463 #3092351] INFO -- : Checking diff note shas...
rake aborted!
ActiveRecord::QueryCanceled: PG::QueryCanceled: ERROR: canceling statement due to statement timeout
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:111:in `public_send'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:111:in `block in read_using_load_balancer'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:65:in `read'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:110:in `read_using_load_balancer'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:48:in `select_all'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:127:in `public_send'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:127:in `block in write_using_load_balancer'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:141:in `block in read_write'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:228:in `retry_with_backoff'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:130:in `read_write'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:126:in `write_using_load_balancer'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:96:in `method_missing'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:127:in `public_send'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:127:in `block in write_using_load_balancer'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:141:in `block in read_write'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:228:in `retry_with_backoff'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:130:in `read_write'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:126:in `write_using_load_balancer'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:96:in `method_missing'
/opt/gitlab/embedded/service/gitlab-rails/gems/activerecord-gitlab/lib/active_record/gitlab_patches/rescue_from.rb:31:in `exec_queries'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/keep_around.rake:86:in `add_diff_note_shas'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/keep_around.rake:34:in `block (4 levels) in <top (required)>'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/keep_around.rake:130:in `block in create_csv'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/keep_around.rake:129:in `open'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/keep_around.rake:129:in `create_csv'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/keep_around.rake:16:in `block (3 levels) in <top (required)>'
/opt/gitlab/embedded/bin/bundle:25:in `load'
/opt/gitlab/embedded/bin/bundle:25:in `<main>'
Caused by:
PG::QueryCanceled: ERROR: canceling statement due to statement timeout
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:111:in `public_send'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:111:in `block in read_using_load_balancer'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:65:in `read'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:110:in `read_using_load_balancer'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:48:in `select_all'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:127:in `public_send'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:127:in `block in write_using_load_balancer'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:141:in `block in read_write'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:228:in `retry_with_backoff'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:130:in `read_write'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:126:in `write_using_load_balancer'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:96:in `method_missing'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:127:in `public_send'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:127:in `block in write_using_load_balancer'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:141:in `block in read_write'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:228:in `retry_with_backoff'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:130:in `read_write'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:126:in `write_using_load_balancer'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:96:in `method_missing'
/opt/gitlab/embedded/service/gitlab-rails/gems/activerecord-gitlab/lib/active_record/gitlab_patches/rescue_from.rb:31:in `exec_queries'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/keep_around.rake:86:in `add_diff_note_shas'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/keep_around.rake:34:in `block (4 levels) in <top (required)>'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/keep_around.rake:130:in `block in create_csv'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/keep_around.rake:129:in `open'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/keep_around.rake:129:in `create_csv'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/keep_around.rake:16:in `block (3 levels) in <top (required)>'
/opt/gitlab/embedded/bin/bundle:25:in `load'
/opt/gitlab/embedded/bin/bundle:25:in `<main>'
Tasks: TOP => gitlab:keep_around:orphaned
(See full trace by running task with --trace)
MR acceptance checklist
Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
How to set up and validate locally
- Run the raketask (updating the project path to match a project in your GDK):
$ bundle exec rake gitlab:keep_around:orphaned FILENAME=orphaned.csv PROJECT_PATH=root/cool-project I, [2024-09-09T10:49:31.940291 #968726] INFO -- : Finding keep-around references... I, [2024-09-09T10:49:31.975329 #968726] INFO -- : Found 7 keep-around references I, [2024-09-09T10:49:31.975384 #968726] INFO -- : Checking pipeline shas... I, [2024-09-09T10:49:32.031617 #968726] INFO -- : Checking merge request shas... I, [2024-09-09T10:49:32.051109 #968726] INFO -- : Checking merge request diff shas... I, [2024-09-09T10:49:32.080371 #968726] INFO -- : Checking note shas... W, [2024-09-09T10:49:32.080438 #968726] WARN -- : System notes will not be included. I, [2024-09-09T10:49:32.127244 #968726] INFO -- : Checking sent notification shas... I, [2024-09-09T10:49:32.135469 #968726] INFO -- : Checking todo shas... I, [2024-09-09T10:49:32.153606 #968726] INFO -- : Summary: I, [2024-09-09T10:49:32.153662 #968726] INFO -- : Keep-around references: 7 I, [2024-09-09T10:49:32.153693 #968726] INFO -- : Potentially orphaned: 2 I, [2024-09-09T10:49:32.153715 #968726] INFO -- : Writing CSV... I, [2024-09-09T10:49:32.153797 #968726] INFO -- : Keep-around orphan report complete
- Check the contents of the report CSV (the first column is the reference name, the second column is how many DB records point to it):
$ cat orphaned.csv keep_around,count refs/keep-around/02c1656bfc38a697a89d76bbfe29f3eb88b87ffc,1 refs/keep-around/236fec6774788de068fb63e62e98971aa511a8ab,2 refs/keep-around/5293981aeeea286cc1d0acee6e472172845ca543,5 refs/keep-around/875bae757a1823ad9668b9952791f55fb9448f6b,0 refs/keep-around/aeec65ae9a7eb256b4998d90126e0e03af7b77fe,0 refs/keep-around/d2f609ed3f07d0efc8fc557044795743ab334527,2 refs/keep-around/f80ec752b6bc8db25bb71d77395a7129469872c3,1
- Do something that would create a new keep-around. E.g. create a new merge request by updating something in the UI.
- Re-run the report and note that there are now differences
🎉
Edited by James Fargher