Geo: Avoid PG statement timeout during `Geo::MetricsUpdateWorker`
What does this MR do and why?
Avoids a bad MIN query plan during batch count.
This actually completely cuts out the MIN query logic during batch count. So batch counts will sometimes take longer overall than they otherwise should, since more batches will be counted. But query timeouts due to PG selecting a bad query plan on the MIN query will be completely avoided.
This is a two-way door decision: We can easily revert this if someone finds a better solution.
Resolves #442259 (closed)
MR acceptance checklist
Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
How to set up and validate locally
Numbered steps to set up and validate the change are strongly suggested.
- Set up Geo if you haven't already
- Visit Admin > Geo > Sites
- The page should regularly update the status of all Geo sites
No behavior change is expected. I was unable to reproduce the bad query plan selection locally.
I confirmed in Rails console on the secondary Geo site that the MIN query is avoided with this change:
Before
[3] pry(main)> Geo::JobArtifactReplicator.verified_count
Geo::JobArtifactRegistry Minimum (1.8ms) SELECT MIN("job_artifact_registry"."id") FROM "job_artifact_registry" WHERE ("job_artifact_registry"."state" IN (2)) AND ("job_artifact_registry"."verification_state" IN (2)) /*application:console,db_config_name:geo,console_hostname:MikesGitLabMBP.localdomain,console_username:mkozonogitlab,line:/lib/gitlab/database/batch_counter.rb:123:in `actual_start'*/
Geo::JobArtifactRegistry Maximum (0.2ms) SELECT MAX("job_artifact_registry"."id") FROM "job_artifact_registry" WHERE ("job_artifact_registry"."state" IN (2)) AND ("job_artifact_registry"."verification_state" IN (2)) /*application:console,db_config_name:geo,console_hostname:MikesGitLabMBP.localdomain,console_username:mkozonogitlab,line:/lib/gitlab/database/batch_counter.rb:127:in `actual_finish'*/
Geo::JobArtifactRegistry Count (382.3ms) SELECT COUNT("job_artifact_registry"."id") FROM "job_artifact_registry" WHERE ("job_artifact_registry"."state" IN (2)) AND ("job_artifact_registry"."verification_state" IN (2)) AND "job_artifact_registry"."id" >= $1 AND "job_artifact_registry"."id" < $2 /*application:console,db_config_name:geo,console_hostname:MikesGitLabMBP.localdomain,console_username:mkozonogitlab,line:/lib/gitlab/database/batch_counter.rb:72:in `count_with_timeout'*/ [["id", 1102], ["id", 38501]]
=> 31842
After
[5] pry(main)> Geo::JobArtifactReplicator.verified_count
Geo::JobArtifactRegistry Maximum (0.5ms) SELECT MAX("job_artifact_registry"."id") FROM "job_artifact_registry" WHERE ("job_artifact_registry"."state" IN (2)) AND ("job_artifact_registry"."verification_state" IN (2)) /*application:console,db_config_name:geo,console_hostname:MikesGitLabMBP.localdomain,console_username:mkozonogitlab,line:/lib/gitlab/database/batch_counter.rb:127:in `actual_finish'*/
Geo::JobArtifactRegistry Count (178.8ms) SELECT COUNT("job_artifact_registry"."id") FROM "job_artifact_registry" WHERE ("job_artifact_registry"."state" IN (2)) AND ("job_artifact_registry"."verification_state" IN (2)) AND "job_artifact_registry"."id" >= $1 AND "job_artifact_registry"."id" < $2 /*application:console,db_config_name:geo,console_hostname:MikesGitLabMBP.localdomain,console_username:mkozonogitlab,line:/lib/gitlab/database/batch_counter.rb:72:in `count_with_timeout'*/ [["id", 0], ["id", 38501]]
=> 30737