Improving LFK DeletedRecords batch fetching query
What does this MR do and why?
This MR improves the query that loads the LooseForeignKeys::DeletedRecord
query. At the moment the query is not efficient when loading the records, which causes the LooseForeignKeys::CleanupWorker
to hit the 30 seconds time limit without processing many records. At the moment the deleted records in the ci
database are piling up fast. We have > 30M records waiting to be processed.
Addressing: #419119 (closed)
Changelog: fixed
Background
You can read about the discussion that lead to this solution in this thread. We decided to do a UNION
query that reads from each partition alone, which makes the query much faster.
Queries comparison
I have chosen to query based the records for the table public.ci_runners
in these examples because it still has a lot of records in the current ci
snapshot, but it's not the majority. Most of the records belong to public.p_ci_builds
.
Old Query
SELECT "loose_foreign_keys_deleted_records".*, "loose_foreign_keys_deleted_records"."partition" AS partition_number FROM "loose_foreign_keys_deleted_records" WHERE "loose_foreign_keys_deleted_records"."fully_qualified_table_name" = 'public.ci_runners' AND "loose_foreign_keys_deleted_records"."status" = 1 ORDER BY "partition" ASC, "loose_foreign_keys_deleted_records"."consume_after" ASC, "loose_foreign_keys_deleted_records"."id" ASC LIMIT 1000
https://postgres.ai/console/gitlab/gitlab-production-ci/sessions/21176/commands/69144
New Query using UNION
- Loading the list of the partitions
This is reading from a VIEW
SELECT "postgres_partitions".* FROM "postgres_partitions" WHERE "postgres_partitions"."parent_identifier" = 'public.loose_foreign_keys_deleted_records' ORDER BY "postgres_partitions"."name" ASC
https://postgres.ai/console/gitlab/gitlab-production-ci/sessions/21176/commands/69137
- Loading the records
See the query on postgres.ai (it's long)
This query was created using Rails Console connected to a postgres.ai instance. Using the the code snippet while connected to the ci
database.
LooseForeignKeys::DeletedRecord.using_connection(Ci::ApplicationRecord.connection) do
LooseForeignKeys::DeletedRecord.load_batch_for_table('public.ci_runners', 1000).to_sql
end
https://postgres.ai/console/gitlab/gitlab-production-ci/sessions/21176/commands/69168
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
-
I have evaluated the MR acceptance checklist for this MR.
Related to #409836 (closed)