Add a Zoekt SchedulingWorker
What does this MR do and why?
Create a worker, Search::Zoekt::SchedulingWorker, which performs scheduling tasks. By default the task is set to :initiate, which calls the worker once for each task in SUPPORTED_TASKS. Right now there is only one supported task, :node_assignment. The :node_assignment task iterates over each Search::Zoekt::EnabledNamespace record that doesn't have a corresponding Search::Zoekt::Index record and assigns it to a Node, trying nodes in descending order of free space. The assignment uses a buffer of 3x the total repository size and a watermark limit of 80%. When no node can be assigned to a namespace, the worker adds an entry to zoekt.log.
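The node selection described above can be sketched in plain Ruby. This is only an illustration, not the worker's actual code: the Node struct, pick_node, assign_namespaces, and the exact way the 3x buffer and 80% watermark are combined are assumptions made for the example.

```ruby
# Hypothetical stand-in for Search::Zoekt::Node; the real model may differ.
Node = Struct.new(:name, :total_bytes, :used_bytes) do
  def free_bytes
    total_bytes - used_bytes
  end
end

BUFFER_FACTOR = 3      # reserve 3x the total repository size
WATERMARK_LIMIT = 0.8  # assumed meaning: never fill a node beyond 80%

# Returns the first node, most free space first, that can hold the buffered
# size without crossing the watermark, or nil when no node qualifies.
def pick_node(nodes, repository_size)
  required = repository_size * BUFFER_FACTOR
  nodes.sort_by { |node| -node.free_bytes }.find do |node|
    node.used_bytes + required <= node.total_bytes * WATERMARK_LIMIT
  end
end

# Assigns each namespace to a node and collects the ones that do not fit.
def assign_namespaces(nodes, sizes_by_namespace)
  assignments = {}
  unassigned = []
  sizes_by_namespace.each do |namespace, size|
    node = pick_node(nodes, size)
    if node
      assignments[namespace] = node.name
      node.used_bytes += size * BUFFER_FACTOR # reserve space for later picks
    else
      unassigned << namespace # the worker described here logs to zoekt.log
    end
  end
  [assignments, unassigned]
end

nodes = [Node.new("node-a", 1_000, 500), Node.new("node-b", 1_000, 100)]
sizes = { "group-1" => 150, "group-2" => 50, "group-3" => 400 }
assignments, unassigned = assign_namespaces(nodes, sizes)
puts assignments.inspect
puts unassigned.inspect
```

In this example the third namespace needs 1,200 bytes after buffering, which no node can hold under the watermark, so it ends up in the unassigned list.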
This worker is a cron worker that runs every 10 minutes.
This feature is guarded by the feature flag zoekt_scheduling_worker.
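A minimal, dependency-free sketch of the task dispatch follows. It assumes that the feature flag makes perform return early and that unknown tasks raise an error; the class name SchedulingWorkerSketch, the feature_enabled argument, and the performed list are hypothetical. The real worker enqueues each task with perform_async rather than calling perform inline.

```ruby
# Hypothetical sketch of the dispatch; not the actual worker.
class SchedulingWorkerSketch
  SUPPORTED_TASKS = [:node_assignment].freeze

  attr_reader :performed

  def initialize(feature_enabled:)
    @feature_enabled = feature_enabled # stands in for the feature flag check
    @performed = []
  end

  # With no argument the task defaults to :initiate, which fans out to
  # every supported task.
  def perform(task = :initiate)
    return false unless @feature_enabled
    return initiate if task == :initiate

    unless SUPPORTED_TASKS.include?(task)
      raise ArgumentError, "Unknown task: #{task.inspect}"
    end

    @performed << task # the real task would run its logic here
    true
  end

  private

  def initiate
    # The real worker would enqueue a separate job per task instead.
    SUPPORTED_TASKS.each { |task| perform(task) }
    true
  end
end

worker = SchedulingWorkerSketch.new(feature_enabled: true)
worker.perform
puts worker.performed.inspect
```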
Notes for database reviewers
SELECT
"zoekt_enabled_namespaces".*
FROM
"zoekt_enabled_namespaces"
LEFT OUTER JOIN "zoekt_indices" ON "zoekt_indices"."zoekt_enabled_namespace_id" = "zoekt_enabled_namespaces"."id"
WHERE
"zoekt_indices"."zoekt_enabled_namespace_id" IS NULL
Query plan: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/25195/commands/80040
SELECT
"zoekt_nodes".*
FROM
"zoekt_nodes"
ORDER BY
total_bytes - used_bytes DESC
Query plan: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/25195/commands/80048
MR acceptance checklist
Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
How to set up and validate locally
- Make sure Zoekt is set up
- Make sure there are no pending DB migrations
bin/rails db:migrate
- Open the rails console
bin/rails c
- Run the following code and verify that it gives at least one record:
Search::Zoekt::EnabledNamespace.with_missing_indices
- If you don't get any records, then create some records like the following
Namespace.last(3).each { |n| Search::Zoekt::EnabledNamespace.create! root_namespace_id: n.root_ancestor.id }
- Verify again with step 4 that you now have 3 records
- For testing only, make the worker call synchronous
--- a/ee/app/workers/search/zoekt/scheduling_worker.rb
+++ b/ee/app/workers/search/zoekt/scheduling_worker.rb
@@ -37,7 +37,7 @@ def supported_tasks
end
def initiate
- TASKS.each { |task| with_context(related_class: self.class) { self.class.perform_async(task) } }
+ TASKS.each { |task| with_context(related_class: self.class) { self.class.new.perform(task) } }
end
- Tail zoekt.log in a new terminal tab
tail -f log/zoekt.log
- Enable the feature flag zoekt_scheduling_worker
Feature.enable(:zoekt_scheduling_worker)
- Now run the worker
Search::Zoekt::SchedulingWorker.new.perform
- Now verify that you get no records with step 4
- If you still get records with the above step, check the log; there should be an entry explaining why
- If the entry message is "RootStorageStatistics is not available", you need to create RootStorageStatistics records
- Run the following code
Namespace.last(3).each { |n| Namespace::RootStorageStatistics.find_or_create_by! namespace_id: n.root_ancestor.id }
- Rerun the worker from step 10.
- Now verify with step 4 that you no longer get any records
Related to #432693 (closed)