Lots of DB queries in ElasticIndexBulkCronWorker
Problem
You can see from the logs (internal only) that there are 30k-50k queries happening per run of this worker at the moment.
The high load on this worker is currently due mostly to the ongoing migration in !56177 (merged). We expected that adding 9k jobs per 3 minutes would be fine, since we can process up to 16k updates per minute.
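As a quick sanity check of that expectation, here is the arithmetic as a small Ruby sketch. The variable names are illustrative only; the figures are the ones quoted above.

```ruby
# Figures quoted in this issue; variable names are illustrative only.
jobs_per_window     = 9_000   # jobs added by the migration per window
window_minutes      = 3       # length of that window in minutes
capacity_per_minute = 16_000  # updates the worker can process per minute

# 9k jobs per 3 minutes is 3k updates per minute.
incoming_per_minute = jobs_per_window / window_minutes.to_f

# Fraction of the nominal capacity taken up by the migration alone.
utilization = incoming_per_minute / capacity_per_minute

puts "incoming per minute: #{incoming_per_minute}"
puts "fraction of capacity: #{utilization}"
```

On paper the migration uses under a fifth of the nominal throughput, so the worker falling behind points to per-update cost rather than raw volume.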
What we're seeing is that the worker is only just keeping up and sometimes falls behind, because the total time taken is often more than 2 minutes. More time is actually spent on DB queries than on Elasticsearch requests, so optimizing database queries should help quite a bit. More time is also spent on CPU, which may or may not be helped by optimizing DB queries. The CPU time may go to marshalling objects through DB -> Ruby -> JSON, and we should evaluate that separately, but database query performance is probably more important since the volume is so high and it can impact other services.
Solution
Add more preloads to load dependencies. We should start with notes, because we are indexing 9k notes per 3 minutes at the moment and they probably account for the bulk of the queries. All of these queries happen in the as_indexed_json methods for each type. I have already identified that notes execute extra queries to load the related noteable as well as noteable.assignee_ids, so we can probably make a big improvement just by preloading those.
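To illustrate why preloading should cut the query count, here is a minimal plain-Ruby sketch. It is not the GitLab code: FakeDB, NoteRow, index_naive and index_preloaded are hypothetical stand-ins, and the in-memory hashes only simulate tables so that queries can be counted. In the real worker the batched lookups would presumably be expressed with ActiveRecord preloading on the notes relation, which is an assumption here.

```ruby
# Hypothetical in-memory stand-in for the database; it counts every "query".
class FakeDB
  attr_reader :query_count

  def initialize(noteables, assignees)
    @noteables = noteables   # { noteable_id => { title: ... } }
    @assignees = assignees   # { noteable_id => [assignee_id, ...] }
    @query_count = 0
  end

  # One query per call, like an un-preloaded note.noteable lookup.
  def find_noteable(id)
    @query_count += 1
    @noteables.fetch(id)
  end

  # One query per call, like an un-preloaded noteable.assignee_ids lookup.
  def assignee_ids_for(noteable_id)
    @query_count += 1
    @assignees.fetch(noteable_id, [])
  end

  # One query for the whole batch of noteables.
  def find_noteables(ids)
    @query_count += 1
    @noteables.slice(*ids)
  end

  # One query for the assignees of the whole batch.
  def assignee_ids_for_all(noteable_ids)
    @query_count += 1
    @assignees.slice(*noteable_ids)
  end
end

# Hypothetical note record: just an id and the id of its noteable.
NoteRow = Struct.new(:id, :noteable_id)

# Naive indexing: two extra queries per note.
def index_naive(db, notes)
  notes.map do |note|
    noteable = db.find_noteable(note.noteable_id)
    { id: note.id, title: noteable[:title],
      assignee_ids: db.assignee_ids_for(note.noteable_id) }
  end
end

# Preloaded indexing: two queries per batch, regardless of batch size.
def index_preloaded(db, notes)
  ids = notes.map(&:noteable_id).uniq
  noteables = db.find_noteables(ids)
  assignees = db.assignee_ids_for_all(ids)
  notes.map do |note|
    { id: note.id, title: noteables.fetch(note.noteable_id)[:title],
      assignee_ids: assignees.fetch(note.noteable_id, []) }
  end
end

noteables = { 1 => { title: "Issue A" }, 2 => { title: "Issue B" } }
assignees = { 1 => [10, 11] }
notes = [NoteRow.new(100, 1), NoteRow.new(101, 1), NoteRow.new(102, 2)]

naive_db = FakeDB.new(noteables, assignees)
preloaded_db = FakeDB.new(noteables, assignees)
naive_docs = index_naive(naive_db, notes)
preloaded_docs = index_preloaded(preloaded_db, notes)

puts "naive queries: #{naive_db.query_count}"
puts "preloaded queries: #{preloaded_db.query_count}"
```

Both paths build the same documents; the naive path issues two queries per note, while the preloaded path issues two per batch, so the saving grows with the number of notes processed in a run.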