Preload DB records in bulk ES indexing
What does this MR do?
The ProcessBookkeepingService is responsible for synchronizing all database updates with Elasticsearch. It does this by pulling updates from a custom Redis queue in batches of 1000. Today it loops through each one, calling bulk_indexer.process, which eventually calls #database_record for each element. We intentionally send them through #process one at a time because we sometimes want to send them to Elasticsearch in groups smaller than 1000, to ensure we don't send a single request that is too large for Elasticsearch to handle.
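To illustrate why items go through #process one at a time, here is a minimal sketch of an indexer that flushes whenever the buffered payload would exceed a size limit. The class name SketchBulkIndexer, the max_bytes parameter, and the requests array are hypothetical stand-ins for illustration; this is not the actual bulk indexer code.

```ruby
require 'json'

# Hypothetical stand-in for a bulk indexer (assumption, not the real class).
class SketchBulkIndexer
  attr_reader :requests

  def initialize(max_bytes:)
    @max_bytes = max_bytes
    @buffer = []
    @buffer_bytes = 0
    @requests = [] # each entry stands for one bulk request that would be sent
  end

  # Accepts one document at a time, so a flush can happen at any point
  # in the batch rather than only after all 1000 items.
  def process(doc)
    payload = JSON.generate(doc)
    flush if !@buffer.empty? && @buffer_bytes + payload.bytesize > @max_bytes
    @buffer << payload
    @buffer_bytes += payload.bytesize
  end

  # Records the buffered payloads as one request and resets the buffer.
  def flush
    return if @buffer.empty?

    @requests << @buffer
    @buffer = []
    @buffer_bytes = 0
  end
end

indexer = SketchBulkIndexer.new(max_bytes: 50)
docs = (1..5).map { |i| { 'id' => i, 'title' => "doc #{i}" } }
docs.each { |doc| indexer.process(doc) }
indexer.flush # send whatever is left at the end of the batch
```

Because the size check happens per item, the request boundaries depend on payload size rather than on the batch size pulled from the queue.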
As such, it isn't really feasible to unwind all the code and pass arrays all the way through the system. Instead, to avoid the N+1 query problem, we can use a trick similar to Rails preloading and implement our own preloader on a collection of documents.
This MR implements this by creating a new DocumentReference::Collection class with a #preload_database_records method that updates each contained DocumentReference with its corresponding database_record, so that when #database_record is invoked later the value is already memoized.
Since the ProcessBookkeepingService can handle multiple types of Active Record models, we need to group the references by type before performing a single DB query for each type, as loading records from different tables in one query is more convoluted.
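The preloading idea can be sketched in plain Ruby as follows. This is only an illustration under assumptions: FakeModel, where_ids, db_id, and the constructor signatures are hypothetical and stand in for real Active Record models and queries. Only the names DocumentReference, DocumentReference::Collection, #preload_database_records, and #database_record come from this MR, and their internals here are guesses.

```ruby
# Hypothetical in-memory stand-in for an Active Record model (assumption).
class FakeModel
  Record = Struct.new(:id, :klass)

  class << self
    attr_accessor :query_count

    def rows
      @rows ||= {}
    end

    def create(id)
      rows[id] = Record.new(id, name)
    end

    # Plays the role of a single "WHERE id IN (...)" query and counts calls.
    def where_ids(ids)
      self.query_count = (query_count || 0) + 1
      rows.values_at(*ids).compact
    end
  end
end

class DocumentReference
  attr_reader :klass, :db_id
  attr_writer :database_record

  def initialize(klass, db_id)
    @klass = klass
    @db_id = db_id
  end

  # Memoized. Falls back to a one-row query if nothing was preloaded
  # (in this sketch that also happens when the preloaded record was missing).
  def database_record
    @database_record ||= klass.where_ids([db_id]).first
  end

  class Collection
    include Enumerable

    def initialize(refs)
      @refs = refs
    end

    def each(&block)
      @refs.each(&block)
    end

    # Group references by model type, run one query per type,
    # then assign each record to its reference so later reads are memoized.
    def preload_database_records
      @refs.group_by(&:klass).each do |klass, refs|
        records_by_id = klass.where_ids(refs.map(&:db_id)).to_h { |r| [r.id, r] }
        refs.each { |ref| ref.database_record = records_by_id[ref.db_id] }
      end
      self
    end
  end
end

class Issue < FakeModel; end
class Note < FakeModel; end
(1..3).each { |i| Issue.create(i); Note.create(i) }

refs = [
  DocumentReference.new(Issue, 1),
  DocumentReference.new(Note, 2),
  DocumentReference.new(Issue, 3)
]
collection = DocumentReference::Collection.new(refs)
collection.preload_database_records
records = collection.map(&:database_record) # no further queries are issued
```

In this sketch, three references across two types cost two queries in total, instead of one query per reference.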
Does this MR meet the acceptance criteria?
Conformity
- Changelog entry
- [-] Documentation (if required)
- Code review guidelines
- Merge request performance guidelines
- Style guides
- Database guides
- Separation of EE specific content
Availability and Testing
- Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process.
- [-] Tested in all supported browsers
- [-] Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
- [-] Label as security and @ mention @gitlab-com/gl-security/appsec
- [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
- [-] Security reports checked/validated by a reviewer from the AppSec team
Related to #207280 (closed)