Introduce simple ActiveRecord-based bulk-insert functionality
What does this MR do?
Adds support for bulk-inserting AR models safely.
References: #196844 (closed)
New bulk insertion API
Bulk insertions are crucial for storing large amounts of data efficiently. However, we also identified the need for this to happen in a safe manner, i.e. by ensuring bulk insertions are only available when we have certain guarantees that we are not causing integrity problems or violating business rules (often encoded in ActiveRecord validations).
This MR builds on !24168 (merged) in the following ways:
BulkInsertSafe.[bulk_insert|bulk_insert!]
These two new methods operate on sequences of ActiveRecord objects. They behave similarly to save and save! in the sense that they run validations and either return a boolean indicating success or raise an exception. This ensures that we won't be writing data that would fail validation if it were instead inserted via save or similar built-ins.
Internally these calls rely on ActiveRecord 6's new InsertAll type, which inserts hashes in bulk but does not run validations itself. Relying on InsertAll and running validations before insertion are the primary differences from the existing Database.bulk_insert helper.
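The save-like and save!-like contracts described above can be sketched in plain Ruby. This is only an illustration, not the MR's implementation: FakeLabelLink, SketchInvalidRecord, INSERTED_ROWS and both sketch_ methods are hypothetical names, the database is replaced by an in-memory array, and validating every item up front (rather than per batch) is an assumption.

```ruby
# Hypothetical stand-in for an ActiveRecord model; only `valid?` and
# `attributes` are mimicked so the sketch runs without Rails.
FakeLabelLink = Struct.new(:label_id, :target_id) do
  def valid?
    !label_id.nil? && !target_id.nil?
  end

  def attributes
    to_h
  end
end

# Hypothetical error class; the real API may raise a different exception.
class SketchInvalidRecord < StandardError; end

# Hypothetical in-memory table standing in for the database.
INSERTED_ROWS = []

# save-like contract: returns false if any item is invalid, true otherwise.
def sketch_bulk_insert(items)
  return false unless items.all?(&:valid?)

  # In the real code, this is where the attribute hashes would be handed
  # to InsertAll; here they are appended to an array instead.
  INSERTED_ROWS.concat(items.map(&:attributes))
  true
end

# save!-like contract: raises instead of returning false.
def sketch_bulk_insert!(items)
  sketch_bulk_insert(items) or raise SketchInvalidRecord, 'validation failed'
end
```

The point of the sketch is that nothing reaches the insertion step unless the items pass the same checks that save would apply.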
Note that as of !24168 (merged) you can only access this functionality if (as the name suggests) your target model type is considered "safe for bulk insertion"; these rules are currently fairly simple and prevent certain callbacks from being registered, but can be easily expanded on in the future.
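To illustrate the idea of rejecting unsafe callbacks at registration time, here is a minimal plain-Ruby sketch. Everything in it is hypothetical: SketchBulkInsertSafe, register_callback, the allow-list contents and the error class are invented for illustration and do not reflect how !24168 hooks into ActiveRecord's callback machinery.

```ruby
module SketchBulkInsertSafe
  # Assumed allow-list; the actual rules in the MR may differ.
  ALLOWED_CALLBACKS = %i[validate initialize].freeze

  # Hypothetical error raised when an unsafe callback is registered.
  class UnsafeCallbackError < StandardError; end

  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    def registered_callbacks
      @registered_callbacks ||= []
    end

    # Stand-in for callback registration: refuses anything not allow-listed,
    # so an unsafe model fails as soon as its class body is evaluated.
    def register_callback(name)
      unless ALLOWED_CALLBACKS.include?(name)
        raise UnsafeCallbackError, "#{name} callbacks are not bulk-insert safe"
      end

      registered_callbacks << name
    end
  end
end

class SketchLabelLink
  include SketchBulkInsertSafe

  register_callback :validate      # allowed
  # register_callback :after_save  # would raise UnsafeCallbackError
end
```

Failing at class-definition time means a model cannot silently become unsafe after it has opted in to bulk insertion.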
The bulk_insert method takes the following arguments:
- items (required): ActiveRecord instances to be inserted
- :batch_size (optional, default 500): Maximum number of rows that will be inserted simultaneously
- :validate (optional, default true): Boolean that allows bypassing validations (for instance when you run them outside of this call)
- &handle_attributes (optional): A block that will be invoked for every attribute hash about to be inserted (this allows callers to inject or transform rows before insertion)
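How :batch_size and &handle_attributes interact can be sketched without Rails. The helper below, sketch_batches, is a hypothetical name and not part of the MR; it uses plain hashes in place of ActiveRecord instances and assumes the block mutates each attribute hash in place.

```ruby
# Hypothetical helper: converts items to attribute hashes, lets the caller's
# block transform each hash, and groups them into batches of at most
# batch_size rows. Each batch would correspond to one bulk INSERT.
def sketch_batches(items, batch_size: 500, &handle_attributes)
  items.each_slice(batch_size).map do |batch|
    batch.map do |item|
      # Copy so the caller's original objects are left untouched.
      attributes = item.to_h.dup
      # Assumption: the block mutates the hash in place, e.g. to inject a column.
      handle_attributes&.call(attributes)
      attributes
    end
  end
end

links = (1..5).map { |i| { label_id: i } }
batches = sketch_batches(links, batch_size: 2) { |attrs| attrs[:target_type] = 'Issue' }
batches.map(&:size) # => [2, 2, 1]
```

With five items and a batch size of two, the last batch is smaller than the others, which is why the option is described as a maximum.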
Code example:
class LabelLink < ApplicationRecord
include BulkInsertSafe
end
label_links = ... # build some label links
LabelLink.bulk_insert(label_links, batch_size: 100)
Does this MR meet the acceptance criteria?
Conformity
- [-] Changelog entry
- Documentation (if required) -- Added RDoc and will add dev docs in #207993 (closed)
- Code review guidelines
- Merge request performance guidelines
- Style guides
- Database guides
- [-] Separation of EE specific content
Availability and Testing
- Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process.
- [-] Tested in all supported browsers
- [-] Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Migration path
Since this new API extends the existing bulk-insert functionality in several ways, we should establish:
- whether it can fully replace Database.bulk_insert
- or whether it should live alongside it (considering it operates on AR instances, not row hashes)
- or whether we should first migrate to insert_all everywhere