Process one record at a time in Bulk Import pipelines
What does this MR do?
This MR:
- Updates the Bulk Import ETL pipelines to process one record at a time instead of operating on a whole collection at once. This simplifies the transformers and loaders, which no longer need to loop over the collection
- Adds an ExtractedData object that wraps the raw hash data returned by GraphQL, making it easier to use in the pipelines
- Removes the hash digger transformer, since it is no longer needed
- Removes the underscorify transformer and relies on GraphQL aliasing instead
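The per-record flow described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in this MR: ExtractedDataSketch, DowncasePathTransformer, CollectingLoader and run_pipeline are hypothetical names, and the real ExtractedData class may expose a different interface. The query constant only illustrates how a GraphQL alias (full_path: fullPath) can return snake_case keys directly, which is why a separate underscorify step becomes unnecessary.

```ruby
# Illustrative query: the alias `full_path: fullPath` makes the response
# key snake_case, so no key-renaming transformer is needed afterwards.
EXAMPLE_QUERY = <<~GRAPHQL
  query($full_path: ID!) {
    group(fullPath: $full_path) {
      name
      full_path: fullPath
    }
  }
GRAPHQL

# Hypothetical stand-in for the ExtractedData wrapper (assumed interface).
class ExtractedDataSketch
  attr_reader :records, :page_info

  def initialize(data: nil, page_info: {})
    # Accept either a single hash or an array of hashes.
    @records = data.is_a?(Array) ? data : [data].compact
    @page_info = page_info || {}
  end

  def has_next_page?
    page_info['has_next_page'] == true
  end

  def each(&block)
    records.each(&block)
  end
end

# Hypothetical transformer: receives ONE record, returns ONE record.
class DowncasePathTransformer
  def transform(entry)
    entry.merge('path' => entry['path'].downcase)
  end
end

# Hypothetical loader: receives ONE record and stores it.
class CollectingLoader
  attr_reader :loaded

  def initialize
    @loaded = []
  end

  def load(entry)
    @loaded << entry
  end
end

# The runner owns the loop, so transformers and loaders never iterate.
def run_pipeline(extracted, transformers:, loader:)
  extracted.each do |entry|
    transformed = transformers.reduce(entry) { |e, t| t.transform(e) }
    loader.load(transformed)
  end
end
```

With this shape, looping happens in exactly one place (the runner), which is the simplification the first bullet refers to.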
To test
- Seed your local environment with groups via rake task
bundle exec rake "gitlab:seed:group_seed[3,root]"
- Copy the name and path of the top-level group that was generated
- Open the rails console and run the following (replace the placeholders with your values)
Feature.enable(:bulk_import)
rand = (1..1000).to_a.sample
user = User.first
credentials = { url: 'http://gdk.test:3000', access_token: '<api scope token>' }
params = [{ source_type: 'group_entity', source_name: '<source group name>', source_full_path: '<source group path>', destination_name: "foo#{rand}", destination_namespace: 'root' }]
BulkImportService.new(user, params, credentials).execute
bulk_import = BulkImport.last
bulk_import.finished?
Alternatively, you can import the group via the UI by opening the Import tab on the '/groups/new' page.
- Wait for bulk import to finish (this requires sidekiq to be running)
- Once finished, verify that a new group (and its subgroups) was imported under the root namespace
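Instead of re-running bulk_import.finished? by hand, a small polling helper can be pasted into the console. This is only a convenience sketch: wait_until is a hypothetical helper (not part of GitLab), and the timeout and interval values are arbitrary.

```ruby
# Hypothetical helper: repeatedly evaluates the block until it returns a
# truthy value or the timeout (in seconds) elapses.
# Returns true on success, false on timeout.
def wait_until(timeout: 60, interval: 1)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout

  loop do
    return true if yield
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline

    sleep(interval)
  end
end

# In the rails console, assuming bulk_import from the steps above:
#   wait_until(timeout: 120) { bulk_import.reload.finished? }
```

The helper returns false on timeout, which usually means sidekiq is not running or the import is still in progress.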
Mentions #299527 (closed)
Screenshots (strongly suggested)
Does this MR meet the acceptance criteria?
Conformity
- Changelog entry
- Documentation (if required)
- Code review guidelines
- Merge request performance guidelines
- Style guides
- Database guides
- Separation of EE specific content
Availability and Testing
- Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process.
- Tested in all supported browsers
- Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
- Label as security and @ mention @gitlab-com/gl-security/appsec
- The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
- Security reports checked/validated by a reviewer from the AppSec team
Edited by George Koltsov