Fix BulkImport pipeline retries

Rodrigo Tomonari requested to merge rodrigo/365131-fix-bulk-import-retry into master

What does this MR do and why?

BulkImport pipeline retries were not working: the pipeline never raised BulkImports::NetworkError exceptions, because the catch-all rescue StandardError swallowed them first.

With this change, the pipeline runner handles BulkImports::NetworkError exceptions explicitly. If the exception is retriable (for example, because it was caused by a Net::ReadTimeout), the runner raises BulkImports::PipelineRetryError so that the PipelineWorker can rescue it and retry the worker.
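The rescue ordering described above can be illustrated with a minimal, self-contained sketch. The classes and the run_step method below are hypothetical stand-ins, not the actual GitLab implementation; they only show why the specific rescue clause has to come before the catch-all one.

```ruby
# Hypothetical stand-ins for illustration; these are not the GitLab classes.
class NetworkError < StandardError
  def initialize(message = nil, retriable: false)
    super(message)
    @retriable = retriable
  end

  def retriable?
    @retriable
  end
end

class PipelineRetryError < StandardError; end

# Runs the given block the way a single pipeline step might be run.
def run_step
  yield
  :finished
rescue NetworkError => e
  # Ruby uses the first matching rescue clause, so this one must be listed
  # before StandardError. An exception raised here is not caught by the
  # clause below; it propagates to the caller (the worker, in this sketch).
  raise PipelineRetryError, e.message if e.retriable?

  :failed
rescue StandardError
  # Catch-all: everything else marks the step as failed.
  :failed
end
```

With only the StandardError clause present, a retriable network error would be turned into :failed and the caller would never get a chance to retry.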

The maximum retry count is also increased to 10, since retrying a few more times before marking the pipeline as failed does no harm.
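As a rough illustration of a bounded retry, the sketch below retries a block in-process until it succeeds or the limit is reached. The names MAX_TRIES, RetryableError and perform_with_retries are assumptions made for this example; the real worker presumably reschedules itself rather than looping, and only the limit of 10 comes from this description.

```ruby
# Limit taken from the description above; the constant name is hypothetical.
MAX_TRIES = 10

# Hypothetical error class signalling that another attempt is worthwhile.
class RetryableError < StandardError; end

# Calls the block until it succeeds or max_tries attempts have been used.
# Returns [status, attempts].
def perform_with_retries(max_tries: MAX_TRIES)
  attempts = 0
  begin
    attempts += 1
    yield attempts
    [:finished, attempts]
  rescue RetryableError
    # `retry` restarts the begin block, so the counter is incremented again.
    retry if attempts < max_tries
    [:failed, attempts]
  end
end
```

For example, a block that fails on its first two calls finishes on the third attempt, while a block that always fails is marked as failed after the tenth.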

Related to: Make BulkImport to handle Net::ReadTimeout (#365131 - closed)

How to set up and validate locally

To test the retry mechanism, we need to either simulate a retriable error or make the API return a 429 status.

To simulate a Net::ReadTimeout, we can add sleep 60 to one of the API actions used by BulkImport, for example the GraphQL resolver for groups.

Changing the GroupResolver as shown below makes the endpoint time out on the first 2 requests (the sleep is skipped once the counter reaches 3).

https://gitlab.com/gitlab-org/gitlab/-/blob/72e0329c8d86e804f9ea152b590a73ce591a003e/app/graphql/resolvers/group_resolver.rb#L4

module Resolvers
  class GroupResolver < BaseResolver
    prepend FullPathResolver

    type Types::GroupType, null: true

    def resolve(full_path:)
      if Gitlab::Cache::Import::Caching.increment('groups_timeout', timeout: 10.minutes) < 3
        sleep 60
      end

      model_by_full_path(Group, full_path)
    end
  end
end

  1. Enable the feature flag in a Rails console: Feature.enable(:bulk_import).
  2. Create a top-level group.
  3. Go to the /groups/new#import-group-pane page and enter the instance URL and an access token (it needs the api and read_repository scopes).
  4. Select the newly created group and click Import.
  5. Wait for Group import to complete and verify the imported group data.
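To check how many requests the guard in the resolver delays, the counter can be modelled in isolation. This is a sketch under the assumption that the counter starts at zero and that increment returns the value after incrementing; FakeCounter and delayed_requests are hypothetical helpers, not part of GitLab.

```ruby
# Hypothetical in-memory stand-in for a counter whose increment
# returns the new value (assumed behaviour, starting from zero).
class FakeCounter
  def initialize
    @values = Hash.new(0)
  end

  def increment(key)
    @values[key] += 1
  end
end

# Returns the 1-based request numbers that would take the sleep branch.
def delayed_requests(counter, requests:, threshold:)
  (1..requests).select { counter.increment('groups_timeout') < threshold }
end

# With the threshold used in the resolver, only the requests made before
# the counter reaches 3 are delayed.
delayed = delayed_requests(FakeCounter.new, requests: 5, threshold: 3)
```

Raising the threshold in the resolver is a simple way to exercise more retries before the import succeeds.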

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Rodrigo Tomonari
