Adjust delay used to spread jobs in GitHub Import
What does this MR do and why?
The improved_spread_parallel_import method, introduced in !109264 (merged) to change how GitHub Import spreads its jobs, ended up making GitHub Import slower: the jobs of each stage were enqueued with a delay of at least 1 minute. Because 8 stages are affected by this delay, GitHub Import would generally take about 8 minutes longer to migrate a project.
This change fixes the problem by making the initial delay start at 1 second instead of 1 minute.
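The arithmetic behind the slowdown can be sketched as follows. This is an illustrative calculation only, not GitLab's code: `added_wait_seconds` is a hypothetical helper, and the sketch assumes the affected stages run one after another, so each stage's initial delay adds directly to the total import time.

```ruby
# Hypothetical helper for illustration; not part of GitLab.
# Assumes the stages run sequentially, so each stage's initial delay
# adds directly to the total import time.
def added_wait_seconds(stage_count, initial_delay_seconds)
  stage_count * initial_delay_seconds
end

stages_affected = 8 # number of impacted stages, as stated in this description

before = added_wait_seconds(stages_affected, 60) # 1-minute initial delay
after  = added_wait_seconds(stages_affected, 1)  # 1-second initial delay

puts "before: #{before}s (#{before / 60} min), after: #{after}s"
```

Under these assumptions, the fixed overhead drops from roughly 8 minutes to a few seconds per import.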
Fixes: #391230 (closed)
MR that introduced the method: !109264 (merged)
How to set up and validate locally
Because most jobs are spread in batches of 1000, the additional delay is applied only after 1000 records have been read from GitHub. To test locally, reduce the batch size to a smaller number, for example 10. This way, a delay of 1 minute is added for every 10 jobs enqueued.
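To make the effect of the batch size concrete, here is a minimal sketch of a per-job delay that grows by one minute for each full batch, on top of a 1-second initial delay. The formula is an assumption based on the description above, not a copy of GitLab's implementation, and `job_delay_seconds` and its parameters are hypothetical names.

```ruby
# Hypothetical sketch; the names and the formula are assumptions made for
# illustration and are not taken from GitLab's code base.
def job_delay_seconds(job_index, batch_size:, batch_delay: 60, initial_delay: 1)
  full_batches = job_index / batch_size # integer division: batches already filled
  initial_delay + full_batches * batch_delay
end

# With a batch size of 10, the delay grows by one minute every ten jobs.
[0, 9, 10, 25].each do |index|
  puts "job #{index}: #{job_delay_seconds(index, batch_size: 10)}s"
end

# With a batch size of 1000, the first 1000 jobs share the initial delay.
puts "job 999: #{job_delay_seconds(999, batch_size: 1000)}s"
puts "job 1000: #{job_delay_seconds(1000, batch_size: 1000)}s"
```

With a batch size of 1000, a small test repository never reaches the second batch, which is why lowering the batch size makes the spreading visible.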
- Enable GitHub Import in the settings (Admin -> Settings -> General -> Visibility and access controls -> Enable GitHub)
- Trigger an import via API or UI
curl --location --request POST 'http://gdk.test:3000/api/v4/import/github' \
  --header 'Authorization: Bearer <GITLAB ACCESS TOKEN>' \
  --header 'Content-Type: application/json' \
  --data-raw '{
    "personal_access_token": "<GITHUB ACCESS TOKEN>",
    "repo_id": "238972",
    "target_namespace": "root",
    "new_name": "rspec-core",
    "optional_stages": {
      "single_endpoint_issue_events_import": true,
      "single_endpoint_notes_import": true,
      "attachments_import": false
    }
  }'
- Check the delay added to the Sidekiq jobs using the Sidekiq dashboard
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.