Separate worker notes on bitbucket server
-
Please check this box if this contribution uses AI-generated content as outlined in the GitLab DCO & CLA
What does this MR do and why?
Separate worker for pull request notes on Bitbucket Server.
Previous (existing implementation)
Currently, Importers::NotesImporter
enqueues per MR
, and then ImportPullRequestNotesWorker (Importers::PullRequestNotesImporter)
makes HTTP calls (using a loop). This means there might be concurrent HTTP calls to Bitbucket Server from multiple MR.
New changes (this MR)
Move HTTP Calls
This MR changes the behavior. So, Importers::NotesImporter
still loops the MR and makes HTTP calls to retrieve activities. The enqueue to ImportPullRequestNotesWorker (Importers::PullRequestNotesImporter)
is on the activity
level.
This means we can easily control HTTP calls in Importers::NotesImporter
.
Separate Importers Logic
This MR also separates Importers::PullRequestNotesImporter
logic into several classes (still using the same worker ImportPullRequestNotesWorker
):
Importers::PullRequestNotes::ApprovedEvent
Importers::PullRequestNotes::Inline
Importers::PullRequestNotes::MergeEvent
Importers::PullRequestNotes::StandalonePr
Add Paging on HTTP Calls
This MR adds Gitlab::Import::PageCounter
so that HTTP calls on client.activities
can resume from the last page upon interruption.
Notes
- This MR intention is more towards reducing concurrent HTTP calls to the Bitbucket Server. In terms of speed:
- If the Bitbucket Server can handle "unlimited" requests, this MR will slow down the importing process. Reducing concurrent HTTP calls means less processing on the GitLab side, although we separate the processing into per
activity
level (network latency on remote server > GitLab DB latency). - If the Bitbucket Server has limited capacity, this MR should help lessen the burden on the Bitbucket Server.
- If the Bitbucket Server can handle "unlimited" requests, this MR will slow down the importing process. Reducing concurrent HTTP calls means less processing on the GitLab side, although we separate the processing into per
-
ObjectImporter
inbitbucket_server_import
has not handled conversion to the "representation" object yet. So, the 4 new classes created above were copy-pasting from the previous implementationImporters::PullRequestNotesImporter
but changed the "object style" into "hash style". - Haven't handled the Bitbucket (non Server) yet.
- With these changes, there is a risk that a note might reach the sidekiq byte size limit. Similar with this issue.
MR acceptance checklist
Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Screenshots or screen recordings
Screenshots are required for UI changes, and strongly recommended for all other merge requests.
Before | After |
---|---|
MR list: | |
Approve event: | |
Standalone PR: | |
Inline: | |
Calls the new class: | |
Resume from the last page upon interruption: | |
How to set up and validate locally
Numbered steps to set up and validate the change are strongly suggested.
- Initial environment setup can follow this
- Prepare some data in Bitbucket Server
- Patch the code to inject interruption:
diff --git a/gems/gitlab-http/lib/gitlab/http_v2/url_blocker.rb b/gems/gitlab-http/lib/gitlab/http_v2/url_blocker.rb
index 99876c779539..06bd0c6b92cd 100644
--- a/gems/gitlab-http/lib/gitlab/http_v2/url_blocker.rb
+++ b/gems/gitlab-http/lib/gitlab/http_v2/url_blocker.rb
@@ -64,6 +64,9 @@ def validate_url_with_proxy!(
)
# rubocop:enable Metrics/ParameterLists
+ allow_localhost = true
+ allow_local_network = true
+
return Result.new(nil, nil, true) if url.nil?
raise ArgumentError, 'The schemes is a required argument' if schemes.blank?
diff --git a/lib/gitlab/bitbucket_server_import/importers/notes_importer.rb b/lib/gitlab/bitbucket_server_import/importers/notes_importer.rb
index e7c1c91ae970..744a976975ed 100644
--- a/lib/gitlab/bitbucket_server_import/importers/notes_importer.rb
+++ b/lib/gitlab/bitbucket_server_import/importers/notes_importer.rb
@@ -22,6 +22,15 @@ def execute
"page #{page} using batch-size #{concurrent_import_jobs_limit}"
)
+ if merge_request.iid == 2
+ Gitlab::Redis::SharedState.with do |redis|
+ temp_key = 'test-bitbucket-pr'
+ temp_counter = redis.incr(temp_key)
+ redis.expire(temp_key, 5.minutes)
+ raise "purposely interrupt" if temp_counter == 4
+ end
+ end
+
activities = client.activities(
project_key, repository_slug, merge_request.iid,
page_offset: page, limit: concurrent_import_jobs_limit
@@ -131,6 +140,10 @@ def parent_collection
def page_counter_id(merge_request)
"merge_request/#{merge_request.id}/#{collection_method}"
end
+
+ def concurrent_import_jobs_limit
+ 2
+ end
end
end
end
- Tail the log file:
tail -f log/importer.log
- Go to
http://127.0.0.1:3000/import/bitbucket_server/status
- Click Import