Import more resources using events in GitHub Import
What does this MR do and why?
This change updates GitHub Import to import merged by, review requests, reviews, and comments using GitHub's events API in the IssueEvents stage, making the ImportPullRequestsMergedByWorker, Stage::ImportPullRequestsReviewRequestsWorker, Stage::ImportPullRequestsReviewsWorker, and ImportNotesWorker obsolete.
In order to re-use the code, the IssueEventImporter class was updated to call importer classes from the obsolete stages.
All these changes are behind the github_import_extended_events feature flag.
Related to: #433536 (closed)
Notes
Feature flag
When the feature flag is turned on, the import setting called "extended_events" is also enabled. This setting is used to decide which stages should be executed during the import process. The purpose of this approach is to ensure that enabling or disabling the feature flag doesn't affect the ongoing migrations.
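The idea of copying the flag into a per-import setting can be sketched as below. This is a minimal illustration and not the code in this MR: `FEATURE_FLAGS`, `build_import_settings`, `stages_for`, and both stage lists are hypothetical names chosen for the example.

```ruby
# Hypothetical stand-in for the feature flag store.
FEATURE_FLAGS = { github_import_extended_events: false }

# Illustrative stage lists only; the real stage names and order may differ.
LEGACY_STAGES = %i[pull_requests merged_by review_requests reviews issue_events notes].freeze
EXTENDED_STAGES = %i[pull_requests issue_events].freeze

# Called once when an import starts: the flag value is copied into the
# import's own settings, so later flag changes do not reach this import.
def build_import_settings
  { extended_events: FEATURE_FLAGS.fetch(:github_import_extended_events, false) }
end

# Stage selection reads only the stored setting, never the live flag.
def stages_for(settings)
  settings[:extended_events] ? EXTENDED_STAGES : LEGACY_STAGES
end
```

Because the stage list is derived from the stored setting, an import that started with the flag enabled keeps running the extended stages even if the flag is disabled halfway through.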
Reviewers
Importing reviewers using GitHub's timeline events isn't as straightforward as using GitHub's pull request API.
Unlike the pull request API, timeline events do not provide the list of current reviewers for a pull request. Instead, they return a sequence of events recording when a reviewer was added or removed. In principle, adding and removing reviewers in the order the events occurred would produce the correct list of reviewers. The problem is that GitHub Import enqueues one worker for each event to be imported; therefore, events aren't imported in order.
To address this issue, we maintain a list of all the review_requested and review_request_removed events associated with a pull request. Subsequently, a separate process compiles these events and identifies the pull request reviewers.
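One way to compile the stored events is to sort them by timestamp and replay them. The sketch below is an assumption about how such a step could work, not the implementation in this MR: `ReviewEvent` and `compile_reviewers` are hypothetical names, and it assumes each stored event keeps its GitHub `created_at` timestamp.

```ruby
require 'set'
require 'time'

# Hypothetical record for a stored review request event.
ReviewEvent = Struct.new(:event, :reviewer, :created_at, keyword_init: true)

# Replays the events in chronological order and returns the reviewers that
# remain requested. Input order does not matter; events with identical
# timestamps keep their relative input order.
def compile_reviewers(events)
  reviewers = Set.new

  ordered = events.each_with_index.sort_by do |event, index|
    [Time.iso8601(event.created_at), index]
  end

  ordered.each do |event, _index|
    case event.event
    when 'review_requested' then reviewers << event.reviewer
    when 'review_request_removed' then reviewers.delete(event.reviewer)
    end
  end

  reviewers.to_a.sort
end
```

Since the result depends only on the timestamps, the workers can store events in whatever order they happen to run.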
Import options
With this change, the import setting "Import issue and pull request events" is redundant and is removed from the UI.
Import stats
GitHub Import stats for merged_by, notes, pull_request_review_request, and pull_request_review will no longer exist, as these resources will be included in the issue_events stats.
Screenshots or screen recordings
In the UI, one import option is removed when the feature flag is enabled
How to set up and validate locally
- Enable the github_import_extended_events feature flag
- Use the script below to create users from a GitHub repository and cache them on Redis. This way, most of the users should be mapped when using a public repository
Script to create users
Use the script below to create GitHub users in your local environment and cache them on Redis
access_token = 'GITHUB_ACCESS_TOKEN'
repo = 'rspec/rspec-core' # GitHub repository in owner/name form

# Logins that have already been handled, so each user is processed once
@processed = Set.new

def read_issues(issue)
  user = issue.to_hash[:user]
  return if @processed.include?(user[:login])

  gitlab_user = User.find_by_username(user[:login])

  unless gitlab_user
    gitlab_user = User.create(
      name: user[:login],
      username: user[:login],
      email: "#{user[:login]}@github.com",
      password: '5iveL!fe',
      state: 'deactivated',
      confirmed_at: Time.now
    )
  end

  # Return if the user failed to be created (User.create returns an unsaved record on failure)
  return unless gitlab_user.persisted?

  # Cache the GitHub ID and email so the importer can map this user
  key = Gitlab::GithubImport::UserFinder::ID_CACHE_KEY % user[:id]
  Gitlab::Cache::Import::Caching.write(key, gitlab_user.id)

  key = Gitlab::GithubImport::UserFinder::EMAIL_FOR_USERNAME_CACHE_KEY % user[:login]
  Gitlab::Cache::Import::Caching.write(key, gitlab_user.email)

  # Record the login so the same user is skipped next time
  @processed << user[:login]
end

client = Octokit::Client.new(access_token: access_token)

# First page of issues
issues = client.issues(repo, state: 'all', per_page: 100)
issues.each do |issue|
  read_issues(issue)
end

# Follow the pagination links until there are no more pages
next_url = client.last_response.rels[:next]

while next_url
  puts next_url.href
  response = next_url.get
  issues = response.data

  issues.each do |issue|
    read_issues(issue)
  end

  next_url = response.rels[:next]
end
- Use the command below to trigger a migration
curl --location 'http://gdk.test:3000/api/v4/import/github' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer GDK_ACCESS_TOKEN' \
  --data '{
    "personal_access_token": "GITHUB_ACCESS_TOKEN",
    "repo_id": "238972",
    "target_namespace": "root",
    "new_name": "rspec-core",
    "optional_stages": {
      "attachments_import": false,
      "collaborators_import": false
    }
  }'
- Check if everything was migrated as before
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.