Cache database_id before doing other work
What does this MR do and why?
This MR changes the GitHub importer to cache the id
of an imported issue or MR as soon as it is created.
There are ~2k examples in our production logs that can result in us not caching an ID for a GitHub PR.
The stack trace of those errors show the exception happened here which is the line above where we currently write the id
of the imported MR to the cache.
This later leads to data not being imported. For example in NoteImporter
, not finding the id
here results in us not importing the note. Many objects related to the issue or MR also similarly will fail.
If we write the ID as soon as we have it, it makes these importers more robust, as we can import objects for the affected issues and MRs, like notes.
Screenshots
Following the QA notes below results in (note the note counts on the right side):
master |
This branch |
---|---|
One of the imported PRs showing on master
only notes related to approvals were imported, but on this branch all notes were imported.
master |
This branch |
---|---|
How to set up and validate locally
- Apply the following patch to have an error occur during the import of all issues and PRs.
diff --git a/lib/gitlab/github_import/importer/issue_importer.rb b/lib/gitlab/github_import/importer/issue_importer.rb
index 44e9cd1bd8f1..074e06bd7987 100644
--- a/lib/gitlab/github_import/importer/issue_importer.rb
+++ b/lib/gitlab/github_import/importer/issue_importer.rb
@@ -73,6 +73,8 @@ def create_issue
#
# issue_id - The ID of the created issue.
def create_assignees(issue_id)
+ raise
+
assignees = []
issue.assignees.each do |assignee|
diff --git a/lib/gitlab/github_import/importer/pull_request_importer.rb b/lib/gitlab/github_import/importer/pull_request_importer.rb
index acdafef670ca..dd8dc6be88c6 100644
--- a/lib/gitlab/github_import/importer/pull_request_importer.rb
+++ b/lib/gitlab/github_import/importer/pull_request_importer.rb
@@ -72,6 +72,8 @@ def set_merge_request_assignees(merge_request)
end
def insert_git_data(merge_request, already_exists)
+ raise
+
insert_or_replace_git_data(merge_request, pull_request.source_branch_sha, pull_request.target_branch_sha, already_exists)
# We need to create the branch after the merge request is
# populated to ensure the merge request is in the right state
- Import a small GitHub project to your local GDK. For example, to import https://github.com/ku1ik/rainbow use the below
curl
.
curl --request POST \
--url "http://gdk.test:3000/api/v4/import/github" \
--header "content-type: application/json" \
--header "PRIVATE-TOKEN: <GITLAB PAT>" \
--data '{
"personal_access_token": "<GITHUB PAT>",
"repo_id": "69648",
"target_namespace": "<TARGET LOCAL NAMESPACE>",
"new_name": "ld-cache_database_id-asap-test"
}'
- On this branch all comments from the PR and issues will be imported successfully. On
master
, with a similar patch applied as above, there will only be comments that are related to approvals imported (see #416486 (comment 1588888745) for an explanation of why only these kinds of notes appear).
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
-
I have evaluated the MR acceptance checklist for this MR.