Remove leftover commit documents from the main index
What does this MR do and why?
This MR removes the leftover commit documents from the main index.
How to set up and validate locally
- Open the Rails console
bundle exec rails c
- Populate commits in the main index
project = Project.last
::Gitlab::Search::Client.new.index(index: 'gitlab-development', routing: "project_#{project.id}", refresh: true,
body: { commit: { type: 'commit',
author: { name: 'F L', email: 't@t.com', time: Time.now.strftime('%Y%m%dT%H%M%S+0000') },
committer: { name: 'F L', email: 't@t.com', time: Time.now.strftime('%Y%m%dT%H%M%S+0000') },
rid: project.id, message: 'test' },
join_field: { name: 'commit', parent: "project_#{project.id}" },
repository_access_level: project.repository_access_level, type: 'commit',
visibility_level: project.visibility_level })
project = Project.first
::Gitlab::Search::Client.new.index(index: 'gitlab-development', routing: "project_#{project.id}", refresh: true,
body: { commit: { type: 'commit',
author: { name: 'F L', email: 't@t.com', time: Time.now.strftime('%Y%m%dT%H%M%S+0000') },
committer: { name: 'F L', email: 't@t.com', time: Time.now.strftime('%Y%m%dT%H%M%S+0000') },
rid: project.id, message: 'test' },
join_field: { name: 'commit', parent: "project_#{project.id}" },
repository_access_level: project.repository_access_level, type: 'commit',
visibility_level: project.visibility_level })
- Ensure there is at least one commit in the main index by running the following curl command in bash
curl -XGET "http://localhost:9200/gitlab-development/_count" -H "kbn-xsrf: reporting" -H "Content-Type: application/json" -d'
{
"query": {
"bool": {
"filter": [
{ "term": { "type": "commit" } }
]
}
}
}'
The count should be greater than 0.
- Now run the following command in the Rails console
Elastic::DataMigrationService[20230911205548].send(:migration).migrate
- Run the curl command again and ensure the count is 0
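The count check in the steps above can also be done from Ruby instead of curl. This is a minimal sketch, not part of the MR: `commit_count_query` and `commit_count` are hypothetical helper names, and the URL and index name are assumed to match the ones in the curl command. It sends the same filter with POST, which the `_count` endpoint accepts as well as GET.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Hypothetical helper: builds the same filter the curl command sends,
# matching every document whose type is 'commit'.
def commit_count_query
  { query: { bool: { filter: [{ term: { type: 'commit' } }] } } }
end

# Hypothetical helper: posts the query to the _count endpoint and returns
# the number of matching documents. The defaults are assumptions taken
# from the curl command in the validation steps.
def commit_count(base_url: 'http://localhost:9200', index: 'gitlab-development')
  uri = URI("#{base_url}/#{index}/_count")
  response = Net::HTTP.post(uri, JSON.generate(commit_count_query),
                            'Content-Type' => 'application/json')
  JSON.parse(response.body).fetch('count')
end
```

Calling `commit_count` before the migration should return a value greater than 0, and calling it after the migration should return 0.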
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.
Runtime
~ 141 minutes
[1] pry(main)> batch_size = 2_000
=> 2000
[2] pry(main)> throttle_delay = 3.minute
=> 3 minutes
[3] pry(main)> number_of_documents = 94076
=> 94076
[4] pry(main)> (number_of_documents / batch_size) * throttle_delay
=> 141 minutes
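The estimate above can be reproduced outside a Rails console. This is a minimal sketch in plain Ruby using the values from the session. `estimated_runtime_minutes` is a hypothetical helper name, and the delay is a plain integer number of minutes because ActiveSupport durations are not assumed to be loaded. Integer division counts 47 full batches. The `round_up` option is an added assumption that also counts the final partial batch.

```ruby
# Hypothetical helper: estimates the migration runtime in minutes as
# (number of batches) * (throttle delay per batch).
def estimated_runtime_minutes(number_of_documents:, batch_size:, throttle_delay_minutes:, round_up: false)
  batches = if round_up
              # Count the final partial batch as a full one.
              (number_of_documents.to_f / batch_size).ceil
            else
              # Integer division, as in the console session.
              number_of_documents / batch_size
            end
  batches * throttle_delay_minutes
end

estimated_runtime_minutes(number_of_documents: 94_076, batch_size: 2_000, throttle_delay_minutes: 3)
# => 141
```

With `round_up: true` the same inputs give 144 minutes, so the figure in the session is best read as an approximation.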
Related to #419781 (closed)
Edited by Ravi Kumar