Remove leftover commit documents from the main index

What does this MR do and why?

This MR removes the commit documents that were left over in the main index. The removal is performed by migration 20230911205548.

How to set up and validate locally

  1. Open the rails console
bundle exec rails c
  2. Populate commits in the main index
# Index one test commit document for each of two projects.
[Project.last, Project.first].each do |project|
  # Use UTC so the timestamp matches the hard-coded +0000 offset.
  timestamp = Time.now.utc.strftime('%Y%m%dT%H%M%S+0000')
  ::Gitlab::Search::Client.new.index(index: 'gitlab-development', routing: "project_#{project.id}", refresh: true,
    body: { commit: { type: 'commit',
        author: { name: 'F L', email: 't@t.com', time: timestamp },
        committer: { name: 'F L', email: 't@t.com', time: timestamp },
        rid: project.id, message: 'test' },
      join_field: { name: 'commit', parent: "project_#{project.id}" },
      repository_access_level: project.repository_access_level, type: 'commit',
      visibility_level: project.visibility_level })
end
  3. Confirm that the main index contains at least one commit document by running the following curl command in bash
curl -XGET "http://localhost:9200/gitlab-development/_count" -H "kbn-xsrf: reporting" -H "Content-Type: application/json" -d'
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "type": "commit" } }
      ]
    }
  }
}'

The count in the response should be greater than 0.

  4. Run the migration from the rails console
Elastic::DataMigrationService[20230911205548].send(:migration).migrate
  5. Run the curl command again and confirm that the count is now 0
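
The count check used before and after the migration can also be scripted in Ruby instead of curl. The following is a minimal sketch, not part of this MR: the helper names (`commit_count_query`, `commit_count_request`, `commit_count`) are hypothetical, and the host and index name are taken from the curl command above. It assumes the `_count` endpoint accepts a POST with the query in the request body.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Same term filter as the curl command: match documents whose type is 'commit'.
def commit_count_query
  { query: { bool: { filter: [{ term: { type: 'commit' } }] } } }
end

# Build (but do not send) the request for the _count endpoint.
# base_url and index default to the values used in the curl example.
def commit_count_request(base_url: 'http://localhost:9200', index: 'gitlab-development')
  uri = URI.join(base_url, "/#{index}/_count")
  request = Net::HTTP::Post.new(uri)
  request['Content-Type'] = 'application/json'
  request.body = JSON.generate(commit_count_query)
  [uri, request]
end

# Send the request and return the integer count from the JSON response.
# Requires a reachable Elasticsearch instance.
def commit_count(**options)
  uri, request = commit_count_request(**options)
  response = Net::HTTP.start(uri.hostname, uri.port) { |http| http.request(request) }
  JSON.parse(response.body).fetch('count')
end

# Usage (with Elasticsearch running locally):
#   commit_count   # expected to be greater than 0 before the migration and 0 after it
```

Building the request separately from sending it keeps the query easy to inspect without a running cluster.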

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Runtime

Approximately 141 minutes, estimated from the number of documents, the batch size, and the throttle delay:

[1] pry(main)> batch_size = 2_000
=> 2000
[2] pry(main)> throttle_delay = 3.minute
=> 3 minutes
[3] pry(main)> number_of_documents = 94076
=> 94076
[4] pry(main)> (number_of_documents / batch_size) * throttle_delay
=> 141 minutes
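
The same estimate can be reproduced outside the rails console. This is a sketch in plain Ruby, without ActiveSupport durations. The method name `estimated_runtime_minutes` and the `round_up` option are hypothetical additions for illustration. Like the calculation above, it assumes one throttle delay per batch. Because integer division drops the final partial batch, the `round_up` variant counts that batch as well and gives a slightly higher upper bound.

```ruby
# Estimate the runtime in minutes of a throttled batch job,
# assuming one throttle delay per processed batch.
def estimated_runtime_minutes(number_of_documents:, batch_size:, throttle_delay_minutes:, round_up: false)
  batches = if round_up
              # Ceiling division: count the final partial batch too.
              (number_of_documents + batch_size - 1) / batch_size
            else
              # Integer division, as in the console session above.
              number_of_documents / batch_size
            end
  batches * throttle_delay_minutes
end

inputs = { number_of_documents: 94_076, batch_size: 2_000, throttle_delay_minutes: 3 }

puts estimated_runtime_minutes(**inputs)                 # prints 141
puts estimated_runtime_minutes(**inputs, round_up: true) # prints 144
```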

Related to #419781 (closed)

Edited by Ravi Kumar
