Fix repo pushes messing with initial Elasticsearch indexing
What does this MR do?
If a push is received to a project before the initial Elasticsearch indexing begins, then ElasticCommitIndexerWorker will set the project's IndexStatus to the last commit in that new push. When ElasticBatchProjectIndexerWorker finally gets to that project, the project is skipped because the worker sees that it already has an IndexStatus set.
To fix this, we change ElasticBatchProjectIndexerWorker to only care about IndexStatus if UPDATE_INDEX has been set. This can result in some data being indexed twice, but that does not create duplicates and is preferable to the data not being indexed at all.
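The change in behaviour can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual worker code: `ProjectStub`, `plan_before_fix`, `plan_after_fix`, and the `:full_index` / `:skip` symbols are hypothetical names, and resuming from the recorded commit when UPDATE_INDEX is set is an assumption about how IndexStatus is consulted in that case.

```ruby
# Hypothetical, simplified model of the decision described above.
# None of these names come from the GitLab codebase.
ProjectStub = Struct.new(:name, :last_indexed_commit)

FULL_INDEX = :full_index # index the whole repository from scratch
SKIP       = :skip       # do not index this project at all

# Behaviour before the fix, as described: any existing IndexStatus
# causes the batch indexer to skip the project.
def plan_before_fix(project)
  project.last_indexed_commit.nil? ? FULL_INDEX : SKIP
end

# Behaviour after the fix, as described: IndexStatus is ignored unless
# UPDATE_INDEX is set. Resuming from the recorded commit in the update
# case is an assumption made for illustration only.
def plan_after_fix(project, update_index:)
  return FULL_INDEX unless update_index

  project.last_indexed_commit || FULL_INDEX
end

# A push arrived before initial indexing, so a commit is already recorded.
pushed_early = ProjectStub.new("pushed-early", "abc123")

puts plan_before_fix(pushed_early).inspect
puts plan_after_fix(pushed_early, update_index: false).inspect
puts plan_after_fix(pushed_early, update_index: true).inspect
```

In this model, the project that received an early push is skipped before the fix and fully indexed after it, at the cost of re-indexing the commits that the push had already covered.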
What are the relevant issue numbers?
- #8013 (closed) - Race condition while indexing new projects: It turns out this MR is not enough to fix that issue, though I'm currently having trouble reproducing it. The fix should go in another MR.
- #8628 (closed) - Repository pushes while Indexing on ElasticSearch omits data
Does this MR meet the acceptance criteria?
- Changelog entry added, if necessary
- Documentation created/updated via this MR
- Documentation reviewed by technical writer or follow-up review issue created
- Tests added for this feature/bug
- Tested in all supported browsers
- Conforms to the code review guidelines
- Conforms to the merge request performance guidelines
- Conforms to the style guides
- Conforms to the database guides
- Link to e2e tests MR added if this MR has Requires e2e tests label. See the Test Planning Process.
- EE specific content should be in the top level /ee folder
- For a paid feature, have we considered GitLab.com plans, how it works for groups, and is there a design for promoting it to users who aren't on the correct plan?
- Security reports checked/validated by reviewer
Closes #8628 (closed)
Edited by Coung Ngo