Fix junk data in issues index
Summary
There are multiple data problems in the issues index:
- Documents with _id like issue_140042675, causing duplicate documents
- Documents with schema_version 2312 when the current version is 2405
- Documents with type work_item
- Disparity between number of unique IDs and total number of docs
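
To make these problems measurable, the checks could be expressed as Elasticsearch search bodies. The following is a minimal sketch, not the code used to find the issues: the module and method names are hypothetical, and it assumes each document carries a numeric `id` field alongside the `type` and `schema_version` fields mentioned above.

```ruby
require 'json'

# Hypothetical helper that only builds search request bodies; it does not
# talk to Elasticsearch. Field names `type` and `schema_version` come from
# the issue text, while `id` is an assumption about the document mapping.
module IssuesIndexChecks
  CURRENT_SCHEMA_VERSION = 2405

  # Counts documents that should not be in the issues index at all.
  def self.work_item_docs
    { size: 0, track_total_hits: true, query: { term: { type: 'work_item' } } }
  end

  # Counts documents written with a schema version older than the current one.
  def self.stale_schema_docs(current = CURRENT_SCHEMA_VERSION)
    { size: 0, track_total_hits: true,
      query: { range: { schema_version: { lt: current } } } }
  end

  # Returns the total hit count plus an approximate count of distinct ids,
  # so the two numbers can be compared.
  def self.unique_id_disparity
    { size: 0, track_total_hits: true,
      aggs: { unique_ids: { cardinality: { field: 'id' } } } }
  end
end

puts JSON.pretty_generate(IssuesIndexChecks.stale_schema_docs)
```

Note that the cardinality aggregation is approximate, so a small difference between the total and the distinct count is not conclusive on its own.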
Attempted fixes:
- Added checks in ee/lib/search/elastic/references/work_item.rb and ee/app/models/concerns/search/elastic/issues_search.rb to stop indexing work items
- ReindexAllIssues to add all documents from the database.
  - First run 2024-01-02T21:06:31
  - Second run 2024-01-05T07:26:08
- RemoveIssueDocumentsBasedOnSchemaVersion to remove docs with schema version < 2312
- RemoveWorkItemFromIssuesIndex to remove all documents with type=workitem and schema version < 2312
Suggested fix:
- Migration to remove all docs with type=work_item from the index
- Bump schema version to 2407
- Migration to reindex all issues from the database - every Issue.without_issue_type(:epic) record is tracked and will have schema version 2407
- Migration to remove docs with schema version < 2407 from index
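
The reason this ordering should leave exactly one document per issue can be illustrated with a small in-memory simulation. This is only a sketch: the index is a plain array of hashes, the `_id` values and the `doc-<id>` naming are made up, and it assumes that reindexing overwrites a document that has the same `_id`. None of the method names correspond to real migrations.

```ruby
# Step 1: remove all docs with type=work_item.
def remove_work_items(index)
  index.reject { |doc| doc[:type] == 'work_item' }
end

# Step 3: reindex every issue from the database at the new schema version.
# Assumption: a reindexed doc replaces any existing doc with the same _id.
def reindex_issues(index, issue_ids, version)
  fresh = issue_ids.map do |id|
    { _id: "doc-#{id}", id: id, type: 'issue', schema_version: version }
  end
  fresh_ids = fresh.map { |doc| doc[:_id] }
  index.reject { |doc| fresh_ids.include?(doc[:_id]) } + fresh
end

# Step 4: remove docs whose schema version is older than the new one.
def remove_stale(index, version)
  index.reject { |doc| doc[:schema_version] < version }
end

# Runs the steps in the suggested order (step 2 is the version bump itself).
def apply_suggested_fix(index, issue_ids, version = 2407)
  remove_stale(reindex_issues(remove_work_items(index), issue_ids, version), version)
end

# Made-up sample data: a junk duplicate, a work item, and two real issues.
index = [
  { _id: 'legacy-1', id: 1, type: 'issue',     schema_version: 2312 },
  { _id: 'doc-1',    id: 1, type: 'issue',     schema_version: 2405 },
  { _id: 'doc-2',    id: 2, type: 'work_item', schema_version: 2405 },
  { _id: 'doc-3',    id: 3, type: 'issue',     schema_version: 2405 }
]

result = apply_suggested_fix(index, [1, 3])
puts result.map { |doc| doc[:_id] }.inspect # prints ["doc-1", "doc-3"]
```

Because every tracked issue is rewritten at version 2407 before the final cleanup, anything still below 2407 afterwards is by construction a document that the reindex did not produce, which is what makes the last removal safe in this model.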
Edited by Madelein van Niekerk