Create multiple indices on big namespace rollout
Problem to solve
We log an error when a namespace cannot be indexed on a single node. This is blocking us from rolling out big namespaces on gitlab.com.
Proposal
When rolling out a big namespace that cannot be indexed on a single node, we should use the Zoekt sharding strategy with replicas to create multiple indices.
Add a new jsonb column to the zoekt_indices table. It should have the integer attributes project_id_from and project_id_to. project_id_from represents the first project ID of a namespace to which the zoekt index is assigned, and project_id_to represents the last project ID of a namespace to which the zoekt index is assigned. Currently, we just log an error when a namespace cannot be indexed on a single node. With this issue, we should instead iterate over each project of a big namespace and assign it to a new zoekt_index until that index is filled, then repeat this process until all projects are assigned to an index. For now, set a maximum of 5 indices per namespace. If a namespace cannot be indexed within 5 zoekt_indices, we will skip indexing it.
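The assignment loop described above can be sketched roughly as follows. This is an illustration only, not the actual implementation: the names ZoektIndex, assign_projects, capacity_bytes, and used_bytes are hypothetical, and the sketch assumes that "filled" means the summed project sizes would exceed a per-node byte capacity, which the proposal does not specify. Projects are walked in ascending ID order so that each index covers a contiguous project_id_from..project_id_to range.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

MAX_INDICES_PER_NAMESPACE = 5  # limit stated in the proposal


@dataclass
class ZoektIndex:
    """Hypothetical stand-in for a zoekt_indices row."""
    capacity_bytes: int  # assumption: space available on the node for this index
    used_bytes: int = 0
    # These two mirror the attributes of the proposed jsonb column.
    project_id_from: Optional[int] = None
    project_id_to: Optional[int] = None


def assign_projects(
    projects: List[Tuple[int, int]],  # (project_id, size_bytes) pairs
    node_capacities: List[int],       # assumed capacity of each candidate node
) -> Optional[List[ZoektIndex]]:
    """Return the indices for a namespace, or None if indexing should be skipped."""
    indices: List[ZoektIndex] = []
    current: Optional[ZoektIndex] = None
    capacities = iter(node_capacities)

    for project_id, size in sorted(projects):
        if current is None or current.used_bytes + size > current.capacity_bytes:
            # The current index is full (or none exists yet): open a new one.
            if len(indices) >= MAX_INDICES_PER_NAMESPACE:
                return None  # would need a sixth index: skip the namespace
            capacity = next(capacities, None)
            if capacity is None or size > capacity:
                return None  # no node left, or the project alone does not fit
            current = ZoektIndex(capacity_bytes=capacity, project_id_from=project_id)
            indices.append(current)
        current.used_bytes += size
        current.project_id_to = project_id

    return indices
```

For example, four projects of 40 bytes each on nodes with a capacity of 100 bytes end up in two indices covering project IDs 1 to 2 and 3 to 4, while a namespace that would need a sixth index returns None and is skipped.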