Geo: ElasticCommitIndexerWorker should not run on secondaries
Problem
/var/log/gitlab/gitlab-rails/elasticsearch.log:{"severity":"DEBUG","time":"2023-09-29T15:44:17.656Z","correlation_id":"01HBGSSBMBC6NED2CPVEK6X5QX","meta.caller_id":"ElasticCommitIndexerWorker","meta.remote_ip":"140.211.10.9","meta.feature_category":"global_search","meta.user":"root-admin","meta.user_id":1,"meta.client_id":"user/1","meta.root_caller_id":"graphql:replicableTypeUpdate","class":"Gitlab::Elastic::Indexer","message":"indexing_commit_range","group_id":2,"project_id":342,"from_sha":"dcef6f63e0bc123fa2f77000051ff87a63b1ada6","to_sha":"dcef6f63e0bc123fa2f77000051ff87a63b1ada6","index_wiki":false}
/var/log/gitlab/gitlab-rails/elasticsearch.log:{"severity":"INFO","time":"2023-09-29T15:44:17.706Z","correlation_id":"01HBGSSBMBC6NED2CPVEK6X5QX","meta.caller_id":"ElasticCommitIndexerWorker","meta.remote_ip":"140.211.10.9","meta.feature_category":"global_search","meta.user":"root-admin","meta.user_id":1,"meta.client_id":"user/1","meta.root_caller_id":"graphql:replicableTypeUpdate","class":"Gitlab::Elastic::Indexer","message":"time=\"2023-09-29T15:44:17Z\" level=info msg=\"Setting timeout\" timeout=30m0s\n","status":0,"group_id":2,"project_id":342,"from_sha":"dcef6f63e0bc123fa2f77000051ff87a63b1ada6","to_sha":"dcef6f63e0bc123fa2f77000051ff87a63b1ada6","index_wiki":false}
/var/log/gitlab/gitlab-rails/graphql_json.log:{"severity":"INFO","time":"2023-09-29T15:44:16.571Z","correlation_id":"01HBGSSBMBC6NED2CPVEK6X5QX","meta.caller_id":"graphql:replicableTypeUpdate","meta.remote_ip":"140.211.10.9","meta.feature_category":"not_owned","meta.user":"root-admin","meta.user_id":1,"meta.client_id":"user/1","trace_type":"execute_query","query_fingerprint":"replicableTypeUpdate/br-ZSLGYJfpLcFheudRCH8OmLr6O8DsZ5XAU33a9h34=/3/99Bvn2dYCOGl0y7SKrc-0ttTWztjUQ9TcO7Vj6JvfEg=","duration_s":0.023422811180353165,"operation_name":"replicableTypeUpdate","operation_fingerprint":"replicableTypeUpdate/br-ZSLGYJfpLcFheudRCH8OmLr6O8DsZ5XAU33a9h34=","is_mutation":true,"variables":"{\"action\"=\u003e\"RESYNC\", \"registryId\"=\u003e\"gid://gitlab/Geo::ProjectRepositoryRegistry/97199\", \"registryClass\"=\u003e\"PROJECT_REPOSITORY_REGISTRY\"}","query_string":"mutation replicableTypeUpdate($action: GeoRegistryAction!, $registryId: GeoBaseRegistryID!, $registryClass: GeoRegistryClass!) {\n geoRegistriesUpdate(\n input: {action: $action, registryId: $registryId, registryClass: $registryClass}\n ) {\n errors\n __typename\n }\n}\n","query_analysis.duration_s":0.0009616799652576447,"query_analysis.depth":2,"query_analysis.complexity":3,"query_analysis.used_fields":["GeoRegistriesUpdatePayload.errors","GeoRegistriesUpdatePayload.__typename","Mutation.geoRegistriesUpdate"],"query_analysis.used_deprecated_fields":["Mutation.geoRegistriesUpdate"]}
Possible fix
As described in #426778 (comment 1586913933):
There is no explicit call from the mutation. When the method update_root_ref is executed as part of the Geo::FrameworkRepositorySyncService class while fetching the repository, it is possible to reach a condition where the after_repository_change_head method gets executed and the Repositories::DefaultBranchChangedEvent event is published (this was a recent addition). This event points to the Search::ElasticDefaultBranchChangedWorker, which is how we reach this condition.
We must add a condition to check if we are in a secondary site before triggering the elastic worker. However, I don't see a correlation between this exception raised in the background and the synchronization failure.
Implementation Guide
- Search for ElasticCommitIndexerWorker.perform_ in the gitlab codebase. I found 5 calls as of 11 Oct 2023.
- Since these calls are all in ee/ directory already, we can add Geo-related code without having to create new EE modules
- Prevent these calls when Gitlab::Geo.secondary?
- Add tests
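The steps above could be sketched roughly as follows. This is a minimal, standalone illustration and not the actual GitLab change: the Gitlab::Geo and ElasticCommitIndexerWorker definitions here are stand-ins so the snippet runs on its own, secondary_site= is a hypothetical toggle for the demo, and enqueue_commit_indexing is a hypothetical helper representing any one of the call sites. It also assumes the call site uses perform_async; the real call sites may use other perform_ variants.

```ruby
# Stand-ins so the sketch runs outside GitLab. The real Gitlab::Geo and
# ElasticCommitIndexerWorker live in the GitLab codebase.
module Gitlab
  module Geo
    class << self
      # Hypothetical toggle for this demo only, not a real GitLab API.
      attr_accessor :secondary_site

      def secondary?
        !!secondary_site
      end
    end
  end
end

class ElasticCommitIndexerWorker
  # Records enqueued jobs in memory instead of pushing them to Sidekiq.
  def self.jobs
    @jobs ||= []
  end

  def self.perform_async(*args)
    jobs << args
  end
end

# Hypothetical helper: one possible shape for the guard at each call site.
def enqueue_commit_indexing(project_id)
  # Skip indexing entirely when running on a Geo secondary site.
  return if Gitlab::Geo.secondary?

  ElasticCommitIndexerWorker.perform_async(project_id)
end

# On a primary (or non-Geo) site the job is enqueued.
Gitlab::Geo.secondary_site = false
enqueue_commit_indexing(342)

# On a secondary site the call is skipped.
Gitlab::Geo.secondary_site = true
enqueue_commit_indexing(343)
```

In the real codebase the tests would more likely stub Gitlab::Geo.secondary? and assert that the worker is not enqueued, rather than use an in-memory job list like this one.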
Edited by Michael Kozono