Add circuit breaker for zoekt nodes
What does this MR do and why?
Whenever an operation against a zoekt node fails, we trigger a backoff which will exclude that node from searches until the backoff expires. This implements an exponential backoff strategy. When all zoekt nodes are in a backoff state, the circuit breaker is tripped and zoekt integration is disabled completely until at least one node's backoff period expires.
As long as at least one node is operational, the circuit breaker is not tripped and zoekt searches are still performed using the remaining nodes.
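The behavior described above can be illustrated with a minimal, self-contained sketch. This is not the code from this MR: the class names `NodeBackoff` and `CircuitBreaker`, the constants, the `reset!` method, and the injectable `clock` are all hypothetical. The doubling delay is an assumption, chosen because it is consistent with the "roughly 16 seconds after four failures" figure in the validation steps.

```ruby
# Hypothetical sketch of per-node exponential backoff plus a circuit breaker.
# None of these names are taken from the GitLab codebase.
class NodeBackoff
  BASE_SECONDS = 2    # assumed base: the delay doubles with each failure
  MAX_SECONDS  = 3600 # assumed upper bound on a single backoff window

  attr_reader :failures

  # `clock` is injectable so the sketch can be exercised without sleeping.
  def initialize(clock: -> { Time.now })
    @clock = clock
    @failures = 0
    @expires_at = nil
  end

  # Record a failure and start a new, exponentially longer backoff window.
  def backoff!
    @failures += 1
    delay = [BASE_SECONDS**@failures, MAX_SECONDS].min
    @expires_at = @clock.call + delay
  end

  # Whole seconds until the node may be used again (0 when not backing off).
  def seconds_remaining
    return 0 if @expires_at.nil?

    [(@expires_at - @clock.call).ceil, 0].max
  end

  def enabled?
    seconds_remaining.positive?
  end

  # Clear the failure history, e.g. after a successful operation.
  def reset!
    @failures = 0
    @expires_at = nil
  end
end

class CircuitBreaker
  def initialize(backoffs)
    @backoffs = backoffs
  end

  # Backoffs of the nodes that may still serve searches.
  def available
    @backoffs.reject(&:enabled?)
  end

  # Tripped only when every node is backing off (or no nodes exist).
  def tripped?
    available.empty?
  end
end
```

With two nodes, four consecutive failures on the first leave the breaker closed because the second node is still available. The breaker trips only once the second node also fails, and it closes again as soon as either backoff window expires.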
Related to #393445 (closed)
How to set up and validate locally
- Configure local zoekt environment
::Feature.enable(:index_code_with_zoekt)
::Feature.enable(:search_code_with_zoekt)
::Feature.enable(:zoekt_node_backoffs)
zoekt_node = ::Search::Zoekt::Node.find_or_create_by!(index_base_url: 'http://127.0.0.1:6080/', search_base_url: 'http://127.0.0.1:6090/') { |n| n.uuid = SecureRandom.uuid }
namespace = Namespace.find_by_full_path("flightjs") # Some namespace you want to enable
::Zoekt::IndexedNamespace.find_or_create_by!(node: zoekt_node, namespace: namespace.root_ancestor)
- Go to flightjs and perform searches. Notice that "Exact code search (powered by zoekt)" appears on the right side, above the search results.
- In the console, simulate some failures by manually triggering a backoff. This example will trigger the circuit breaker and disable zoekt for roughly 16 seconds:
4.times { zoekt_node.backoff.backoff! }
- The zoekt node's backoff should now be enabled. You can check the remaining backoff time by running the following. Run it multiple times; the value should decrease over time:
zoekt_node.backoff.seconds_remaining
- Go to flightjs and perform searches. Notice that "Exact code search (powered by zoekt)" is no longer there. Search falls back to Elasticsearch if enabled, or to basic search using the DB otherwise.
- Wait until zoekt_node.backoff.seconds_remaining reaches zero and the backoff expires.
- Go to flightjs and perform searches. Notice that "Exact code search (powered by zoekt)" is again on the right side, above the search results.
- The zoekt node's backoff should now be disabled. You can verify by running the following, which should return false:
zoekt_node.backoff.enabled?
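The waiting step can be scripted instead of re-running the check by hand. The helper below is a hypothetical convenience, not part of this MR. It assumes only that the object passed in responds to `enabled?` and `seconds_remaining`, as `zoekt_node.backoff` does in the steps above. The name `wait_for_backoff` and its keyword arguments are made up for illustration.

```ruby
# Hypothetical console helper: blocks until the given backoff has expired.
# `backoff` is any object responding to `enabled?` and `seconds_remaining`.
def wait_for_backoff(backoff, poll_interval: 1, timeout: 120, out: $stdout)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout

  while backoff.enabled?
    # Give up rather than loop forever if the backoff never clears.
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) > deadline
      raise "backoff still active after #{timeout}s"
    end

    out.puts "backoff active, #{backoff.seconds_remaining}s remaining"
    sleep poll_interval
  end

  true
end
```

In the console this would be called as `wait_for_backoff(zoekt_node.backoff)` after triggering the failures. Once it returns, the search page should show exact code search again.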
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.