
Redis I/O ordering fixes

Mikhail Mazurskiy requested to merge ash2k/redis-io-fixes into master

Relates to #386 (closed).

Let's look at the following scenario. Both tunnels are for the same agent id.

| tunnel 1 | tunnel 2 |
|---|---|
| tunnel is established | |
| `tracker.RedisTracker.RegisterTunnel()` is called. It locks the mutex. | |
| It increments the counter of tunnels for the agent id. The counter is 1 and this kas URL is added to the in-memory map. It's not persisted into Redis immediately; a function is returned to perform the I/O. | |
| Mutex is unlocked | |
| agent id -> kas URL mapping is persisted in Redis using the returned I/O function | |
| ~~~~~ some time later ~~~~~ | ~~~~~ some time later ~~~~~ |
| request 1 arrives and tunnel 1 is picked to serve the request. | |
| `tracker.RedisTracker.UnregisterTunnel()` is called. It locks the mutex. It decrements the counter of tunnels for the agent id. The counter is 0 and this kas URL is removed from the in-memory map. It's not removed from Redis immediately; a function is returned to perform the I/O. | |
| Mutex is unlocked | |
| | tunnel is established |
| | `tracker.RedisTracker.RegisterTunnel()` is called. It locks the mutex. |
| | It increments the counter of tunnels for the agent id. The counter is 1 and this kas URL is added to the in-memory map. It's not persisted into Redis immediately; a function is returned to perform the I/O. |
| | Mutex is unlocked |
| | agent id -> kas URL mapping is persisted in Redis using the returned I/O function |
| tunnel is unregistered in Redis using the returned I/O function 💣 | |

In the above scenario the kas URL should be registered for the agent id, because a tunnel actually is established. But because of the interleaving of goroutine execution, the removal I/O is performed after tunnel 2 is registered, undoing the registration. Hence, nothing is in Redis for the agent id.
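The racy pattern can be sketched in Go. This is a deliberately simplified model, not the actual kas code: a map stands in for Redis, and the names (`tracker`, `registerTunnel`, the agent id 42, the kas URL) are illustrative. The key point it reproduces is that the mutex covers only the in-memory state, while the returned closure performs the "Redis" write later, outside the lock:

```go
package main

import (
	"fmt"
	"sync"
)

// tracker mimics the buggy pattern: the mutex protects only the in-memory
// counter, while the returned closure performs the "Redis" I/O outside it.
type tracker struct {
	mu       sync.Mutex
	counters map[int64]int    // agent id -> number of tunnels
	redis    map[int64]string // stand-in for the Redis hash
}

func newTracker() *tracker {
	return &tracker{counters: map[int64]int{}, redis: map[int64]string{}}
}

// registerTunnel increments the counter and, for the first tunnel, returns
// a deferred I/O function that persists the agent id -> kas URL mapping.
func (t *tracker) registerTunnel(agentID int64, kasURL string) func() {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.counters[agentID]++
	if t.counters[agentID] == 1 {
		return func() { t.redis[agentID] = kasURL } // I/O runs outside the mutex
	}
	return func() {}
}

// unregisterTunnel decrements the counter and, when it reaches 0, returns
// a deferred I/O function that removes the mapping.
func (t *tracker) unregisterTunnel(agentID int64) func() {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.counters[agentID]--
	if t.counters[agentID] == 0 {
		return func() { delete(t.redis, agentID) } // I/O runs outside the mutex
	}
	return func() {}
}

func main() {
	t := newTracker()
	io1 := t.registerTunnel(42, "grpc://kas-1")
	io1() // tunnel 1 persisted

	ioUnreg := t.unregisterTunnel(42)           // tunnel 1 goes away
	io2 := t.registerTunnel(42, "grpc://kas-1") // tunnel 2 established
	io2()     // tunnel 2 persisted
	ioUnreg() // 💣 delayed removal runs last, clobbering tunnel 2's write

	fmt.Printf("counter=%d, redis=%v\n", t.counters[42], t.redis)
	// counter=1, redis=map[] — a tunnel exists, but the kas is invisible
}
```

Running the calls in the interleaved order from the table above leaves the counter at 1 but the "Redis" map empty, which is exactly the lost registration.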

The agent id -> kas URL mapping is persisted only once, when the first tunnel on this kas instance is established. It's removed when the number of tunnels for an agent id reaches 0. Because of that, once the above scenario happens, no matter how many tunnels are established, nothing is written into Redis, and hence other kas instances cannot find the kas instance with all the established tunnels.

kas stores the above mappings in a Redis hash. There is a TTL (time to live, i.e. how long until the data should be discarded) on the hash itself and on each key->value mapping in it. If a kas instance crashes, incorrect data will eventually be GCed by Redis and/or by other kas instances, which check the TTL before using the data and remove stale mappings. Each kas instance refreshes the data it owns by periodically writing that data back into Redis, refreshing the TTL. By default the TTL is 5 minutes and the refresh is done every 4 minutes.

Given the above, once the situation occurs, a kas instance with established tunnels is effectively invisible to other kas instances for a particular agent id's tunnels. It can still use the tunnels if it's the instance handling an incoming request, but other kas instances won't send requests to it. This lasts for up to 4 minutes, after which the periodic refresh puts the data back into Redis, where it can be discovered by other kas instances. Then this kas instance starts serving requests for this agent id again.


This MR is basically a rollback of !723 (merged) and !703 (merged). Those two MRs moved Redis I/O out of the mutex so that I/O for refresh, GC, registrations, and unregistrations could be done concurrently. But this breaks the atomicity of writes to the in-memory map and to Redis: it introduces a race between GC/refresh and registrations/unregistrations, which can clobber each other's writes, as in the example above.

This MR takes the conservative approach of serializing all Redis I/O for the tunnel tracker using a single mutex. This is bullet-proof but really not ideal. I have a good idea of how to remove this bottleneck without introducing any problems (sigh). I'll do it in a follow-up.
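The fix can be sketched with the same simplified model as before (a map standing in for Redis, illustrative names, not the actual kas code). The difference is that the mutex is now held across both the in-memory update and the "Redis" write, so the deferred-closure interleaving can't happen:

```go
package main

import (
	"fmt"
	"sync"
)

// tracker sketches the conservative fix: one mutex is held across both the
// in-memory update and the "Redis" I/O, so register, unregister, refresh,
// and GC writes are serialized and can no longer interleave.
type tracker struct {
	mu       sync.Mutex
	counters map[int64]int    // agent id -> number of tunnels
	redis    map[int64]string // stand-in for the Redis hash
}

func newTracker() *tracker {
	return &tracker{counters: map[int64]int{}, redis: map[int64]string{}}
}

func (t *tracker) registerTunnel(agentID int64, kasURL string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.counters[agentID]++
	if t.counters[agentID] == 1 {
		t.redis[agentID] = kasURL // I/O performed under the same mutex
	}
}

func (t *tracker) unregisterTunnel(agentID int64) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.counters[agentID]--
	if t.counters[agentID] == 0 {
		delete(t.redis, agentID) // removal can't be delayed past a re-register
	}
}

func main() {
	t := newTracker()
	t.registerTunnel(42, "grpc://kas-1")   // tunnel 1
	t.unregisterTunnel(42)                 // tunnel 1 goes away, removal done inline
	t.registerTunnel(42, "grpc://kas-1")   // tunnel 2
	fmt.Printf("counter=%d, redis=%v\n", t.counters[42], t.redis)
	// the mapping is present: no lost write
}
```

The cost is obvious: every tracker operation now waits for the previous one's Redis round-trip, which is the bottleneck the follow-up aims to remove.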

p.s. this took me 3 months to find.

