
Fix selection of synced nodes

Patrick Steinhardt requested to merge pks-coordinator-fix-test into master

I was hitting a test failure on my local machine in TestCoordinatorStreamDirector_distributesReads. The root cause is that we pick the node to distribute the read to at random, but the test assumes we'll always choose the secondary node, which doesn't hold. I've rewritten the test to execute the stream director 16 times and check that we hit both the primary and the secondary node.

But this didn't fix the test yet, which made me suspicious. The code was correct, but I noticed that we were including the primary node twice in GetSyncedNodes(). With only one secondary, the candidate list thus was primary, primary, secondary, and the likelihood of choosing the primary node was two-thirds. I've fixed the code to not explicitly insert the primary anymore, which works just fine for the memory-backed datastore. I'm not 100% sure whether it also works for the PSQL-backed datastore, though. In case it doesn't, we should probably just add the primary and then deduplicate afterwards.
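The fallback mentioned above (add the primary, then deduplicate) could look roughly like this. It is only a sketch under assumptions: `dedupeNodes` and `selectionProbability` are hypothetical names, nodes are represented as plain strings rather than the real node type, and selection is assumed to be uniform over the list.

```go
package main

// dedupeNodes returns the nodes with duplicates removed, keeping the
// first occurrence of each entry and preserving order.
func dedupeNodes(nodes []string) []string {
	seen := make(map[string]struct{}, len(nodes))
	result := make([]string, 0, len(nodes))
	for _, node := range nodes {
		if _, ok := seen[node]; ok {
			continue
		}
		seen[node] = struct{}{}
		result = append(result, node)
	}
	return result
}

// selectionProbability returns the chance of drawing target when one
// entry is picked uniformly at random from nodes.
func selectionProbability(nodes []string, target string) float64 {
	if len(nodes) == 0 {
		return 0
	}
	matches := 0
	for _, node := range nodes {
		if node == target {
			matches++
		}
	}
	return float64(matches) / float64(len(nodes))
}
```

For the list primary, primary, secondary, `selectionProbability` gives 2/3 for the primary; after `dedupeNodes` the list has two entries and each node is chosen with probability 1/2.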
