Support sending traffic to redis-cache read replica

Igor requested to merge redis-read-replica into master

What does this MR do?

This is a proof of concept for improving the scalability of redis-cache by sending traffic to read replicas.

Some Redis keys may not be safe for stale reads. The idea here is therefore to perform stale reads selectively, so that we can incrementally migrate calls that are deemed safe to replicas. A caller opts in by passing stale_ok: true to Rails.cache read calls.
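To illustrate the opt-in routing described above, here is a minimal sketch. It is not the code from this MR: the class name StaleAwareCache is hypothetical, plain hashes stand in for the primary and replica Redis connections, and the key suffix is made up for the example. It only shows the idea that reads go to the primary unless the caller explicitly passes stale_ok: true.

```ruby
# Hypothetical sketch of selective stale reads; not the implementation in this MR.
class StaleAwareCache
  # primary and replica are anything that responds to [] and []=.
  # Plain hashes are used below as stand-ins for Redis connections.
  def initialize(primary:, replica:)
    @primary = primary
    @replica = replica
  end

  # Reads go to the primary unless the caller opts in with stale_ok: true.
  def read(key, stale_ok: false)
    store = stale_ok ? @replica : @primary
    store[key]
  end

  # Writes always go to the primary; replicas are read-only.
  def write(key, value)
    @primary[key] = value
  end
end

# Simulate a replica that lags behind the primary (illustrative key).
primary = { "cache:gitlab:Appearance:1" => "new" }
replica = { "cache:gitlab:Appearance:1" => "old" }
cache = StaleAwareCache.new(primary: primary, replica: replica)

default_read = cache.read("cache:gitlab:Appearance:1")                 # served by the primary
stale_read   = cache.read("cache:gitlab:Appearance:1", stale_ok: true) # served by the replica
```

Defaulting stale_ok to false keeps every existing call on the primary, so call sites can be migrated one at a time.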

We perform a relatively crude load balancing across replicas: the sentinel support in redis-rb will select a random healthy replica at connection time.
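As a rough sketch of what a replica connection through Sentinel might look like, the helper below builds the options that redis-rb 4.x accepts for Sentinel-based connections (url, sentinels, role). The helper name, the master name, and the Sentinel address are assumptions for illustration, and option names may differ in other redis-rb versions. The helper only builds the hash, so it runs without a Redis server.

```ruby
# Hypothetical helper; the names and addresses below are illustrative assumptions.
def replica_redis_options(master_name:, sentinels:)
  {
    # With Sentinel, the URL host is the master group name, not a real hostname.
    url: "redis://#{master_name}",
    sentinels: sentinels,
    # In redis-rb 4.x, role: :slave asks Sentinel for a replica instead of the master.
    role: :slave
  }
end

options = replica_redis_options(
  master_name: "gitlab-redis-cache",               # assumed master group name
  sentinels: [{ host: "127.0.0.1", port: 26379 }]  # assumed local Sentinel
)

# With the redis gem loaded and a Sentinel running, this could be used as:
#   replica = Redis.new(**options)
```

Because the replica is chosen when the connection is established, the load is only balanced across processes and reconnects, not per request.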

The motivation behind this proposal is documented in https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/9414. This patch implements some ideas suggested there by @stanhu.

Screenshots

n/a

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

To test this change, I added rudimentary Redis Sentinel support to gdk: gdk@igor-sentinel.

This is quite close to what we are doing in production on gitlab.com.

Currently only a single key pattern is marked as safe for stale reads: cache:gitlab:Appearance:*.

A Jaeger trace shows that almost all Redis calls go to the master (port 6381):

[Screenshot: Screenshot_2020-03-09_at_15.59.11]

But the reads for Appearance are going to a replica (port 6383):

[Screenshot: Screenshot_2020-03-09_at_15.49.41]

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team
Edited by Igor
