
Redis latency monitoring: focus on slower requests

Jacob Vosmaer requested to merge jv-redis-latency-buckets into master

What does this MR do?

This MR refines the Prometheus histogram buckets for gitlab_redis_client_requests_duration_seconds. In the first iteration of this metric, we made an effort not to add too many buckets, because that adds too much data for Prometheus to keep track of. In gitlab-com/runbooks!2542 (merged) we realized that in order to have useful Redis latency monitoring, we need to focus the histogram buckets on slower requests.

In this MR we remove the 0.001s bucket. While a lot of Redis calls do take less than that, knowing this is not useful for monitoring. At the tail end, we add 0.1s and 0.5s buckets.
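To illustrate why the bucket placement matters, here is a minimal Python sketch of how a Prometheus-style histogram counts observations cumulatively per upper bound. The bucket lists and sample durations are hypothetical and are not the actual values in the GitLab code; only the removal of 0.001 and the addition of 0.1 and 0.5 come from this MR.

```python
def cumulative_bucket_counts(durations, upper_bounds):
    """Count observations per bucket the way a Prometheus histogram does:
    each bucket holds the number of observations <= its upper bound,
    and an implicit +Inf bucket holds the total."""
    counts = {le: 0 for le in sorted(upper_bounds)}
    counts[float("inf")] = 0
    for duration in durations:
        for le in counts:
            if duration <= le:
                counts[le] += 1
    return counts


# Hypothetical Redis call durations in seconds: mostly fast, a few slow.
durations = [0.0004, 0.0008, 0.003, 0.05, 0.3, 2.0]

# Hypothetical "before" and "after" bucket lists (assumed for illustration).
old_buckets = [0.001, 0.005, 0.01]
new_buckets = [0.005, 0.01, 0.1, 0.5]

old_counts = cumulative_bucket_counts(durations, old_buckets)
new_counts = cumulative_bucket_counts(durations, new_buckets)

for le, count in new_counts.items():
    print(f"le={le}: {count}")
```

With the hypothetical old buckets, the 0.05s, 0.3s and 2.0s calls are only visible in the +Inf bucket, so they cannot be told apart. With the 0.1 and 0.5 buckets they land in different buckets, which is the resolution that monitoring slow requests needs.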

Removing histogram buckets can cause problems in our monitoring framework because, for example, our apdex queries use specific le="123" selectors. In this case, however, we are modifying a metric we are not relying on yet, so removing the 0.001 bucket should be fine.
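The risk can be sketched in Python, assuming an apdex-style ratio of "requests within a threshold" over "all requests". The real queries are PromQL in the runbooks; the function name and the bucket counts below are hypothetical and only illustrate what happens when a query selects a bucket that no longer exists.

```python
def fraction_within(bucket_counts, le):
    """Return the fraction of requests at or below `le`.

    Returns None when there is no bucket with that exact upper bound,
    which loosely mirrors a PromQL selector on a missing `le` label
    matching no series at all."""
    if le not in bucket_counts:
        return None
    total = bucket_counts[float("inf")]
    if total == 0:
        return None
    return bucket_counts[le] / total


# Hypothetical cumulative counts for a histogram without a 0.001 bucket.
bucket_counts = {0.005: 3, 0.01: 3, 0.1: 4, 0.5: 5, float("inf"): 6}

still_works = fraction_within(bucket_counts, 0.1)
now_empty = fraction_within(bucket_counts, 0.001)
print(still_works, now_empty)
```

A query pinned to the removed bound yields no result instead of a ratio, so any alert or dashboard built on it would silently stop producing data.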

This is part of gitlab-com/gl-infra/scalability#439 (closed)

Screenshots

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team
Edited by Jacob Vosmaer
