Create total count metric type that supports time frames
Background
Currently we offer two ways of counting with Redis (RedisMetric and RedisHLL).
They behave a bit differently, as outlined here:
| | Redis | RedisHLL |
|---|---|---|
| Counts | Total number of invocations | Number of distinct arguments passed |
| Time frames reported | All time | 7d, 28d |
| Precision | Accurate | Probabilistic |
| Memory footprint per metric | Few 100 bytes | Up to 12 kB x 29 |
(The memory footprint is an over-approximation and we might have to look at the actual usage)
We have seen (example 1) that our users sometimes want to count the total number of times an event happens within a time frame.
Two examples are:
- Number of pipelines created (7d and 28d)
- Number of merge requests created (7d and 28d)
Currently they have two options:
- Use a Redis metric and calculate the time-framed values in the data warehouse, which is error-prone.
- Use a RedisHLL metric and always pass a unique value, e.g. pass the id of a merge request when they want to count how many merge requests were created. (This works because a merge request is only created once.) `redis_hll_counters.code_review.i_code_review_create_mr_monthly` and its weekly counterpart are examples of this.
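The difference between the two counting modes, and why the unique-id workaround gives the right number, can be sketched as follows. This is only an illustration: `TotalCounter`, `DistinctCounter`, and `count_events` are hypothetical names, and an exact Python set stands in for the probabilistic HyperLogLog structure that RedisHLL actually uses.

```python
class TotalCounter:
    """Stand-in for a Redis counter metric: counts every invocation."""

    def __init__(self):
        self.count = 0

    def track(self, _value=None):
        # The argument is ignored; only the number of calls matters.
        self.count += 1


class DistinctCounter:
    """Stand-in for a RedisHLL metric (assumption: an exact set
    replaces the probabilistic HyperLogLog for illustration)."""

    def __init__(self):
        self.seen = set()

    def track(self, value):
        # Repeated values do not increase the count.
        self.seen.add(value)

    @property
    def count(self):
        return len(self.seen)


def count_events(values):
    """Feed the same values to both counters; return (total, distinct)."""
    total, distinct = TotalCounter(), DistinctCounter()
    for value in values:
        total.track(value)
        distinct.track(value)
    return total.count, distinct.count


# Repeated arguments (e.g. user ids): the two counts diverge.
print(count_events([7, 7, 7, 8]))          # (4, 2)
# Unique arguments (e.g. merge request ids): the two counts agree.
print(count_events([101, 102, 103, 104]))  # (4, 4)
```

With unique arguments the distinct count equals the total count, which is why the workaround works, at the cost of the larger HLL memory footprint and probabilistic precision.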
Desired Outcome
- Users can use `7d` and `28d` timeframes together with total count metrics when using `data_source: internal_events`.
- 7d and 28d versions of https://gitlab.com/gitlab-org/gitlab/blob/master/ee/config/metrics/counts_all/20230502122452_analytics_dashboard_views.yml are created.
Suggested Solution
Based on the suggestion and POC code in #411264 (comment 1392077260), we suggest using Redis as the storage solution for this. This results in the following steps:
- The logic to generate keys for event storage should be refactored for general usage in both Redis and RedisHLL within #415046 (closed)
- Use the refactored logic in the way we handle events with Redis
- Enable `time_frame: 7d / 28d` for `data_source: events`
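The steps above can be sketched roughly as follows. This is a minimal illustration, not the POC code from the linked comment: the key format, the function names, and the choice of one counter per ISO week are all assumptions, and an in-memory dict stands in for Redis (a real implementation would use commands such as `INCR` and `GET`).

```python
from collections import defaultdict
from datetime import date, timedelta

# In-memory stand-in for Redis (assumption, for illustration only).
store = defaultdict(int)


def week_key(event_name, day):
    """Build a hypothetical weekly key, e.g. 'pipeline_created-2023-25'."""
    iso_year, iso_week, _ = day.isocalendar()
    return f"{event_name}-{iso_year}-{iso_week:02d}"


def track_event(event_name, day):
    """Increment the total-count bucket for the week containing `day`."""
    store[week_key(event_name, day)] += 1


def total_count(event_name, end_day, weeks):
    """Sum the weekly buckets for the `weeks` ISO weeks ending with
    the week that contains `end_day` (1 ~ 7d, 4 ~ 28d in this sketch)."""
    keys = [
        week_key(event_name, end_day - timedelta(weeks=offset))
        for offset in range(weeks)
    ]
    # .get avoids creating empty buckets in the defaultdict.
    return sum(store.get(key, 0) for key in keys)


# Usage: three events in ISO week 25 of 2023, one in week 23.
track_event("pipeline_created", date(2023, 6, 19))
track_event("pipeline_created", date(2023, 6, 19))
track_event("pipeline_created", date(2023, 6, 21))
track_event("pipeline_created", date(2023, 6, 5))

print(total_count("pipeline_created", date(2023, 6, 23), weeks=1))  # 3
print(total_count("pipeline_created", date(2023, 6, 23), weeks=4))  # 4
```

Because each bucket is a plain integer, the time-framed totals are exact sums, and the same key-generation helper could in principle be shared with the HLL-based counters.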
NOT IN SCOPE: We do not intend to make those `time_frame`s available to the legacy metrics using `data_source: redis`. However, if this happens as an unintended side effect, we do not have to go through extra steps to prevent them from being added.