
Expose VSA incremental aggregation metadata

Adam Hegyi requested to merge 344803-expose-vsa-aggregation-metadata into master

What does this MR do and why?

This MR exposes metadata for the Value Stream Analytics aggregation progress. Related issue: #341739 (closed)

This is not a user-facing change.

Design: #341739[Screenshot_2021-10-07_at_14.43._2x.png]

For each top-level namespace we store an Analytics::CycleAnalytics::Aggregation metadata record, where we keep track of various metrics. For the issue mentioned above, we expose three extra data attributes:

  • enabled - whether the aggregation process is enabled. This will always be true after 15.0.
  • last_run_at - when the group was last aggregated. This value is stored in the Analytics::CycleAnalytics::Aggregation record.
  • next_run_at - a naive estimate of when the next aggregation will run.

Both timestamp attributes can be nil. There are a few cases where we cannot estimate next_run_at properly (a new GitLab instance or the first group that uses the feature).

Estimation formula:

group  last_run_at
g1     2022-01-05
g2     2022-01-02
g3     2022-01-02
g4     2022-01-03
g5     2022-01-01

The background aggregation runs every 10 minutes via a cron job. The job picks up the aggregation records in priority order (earliest last_run_at first) and processes them one by one.
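As a rough illustration of that processing order, the plain-Ruby sketch below sorts the example table by last_run_at, earliest first. The LAST_RUN_AT hash is a hypothetical in-memory stand-in for the aggregation records, and the tie-break on group name is an assumption made only to keep the output deterministic; this is not the job's implementation.

```ruby
require 'date'

# Example data from the table above (hypothetical stand-in for the
# Analytics::CycleAnalytics::Aggregation records).
LAST_RUN_AT = {
  'g1' => Date.new(2022, 1, 5),
  'g2' => Date.new(2022, 1, 2),
  'g3' => Date.new(2022, 1, 2),
  'g4' => Date.new(2022, 1, 3),
  'g5' => Date.new(2022, 1, 1)
}.freeze

# Earliest last_run_at first; the group name breaks ties (assumption).
PROCESSING_ORDER = LAST_RUN_AT
  .sort_by { |group, last_run_at| [last_run_at, group] }
  .map(&:first)

puts PROCESSING_ORDER.join(', ')
```

With this data, g5, g2 and g3 come before g4, and g1 is processed last.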

In this example, group g4 was last aggregated on 2022-01-03. The earliest last_run_at among all records is 2022-01-01. The gap between these dates is 2 days, so the next run for g4 is estimated to be in about 2 days: g5, g2 and g3 need to be aggregated before g4. Note that last_run_at is always bumped when the aggregation for the group finishes.

Note: this is an oversimplified example; normally we shouldn't see such a large time gap between the records.

The estimation also takes previous runtimes into account. For example, if the current group took a few minutes to aggregate, that time is added to the estimation.
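The arithmetic described above can be sketched in plain Ruby. This is only an illustration under stated assumptions: naive_next_run_estimate, its parameters, the chosen "now" and the 3-minute runtime are all hypothetical, and the method added in this MR may combine these inputs differently.

```ruby
# Hypothetical helper for illustration only; not the method added in this MR.
# last_run_times:  Hash of group name => Time (nil if never aggregated)
# runtime_seconds: assumed duration of the group's previous aggregation run
def naive_next_run_estimate(last_run_times, group, now:, runtime_seconds: 0)
  current = last_run_times[group]
  return nil if current.nil? # never aggregated: nothing to estimate from

  earliest = last_run_times.values.compact.min
  # Groups with an earlier last_run_at are processed first, so the gap to the
  # earliest record approximates the waiting time for this group.
  now + (current - earliest) + runtime_seconds
end

# Example data from the table above.
EXAMPLE_LAST_RUNS = {
  'g1' => Time.utc(2022, 1, 5),
  'g2' => Time.utc(2022, 1, 2),
  'g3' => Time.utc(2022, 1, 2),
  'g4' => Time.utc(2022, 1, 3),
  'g5' => Time.utc(2022, 1, 1)
}.freeze

EXAMPLE_NOW = Time.utc(2022, 1, 6) # assumed current time

# g4: 2 days behind the earliest record, plus an assumed 3-minute runtime.
G4_ESTIMATE = naive_next_run_estimate(
  EXAMPLE_LAST_RUNS, 'g4', now: EXAMPLE_NOW, runtime_seconds: 180
)

puts G4_ESTIMATE
```

For g4 the sketch yields a time 2 days and 3 minutes after the assumed current time. A group whose last_run_at is nil gets no estimate, which matches the nil case mentioned earlier.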

Screenshots or screen recordings

image

How to set up and validate locally

Generate some data in the Rails console and call the estimation method:

a = FactoryBot.create(:cycle_analytics_aggregation, last_incremental_run_at: 5.days.ago)
b = FactoryBot.create(:cycle_analytics_aggregation, last_incremental_run_at: 3.days.ago)
c = FactoryBot.create(:cycle_analytics_aggregation, last_incremental_run_at: 2.days.ago)

b.estimated_next_run_at

Database

Determining the earliest run at: https://explain.depesz.com/s/qGjf

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #344803 (closed)

Edited by Adam Hegyi
