[VSA] Add a new object to keep track of data collection progress
With the async data collection in place, we need an object where we track the status of the data collection.
We need to handle the following processes:
- Incremental update
- Store the last processed updated_at value
- Store enough historical data to predict the total runtime (the previous runtime, records processed).
- Full re-aggregation (with consistency check) - #344804 (closed)
- Store enough historical data to predict the total runtime.
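One way the stored history could feed a runtime prediction is sketched below. This is only an illustration: the method name estimate_runtime_seconds and the simple "average seconds per record" model are assumptions, not something this issue specifies.

```ruby
# Hypothetical helper: estimates the duration of the next run from stored history.
# previous_runtimes:          durations of recent runs, in seconds
# previous_processed_records: record counts of the same runs, in the same order
# expected_records:           number of records the next run is expected to process
def estimate_runtime_seconds(previous_runtimes, previous_processed_records, expected_records)
  total_records = previous_processed_records.sum
  # Without usable history, return nil instead of guessing.
  return nil if total_records.zero?

  seconds_per_record = previous_runtimes.sum.to_f / total_records
  (seconds_per_record * expected_records).round
end

# Three earlier runs took 360 seconds in total for 18,000 records.
puts estimate_runtime_seconds([120, 90, 150], [6_000, 4_500, 7_500], 10_000)
```

Pooling all stored runs instead of using only the latest one smooths out a single unusually slow or fast run, which is presumably why several runtimes are kept rather than one.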
Idea:
Each top-level namespace that has the required license (premium, ultimate) should have one row in the analytics_cycle_analytics_aggregation_status table.
Options:
- Lazy: the record is created once the subscription starts.
- Manual: the user enables the feature by toggling a flag.
Table: analytics_cycle_analytics_aggregation_status
- group_id (unique, must be root group)
- last_processed_updated_at (timestamp)
- previous_runtimes, integer array (store the last 10 runtimes in seconds)
- previous_processed_records, integer array (store the number of processed records from the last 10 runs)
- last_full_run_at (timestamp)
- last_full_run_runtimes
- last_full_run_processed_records
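The "last 10 runs" columns behave like a small rolling window. The sketch below models one row in plain Ruby to show that behaviour. The class name AggregationStatusSketch and the method record_incremental_run are hypothetical stand-ins, not the actual model or its API, and only the incremental-run columns are covered.

```ruby
# Plain-Ruby stand-in for one row of the status table (not an ActiveRecord model).
class AggregationStatusSketch
  HISTORY_SIZE = 10 # number of past runs to keep, as described in the column list

  attr_reader :group_id, :last_processed_updated_at,
              :previous_runtimes, :previous_processed_records

  def initialize(group_id)
    @group_id = group_id
    @last_processed_updated_at = nil
    @previous_runtimes = []
    @previous_processed_records = []
  end

  # Stores the metadata of one incremental run and keeps only the most recent
  # HISTORY_SIZE entries in both arrays (oldest first).
  def record_incremental_run(runtime_seconds:, processed_records:, last_updated_at:)
    @previous_runtimes = (@previous_runtimes + [runtime_seconds]).last(HISTORY_SIZE)
    @previous_processed_records = (@previous_processed_records + [processed_records]).last(HISTORY_SIZE)
    @last_processed_updated_at = last_updated_at
    self
  end
end

status = AggregationStatusSketch.new(1)
12.times do |i|
  status.record_incremental_run(runtime_seconds: i, processed_records: i * 100, last_updated_at: Time.now)
end
puts status.previous_runtimes.size
```

Appending to both arrays in the same call keeps the runtime and record count of a run at the same index, so they can be paired up later when estimating.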
How should it work (before 15.0):
- UI will have a toggle, where users can opt in to the new behaviour. #341739 (closed)
- When opting in, create the aggregation status record and enqueue a full run job (async).
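The opt-in step can be sketched as follows, using in-memory stand-ins for the table and the job queue. Everything here is hypothetical (the class OptInSketch, its method names, the hash-based "row"). It also assumes that opting in twice should not create a second row or a second full run, which follows from group_id being unique but is not stated explicitly in this issue.

```ruby
# In-memory sketch of the opt-in flow; a real implementation would use the
# database table and an async job instead of a Hash and an Array.
class OptInSketch
  attr_reader :status_records, :full_run_queue

  def initialize
    @status_records = {} # group_id => status row
    @full_run_queue = [] # group ids waiting for an async full run
  end

  # Creates the status row for the group and enqueues one full run.
  # Repeated calls for the same group return the existing row unchanged.
  def opt_in(group_id)
    return @status_records[group_id] if @status_records.key?(group_id)

    @full_run_queue << group_id
    @status_records[group_id] = { group_id: group_id, last_full_run_at: nil }
  end
end

flow = OptInSketch.new
flow.opt_in(42)
flow.opt_in(42) # second call is a no-op
puts flow.full_run_queue.inspect
```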
Aggregation execution
When the aggregation job runs (Analytics::CycleAnalytics::GroupDataLoaderWorker), update the aggregation status record with the collected metadata (timestamps, record counts).
Breakdown
- Migration for aggregation status
- API for opting in/out
- Invoke the aggregation job
- Update the aggregation job to collect metadata
Edited by Adam Hegyi