Use the new VSA query backend when loading records
What does this MR do and why?
This change updates the value stream analytics (VSA) records endpoint so that it can optionally use the aggregated backend. The new backend provides much better performance and we plan to enable it by default in 15.0.
VSA runs a few different queries:
- Median (already implemented)
- Count (already implemented)
- Average (will be implemented as a follow-up)
- Related records (this MR)
Implementation
We have a central class that builds VSA queries: Gitlab::Analytics::CycleAnalytics::DataCollector (this will go away at some point). Within this class, we optionally call the new queries by invoking Gitlab::Analytics::CycleAnalytics::Aggregated::DataCollector.
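The delegation described above can be pictured with a minimal, self-contained sketch. All class names, the `use_aggregated_backend` argument, and the returned symbols below are hypothetical stand-ins for illustration only. They are not the actual code in this MR, where the switch is controlled by the `use_vsa_aggregated_tables` feature flag.

```ruby
# Hypothetical stand-in for the current query path.
class LegacyRecordsQuery
  def records
    [:record_from_legacy_query]
  end
end

# Hypothetical stand-in for the new aggregated query path.
class AggregatedRecordsQuery
  def records
    [:record_from_stage_events_table]
  end
end

# Hypothetical collector: it keeps one public interface and picks the
# backend based on a boolean, which stands in for the feature flag check.
class SketchDataCollector
  def initialize(use_aggregated_backend:)
    @use_aggregated_backend = use_aggregated_backend
  end

  def records
    backend.records
  end

  private

  def backend
    if @use_aggregated_backend
      AggregatedRecordsQuery.new
    else
      LegacyRecordsQuery.new
    end
  end
end
```

Because callers only depend on `records`, the same high-level expectations can be run against both the current and the new path by toggling the boolean.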
The scopes and the base query builder are tested within the MR. The ee/spec/lib/gitlab/analytics/cycle_analytics/data_collector_spec.rb test file has been modified to test both cases (current and new). This test file runs various high-level tests related to VSA.
How to set up and validate locally
- Enable the feature
Feature.enable(:use_vsa_aggregated_tables)
- Seed a new VSA project
SEED_CYCLE_ANALYTICS=true SEED_VSA=true FILTER=cycle_analytics rake db:seed_fu
- The seed script prints the project path. Copy it and navigate to the project.
- Go to the group.
- Go to Analytics > Value Stream
- Open the top-right dropdown and select Create new Value Stream.
- Add a name and save.
- Start rails console and aggregate the data
group = Group.find(x)
Analytics::CycleAnalytics::DataLoaderService.new(group: group, model: Issue).execute
Analytics::CycleAnalytics::DataLoaderService.new(group: group, model: MergeRequest).execute
- Load the VSA page again.
- Inspect the records endpoint requests. We should see that the tables with the _stage_events suffix are being used.
Database
Record loading query example: https://explain.depesz.com/s/WJEa
It's faster than the current queries (current: about 1s). I'm planning to optimize it further as a follow-up using this technique.
It needs a bit more logic since the technique cannot be applied to all queries, for example when we add filters. The query performs well on the project level: https://explain.depesz.com/s/b54r
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.
Related to #335391 (closed)