Project-level DORA metrics API (!55823) · Merge requests · GitLab.org / GitLab

Shinya Maeda requested to merge dora-metrics-rest-api into master Mar 05, 2021

What does this MR do?

In the previous MR, we started collecting daily DORA metrics in order to improve the performance issue on the current API. This MR is to add an API support to fetch the collected data. Specifically, two types of metrics will be exposed by the API - "Deployment Frequency" and "Lead Time For Changes".

The API path is api/v4/projects/:project_id/dora/metrics and takes the following parameters:

Parameter	Required/Optional	Purpose
`metric`	required	The metric type. One of `deployment_frequency` or `lead_time_for_changes`.
`after`	optional	Date range to start from. Default is 3 months ago.
`before`	optional	Date range to end at. Default is the current date.
`interval`	optional	One of `all`, `monthly` or `daily`. Default is `daily`.
`environment_tier`	optional	The tier of the environment. Default is `production`.

A few notes:

This MR is behind dora_daily_metrics feature flag, which is disabled by default.
In favor of this new API, the old APIs will be deprecated due to the performance issue.
You can see the full PoC in this MR. We'll add group-level API at the next.
Related #291746 (closed)

Aggregation patterns

User can specify the aggregation patterns via interval parameter. There are three patterns that have different queries to aggregate dora_daily_metrics rows.

Daily ... This is to aggregate the target rows in daily bucket.
Monthly ... This is to aggregate the target rows in monthly bucket.
All ... This is to aggregate the target rows in one bucket.

There are two metric types that use different aggregate functions:

Deployment Frequency ... It summarizes the total number of daily deployment frequency (count).
Lead Time for Changes ... It calculates the median of daily lead time for changes (median seconds).

Manual QA

Deployment Frequency

> # Daily
> curl -H 'Private-token: XXX' "http://local.gitlab.test:8181/api/v4/projects/35/dora/metrics?metric=deployment_frequency&after=2021-03-01"
[{"2021-03-01":3},{"2021-03-02":6},{"2021-03-03":0},{"2021-03-04":0},{"2021-03-05":0},{"2021-03-06":0},{"2021-03-07":0},{"2021-03-08":4}]

> # Monthly
> curl -H 'Private-token: XXX' "http://local.gitlab.test:8181/api/v4/projects/35/dora/metrics?metric=deployment_frequency&after=2021-03-01&interval=monthly"
[{"2021-03-01":13}]

> # All
> curl -H 'Private-token: XXX' "http://local.gitlab.test:8181/api/v4/projects/35/dora/metrics?metric=deployment_frequency&after=2021-03-01&interval=all"
13

Sample query

SELECT DATE_TRUNC('month', date)::date AS month, SUM(deployment_frequency) AS data
FROM "dora_daily_metrics"
WHERE "dora_daily_metrics"."environment_id" IN (SELECT "environments"."id" FROM "environments" WHERE "environments"."project_id" = $1 AND "environments"."tier" = $2 AND (tier IS NOT NULL)) AND
  (date BETWEEN '2021-03-01' AND '2021-03-08')
GROUP BY DATE_TRUNC('month', date)
ORDER BY month ASC

Plan

 Sort  (cost=6.15..6.15 rows=1 width=20) (actual time=0.052..0.054 rows=0 loops=1)
   Sort Key: (((date_trunc('month'::text, (dora_daily_metrics.date)::timestamp with time zone)))::date)
   Sort Method: quicksort  Memory: 25kB
   Buffers: shared hit=10
   ->  Aggregate  (cost=6.11..6.14 rows=1 width=20) (actual time=0.031..0.032 rows=0 loops=1)
         Group Key: (date_trunc('month'::text, (dora_daily_metrics.date)::timestamp with time zone))
         Buffers: shared hit=7
         ->  Sort  (cost=6.11..6.11 rows=1 width=16) (actual time=0.030..0.031 rows=0 loops=1)
               Sort Key: (date_trunc('month'::text, (dora_daily_metrics.date)::timestamp with time zone))
               Sort Method: quicksort  Memory: 25kB
               Buffers: shared hit=7
               ->  Nested Loop  (cost=0.28..6.10 rows=1 width=16) (actual time=0.012..0.013 rows=0 loops=1)
                     Buffers: shared hit=4
                     ->  Index Scan using index_environments_on_project_id_and_tier on public.environments  (cost=0.12..2.91 rows=1 width=4) (actual time=0.012..0.012 rows=0 loops=1)
                           Index Cond: ((environments.project_id = 278964) AND (environments.tier = 0))
                           Buffers: shared hit=4
                     ->  Index Scan using index_dora_daily_metrics_on_environment_id_and_date on public.dora_daily_metrics  (cost=0.15..3.17 rows=1 width=16) (actual time=0.000..0.000 rows=0 loops=0)
                           Index Cond: ((dora_daily_metrics.environment_id = environments.id) AND (dora_daily_metrics.date >= '2021-03-01'::date) AND (dora_daily_metrics.date <= '2021-03-08'::date))

Timing

Time: 1.757 ms
  - planning: 1.548 ms
  - execution: 0.209 ms
    - I/O read: N/A
    - I/O write: N/A

Shared buffers:
  - hits: 10 (~80.00 KiB) from the buffer pool
  - reads: 0 from the OS file cache, including disk I/O
  - dirtied: 0
  - writes: 0

Lead Time For Changes

> # Daily
> curl -H 'Private-token: XXX' "http://local.gitlab.test:8181/api/v4/projects/35/dora/metrics?metric=lead_time_for_changes&after=2021-03-01"
[{"2021-03-01":3.0},{"2021-03-02":6.0},{"2021-03-03":null},{"2021-03-04":null},{"2021-03-05":null},{"2021-03-06":null},{"2021-03-07":null},{"2021-03-08":545333.0}]

> # Monthly
> curl -H 'Private-token: XXX' "http://local.gitlab.test:8181/api/v4/projects/35/dora/metrics?metric=lead_time_for_changes&after=2021-03-01&interval=monthly"
[{"2021-03-01":6.0}]

> # All
> curl -H 'Private-token: XXX' "http://local.gitlab.test:8181/api/v4/projects/35/dora/metrics?metric=lead_time_for_changes&after=2021-03-01&interval=all"
6.0

Sample query

SELECT DATE_TRUNC('month', date)::date AS month, (PERCENTILE_CONT(0.5) WITHIN GROUP(ORDER BY lead_time_for_changes_in_seconds)) AS data
FROM "dora_daily_metrics"
WHERE "dora_daily_metrics"."environment_id" IN (SELECT "environments"."id" FROM "environments" WHERE "environments"."project_id" = $1 AND "environments"."tier" = $2 AND (tier IS NOT NULL)) AND
  (date BETWEEN '2021-03-01' AND '2021-03-08')
GROUP BY DATE_TRUNC('month', date)
ORDER BY month ASC

Plan

 Sort  (cost=6.15..6.16 rows=1 width=20) (actual time=0.073..0.074 rows=0 loops=1)
   Sort Key: (((date_trunc('month'::text, (dora_daily_metrics.date)::timestamp with time zone)))::date)
   Sort Method: quicksort  Memory: 25kB
   Buffers: shared hit=10
   ->  Aggregate  (cost=6.11..6.14 rows=1 width=20) (actual time=0.034..0.034 rows=0 loops=1)
         Group Key: (date_trunc('month'::text, (dora_daily_metrics.date)::timestamp with time zone))
         Buffers: shared hit=7
         ->  Sort  (cost=6.11..6.11 rows=1 width=16) (actual time=0.033..0.033 rows=0 loops=1)
               Sort Key: (date_trunc('month'::text, (dora_daily_metrics.date)::timestamp with time zone))
               Sort Method: quicksort  Memory: 25kB
               Buffers: shared hit=7
               ->  Nested Loop  (cost=0.28..6.10 rows=1 width=16) (actual time=0.017..0.018 rows=0 loops=1)
                     Buffers: shared hit=4
                     ->  Index Scan using index_environments_on_project_id_and_tier on public.environments  (cost=0.12..2.91 rows=1 width=4) (actual time=0.016..0.017 rows=0 loops=1)
                           Index Cond: ((environments.project_id = 278964) AND (environments.tier = 0))
                           Buffers: shared hit=4
                     ->  Index Scan using index_dora_daily_metrics_on_environment_id_and_date on public.dora_daily_metrics  (cost=0.15..3.17 rows=1 width=16) (actual time=0.000..0.000 rows=0 loops=0)
                           Index Cond: ((dora_daily_metrics.environment_id = environments.id) AND (dora_daily_metrics.date >= '2021-03-01'::date) AND (dora_daily_metrics.date <= '2021-03-08'::date))

Timing

Time: 0.553 ms
  - planning: 0.387 ms
  - execution: 0.166 ms
    - I/O read: N/A
    - I/O write: N/A

Shared buffers:
  - hits: 10 (~80.00 KiB) from the buffer pool
  - reads: 0 from the OS file cache, including disk I/O
  - dirtied: 0
  - writes: 0

Does this MR meet the acceptance criteria?

Conformity

📋 Does this MR need a changelog?
- I have included a changelog entry.
- I have not included a changelog entry because it's behind a feature flag.
[-] Documentation (if required) => TO be added at feature flag removal.
Code review guidelines
Merge request performance guidelines
Style guides
Database guides
Separation of EE specific content

Availability and Testing

Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process.
Tested in all supported browsers
Informed Infrastructure department of a default or new setting change, if applicable per definition of done

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

Label as security and @ mention @gitlab-com/gl-security/appsec
The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
Security reports checked/validated by a reviewer from the AppSec team

Edited Mar 12, 2021 by Saikat Sarkar

Project-level DORA metrics API

What does this MR do?

Aggregation patterns

Manual QA

Deployment Frequency

Sample query

Lead Time For Changes

Sample query

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

Merge request reports