Skip to content

Create a service to compute burnup charts for milestones

Heinrich Lee Yu requested to merge milestone-burnup-charts-backend into master

What does this MR do?

For burndown charts, we send the raw events to the frontend. We do not want to do this for the new burnup charts so we created this service to aggregate the data in the backend. We would eventually expose this data via the API / GraphQL.

Related to #213334 (closed)

This service does one query for all the events:

SQL query:
((
SELECT
  'milestone' AS event_type, created_at, milestone_id AS value, action, issue_id 
FROM
  "resource_milestone_events" 
WHERE
  "resource_milestone_events"."issue_id" IN 
  (
    SELECT DISTINCT
      "resource_milestone_events"."issue_id" 
    FROM
      "resource_milestone_events" 
    WHERE
      "resource_milestone_events"."milestone_id" = 1233752 
      AND "resource_milestone_events"."action" = 1 
      AND 
      (
        created_at <= '2020-08-17 23:59:59.999999'
      )
  )
  AND 
  (
    created_at <= '2020-08-17 23:59:59.999999'
  )
) 
UNION
(
SELECT
  'state' AS event_type, created_at, state AS value, NULL AS action, issue_id 
FROM
  "resource_state_events" 
WHERE
  "resource_state_events"."issue_id" IN 
  (
    SELECT DISTINCT
      "resource_milestone_events"."issue_id" 
    FROM
      "resource_milestone_events" 
    WHERE
      "resource_milestone_events"."milestone_id" = 1233752
      AND "resource_milestone_events"."action" = 1 
      AND 
      (
        created_at <= '2020-08-17 23:59:59.999999'
      )
  )
  AND 
  (
    created_at <= '2020-08-17 23:59:59.999999'
  )
) 
UNION
(
SELECT
  'weight' AS event_type, created_at, weight AS value, NULL AS action, issue_id 
FROM
  "resource_weight_events" 
WHERE
  "resource_weight_events"."issue_id" IN 
  (
    SELECT DISTINCT
      "resource_milestone_events"."issue_id" 
    FROM
      "resource_milestone_events" 
    WHERE
      "resource_milestone_events"."milestone_id" = 1233752
      AND "resource_milestone_events"."action" = 1 
      AND 
      (
        created_at <= '2020-08-17 23:59:59.999999'
      )
  )
  AND 
  (
    created_at <= '2020-08-17 23:59:59.999999'
  )
)) 
ORDER BY created_at
LIMIT 50001
Query plan for a large milestone on GitLab.com (before index)
 Limit  (cost=54262.46..54317.99 rows=22211 width=58) (actual time=13828.915..13831.675 rows=8762 loops=1)
   Buffers: shared hit=41713 read=9940 dirtied=439
   I/O Timings: read=13483.272
   ->  Sort  (cost=54262.46..54317.99 rows=22211 width=58) (actual time=13828.913..13830.167 rows=8762 loops=1)
         Sort Key: resource_milestone_events.created_at
         Sort Method: quicksort  Memory: 1069kB
         Buffers: shared hit=41713 read=9940 dirtied=439
         I/O Timings: read=13483.272
         ->  HashAggregate  (cost=52436.83..52658.94 rows=22211 width=58) (actual time=13821.608..13824.837 rows=8762 loops=1)
               Group Key: ('milestone'::text), resource_milestone_events.created_at, resource_milestone_events.milestone_id, resource_milestone_events.action, resource_milestone_events.issue_id
               Buffers: shared hit=41710 read=9940 dirtied=439
               I/O Timings: read=13483.272
               ->  Append  (cost=4796.84..52159.19 rows=22211 width=58) (actual time=4842.174..13799.998 rows=8762 loops=1)
                     Buffers: shared hit=41710 read=9940 dirtied=439
                     I/O Timings: read=13483.272
                     ->  Nested Loop  (cost=4796.84..22132.39 rows=8993 width=58) (actual time=4842.172..10218.435 rows=7170 loops=1)
                           Buffers: shared hit=14215 read=7258 dirtied=426
                           I/O Timings: read=10005.887
                           ->  HashAggregate  (cost=4796.41..4848.74 rows=5233 width=8) (actual time=4834.942..4838.598 rows=3020 loops=1)
                                 Group Key: resource_milestone_events_1.issue_id
                                 Buffers: shared hit=2174 read=3069 dirtied=353
                                 I/O Timings: read=4717.000
                                 ->  Index Scan using index_resource_milestone_events_on_milestone_id on public.resource_milestone_events resource_milestone_events_1  (cost=0.43..4783.31 rows=5243 width=8) (actual time=11.451..4825.774 rows=5254 loops=1)
                                       Index Cond: (resource_milestone_events_1.milestone_id = 1233752)
                                       Filter: ((resource_milestone_events_1.created_at <= '2020-08-17 23:59:59.999999+00'::timestamp with time zone) AND (resource_milestone_events_1.action = 1))
                                       Rows Removed by Filter: 78
                                       Buffers: shared hit=2174 read=3069 dirtied=353
                                       I/O Timings: read=4717.000
                           ->  Index Scan using index_resource_milestone_events_on_issue_id on public.resource_milestone_events  (cost=0.43..3.27 rows=2 width=26) (actual time=0.880..1.778 rows=2 loops=3020)
                                 Index Cond: (resource_milestone_events.issue_id = resource_milestone_events_1.issue_id)
                                 Filter: (resource_milestone_events.created_at <= '2020-08-17 23:59:59.999999+00'::timestamp with time zone)
                                 Rows Removed by Filter: 0
                                 Buffers: shared hit=12041 read=4189 dirtied=73
                                 I/O Timings: read=5288.887
                     ->  Subquery Scan on *SELECT* 2  (cost=4796.84..19563.96 rows=6143 width=58) (actual time=42.200..1983.551 rows=562 loops=1)
                           Buffers: shared hit=13423 read=1460 dirtied=5
                           I/O Timings: read=1925.696
                           ->  Nested Loop  (cost=4796.84..19487.17 rows=6143 width=52) (actual time=42.197..1982.876 rows=562 loops=1)
                                 Buffers: shared hit=13423 read=1460 dirtied=5
                                 I/O Timings: read=1925.696
                                 ->  HashAggregate  (cost=4796.41..4848.74 rows=5233 width=8) (actual time=9.463..12.701 rows=3020 loops=1)
                                       Group Key: resource_milestone_events_2.issue_id
                                       Buffers: shared hit=5240
                                       ->  Index Scan using index_resource_milestone_events_on_milestone_id on public.resource_milestone_events resource_milestone_events_2  (cost=0.43..4783.31 rows=5243 width=8) (actual time=0.227..7.288 rows=5254 loops=1)
                                             Index Cond: (resource_milestone_events_2.milestone_id = 1233752)
                                             Filter: ((resource_milestone_events_2.created_at <= '2020-08-17 23:59:59.999999+00'::timestamp with time zone) AND (resource_milestone_events_2.action = 1))
                                             Rows Removed by Filter: 78
                                             Buffers: shared hit=5240
                                 ->  Index Scan using index_resource_state_events_on_issue_id_and_created_at on public.resource_state_events  (cost=0.43..2.78 rows=1 width=18) (actual time=0.623..0.651 rows=0 loops=3020)
                                       Index Cond: ((resource_state_events.issue_id = resource_milestone_events_2.issue_id) AND (resource_state_events.created_at <= '2020-08-17 23:59:59.999999+00'::timestamp with time zone))
                                       Buffers: shared hit=8183 read=1460 dirtied=5
                                       I/O Timings: read=1925.696
                     ->  Subquery Scan on *SELECT* 3  (cost=4796.83..10261.86 rows=7075 width=58) (actual time=16.617..1594.896 rows=1030 loops=1)
                           Buffers: shared hit=14072 read=1222 dirtied=8
                           I/O Timings: read=1551.689
                           ->  Nested Loop  (cost=4796.83..10173.42 rows=7075 width=54) (actual time=16.615..1594.053 rows=1030 loops=1)
                                 Buffers: shared hit=14072 read=1222 dirtied=8
                                 I/O Timings: read=1551.689
                                 ->  HashAggregate  (cost=4796.41..4848.74 rows=5233 width=8) (actual time=9.061..11.377 rows=3020 loops=1)
                                       Group Key: resource_milestone_events_3.issue_id
                                       Buffers: shared hit=5240
                                       ->  Index Scan using index_resource_milestone_events_on_milestone_id on public.resource_milestone_events resource_milestone_events_3  (cost=0.43..4783.31 rows=5243 width=8) (actual time=0.047..6.878 rows=5254 loops=1)
                                             Index Cond: (resource_milestone_events_3.milestone_id = 1233752)
                                             Filter: ((resource_milestone_events_3.created_at <= '2020-08-17 23:59:59.999999+00'::timestamp with time zone) AND (resource_milestone_events_3.action = 1))
                                             Rows Removed by Filter: 78
                                             Buffers: shared hit=5240
                                 ->  Index Scan using index_resource_weight_events_on_issue_id_and_weight on public.resource_weight_events  (cost=0.42..1.00 rows=1 width=20) (actual time=0.484..0.523 rows=0 loops=3020)
                                       Index Cond: (resource_weight_events.issue_id = resource_milestone_events_3.issue_id)
                                       Filter: (resource_weight_events.created_at <= '2020-08-17 23:59:59.999999+00'::timestamp with time zone)
                                       Rows Removed by Filter: 0
                                       Buffers: shared hit=8832 read=1222 dirtied=8
                                       I/O Timings: read=1551.689
Time: 13.835 s
  - planning: 1.190 ms
  - execution: 13.834 s
    - I/O read: 13.483 s
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 41713 (~325.90 MiB) from the buffer pool
  - reads: 9940 (~77.70 MiB) from the OS file cache, including disk I/O
  - dirtied: 439 (~3.40 MiB)
  - writes: 0

https://explain.depesz.com/s/Pcyk

Query plan for a large milestone on GitLab.com (after index)
 Limit  (cost=53882.20..53937.93 rows=22291 width=58) (actual time=252.750..254.773 rows=8762 loops=1)
   Buffers: shared hit=40571 read=4473 dirtied=407
   I/O Timings: read=155.150
   ->  Sort  (cost=53882.20..53937.93 rows=22291 width=58) (actual time=252.749..253.646 rows=8762 loops=1)
         Sort Key: resource_milestone_events.created_at
         Sort Method: quicksort  Memory: 1069kB
         Buffers: shared hit=40571 read=4473 dirtied=407
         I/O Timings: read=155.150
         ->  HashAggregate  (cost=52049.42..52272.33 rows=22291 width=58) (actual time=246.127..249.222 rows=8762 loops=1)
               Group Key: ('milestone'::text), resource_milestone_events.created_at, resource_milestone_events.milestone_id, resource_milestone_events.action, resource_milestone_events.issue_id
               Buffers: shared hit=40568 read=4473 dirtied=407
               I/O Timings: read=155.150
               ->  Append  (cost=4632.89..51770.78 rows=22291 width=58) (actual time=15.167..238.935 rows=8762 loops=1)
                     Buffers: shared hit=40568 read=4473 dirtied=407
                     I/O Timings: read=155.150
                     ->  Nested Loop  (cost=4632.89..22018.06 rows=9025 width=58) (actual time=15.166..95.099 rows=7170 loops=1)
                           Buffers: shared hit=17479 read=1791 dirtied=394
                           I/O Timings: read=51.337
                           ->  HashAggregate  (cost=4632.46..4684.98 rows=5252 width=8) (actual time=13.051..14.336 rows=3020 loops=1)
                                 Group Key: resource_milestone_events_1.issue_id
                                 Buffers: shared hit=3023 read=17 dirtied=319
                                 I/O Timings: read=0.598
                                 ->  Index Scan using index_resource_milestone_events_on_milestone_id_and_add_action on public.resource_milestone_events resource_milestone_events_1  (cost=0.43..4619.30 rows=5262 width=8) (actual time=0.187..11.346 rows=5254 loops=1)
                                       Index Cond: (resource_milestone_events_1.milestone_id = 1233752)
                                       Filter: (resource_milestone_events_1.created_at <= '2020-08-17 23:59:59.999999+00'::timestamp with time zone)
                                       Rows Removed by Filter: 0
                                       Buffers: shared hit=3023 read=17 dirtied=319
                                       I/O Timings: read=0.598
                           ->  Index Scan using index_resource_milestone_events_on_issue_id on public.resource_milestone_events  (cost=0.43..3.27 rows=2 width=26) (actual time=0.022..0.026 rows=2 loops=3020)
                                 Index Cond: (resource_milestone_events.issue_id = resource_milestone_events_1.issue_id)
                                 Filter: (resource_milestone_events.created_at <= '2020-08-17 23:59:59.999999+00'::timestamp with time zone)
                                 Rows Removed by Filter: 0
                                 Buffers: shared hit=14456 read=1774 dirtied=75
                                 I/O Timings: read=50.739
                     ->  Subquery Scan on *SELECT* 2  (cost=4632.89..19443.85 rows=6165 width=58) (actual time=19.094..93.488 rows=562 loops=1)
                           Buffers: shared hit=11220 read=1460 dirtied=5
                           I/O Timings: read=74.691
                           ->  Nested Loop  (cost=4632.89..19366.79 rows=6165 width=52) (actual time=19.092..93.189 rows=562 loops=1)
                                 Buffers: shared hit=11220 read=1460 dirtied=5
                                 I/O Timings: read=74.691
                                 ->  HashAggregate  (cost=4632.46..4684.98 rows=5252 width=8) (actual time=5.998..7.101 rows=3020 loops=1)
                                       Group Key: resource_milestone_events_2.issue_id
                                       Buffers: shared hit=3037
                                       ->  Index Scan using index_resource_milestone_events_on_milestone_id_and_add_action on public.resource_milestone_events resource_milestone_events_2  (cost=0.43..4619.30 rows=5262 width=8) (actual time=0.022..4.470 rows=5254 loops=1)
                                             Index Cond: (resource_milestone_events_2.milestone_id = 1233752)
                                             Filter: (resource_milestone_events_2.created_at <= '2020-08-17 23:59:59.999999+00'::timestamp with time zone)
                                             Rows Removed by Filter: 0
                                             Buffers: shared hit=3037
                                 ->  Index Scan using index_resource_state_events_on_issue_id_and_created_at on public.resource_state_events  (cost=0.43..2.78 rows=1 width=18) (actual time=0.026..0.028 rows=0 loops=3020)
                                       Index Cond: ((resource_state_events.issue_id = resource_milestone_events_2.issue_id) AND (resource_state_events.created_at <= '2020-08-17 23:59:59.999999+00'::timestamp with time zone))
                                       Buffers: shared hit=8183 read=1460 dirtied=5
                                       I/O Timings: read=74.691
                     ->  Subquery Scan on *SELECT* 3  (cost=4632.88..10107.16 rows=7101 width=58) (actual time=8.100..48.933 rows=1030 loops=1)
                           Buffers: shared hit=11869 read=1222 dirtied=8
                           I/O Timings: read=29.122
                           ->  Nested Loop  (cost=4632.88..10018.40 rows=7101 width=54) (actual time=8.099..48.519 rows=1030 loops=1)
                                 Buffers: shared hit=11869 read=1222 dirtied=8
                                 I/O Timings: read=29.122
                                 ->  HashAggregate  (cost=4632.46..4684.98 rows=5252 width=8) (actual time=6.109..7.130 rows=3020 loops=1)
                                       Group Key: resource_milestone_events_3.issue_id
                                       Buffers: shared hit=3037
                                       ->  Index Scan using index_resource_milestone_events_on_milestone_id_and_add_action on public.resource_milestone_events resource_milestone_events_3  (cost=0.43..4619.30 rows=5262 width=8) (actual time=0.041..4.592 rows=5254 loops=1)
                                             Index Cond: (resource_milestone_events_3.milestone_id = 1233752)
                                             Filter: (resource_milestone_events_3.created_at <= '2020-08-17 23:59:59.999999+00'::timestamp with time zone)
                                             Rows Removed by Filter: 0
                                             Buffers: shared hit=3037
                                 ->  Index Scan using index_resource_weight_events_on_issue_id_and_weight on public.resource_weight_events  (cost=0.42..1.00 rows=1 width=20) (actual time=0.012..0.013 rows=0 loops=3020)
                                       Index Cond: (resource_weight_events.issue_id = resource_milestone_events_3.issue_id)
                                       Filter: (resource_weight_events.created_at <= '2020-08-17 23:59:59.999999+00'::timestamp with time zone)
                                       Rows Removed by Filter: 0
                                       Buffers: shared hit=8832 read=1222 dirtied=8
                                       I/O Timings: read=29.122
Time: 257.493 ms
  - planning: 1.415 ms
  - execution: 256.078 ms
    - I/O read: 155.150 ms
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 40571 (~317.00 MiB) from the buffer pool
  - reads: 4473 (~34.90 MiB) from the OS file cache, including disk I/O
  - dirtied: 407 (~3.20 MiB)
  - writes: 0

https://explain.depesz.com/s/9Cag

Migration output

Up

== 20200729181641 AddIndexToResourceMilestoneEventsAddEvents: migrating =======
-- transaction_open?()
   -> 0.0000s
-- index_exists?(:resource_milestone_events, :milestone_id, {:where=>"action = 1", :name=>"index_resource_milestone_events_on_milestone_id_and_add_action", :algorithm=>:concurrently})
   -> 0.0028s
-- add_index(:resource_milestone_events, :milestone_id, {:where=>"action = 1", :name=>"index_resource_milestone_events_on_milestone_id_and_add_action", :algorithm=>:concurrently})
   -> 0.0027s
== 20200729181641 AddIndexToResourceMilestoneEventsAddEvents: migrated (0.0058s)

Down

== 20200729181641 AddIndexToResourceMilestoneEventsAddEvents: reverting =======
-- transaction_open?()
   -> 0.0000s
-- index_exists?(:resource_milestone_events, :milestone_id, {:name=>"index_resource_milestone_events_on_milestone_id_and_add_action", :algorithm=>:concurrently})
   -> 0.0044s
-- remove_index(:resource_milestone_events, {:name=>"index_resource_milestone_events_on_milestone_id_and_add_action", :algorithm=>:concurrently, :column=>:milestone_id})
   -> 0.0036s
== 20200729181641 AddIndexToResourceMilestoneEventsAddEvents: reverted (0.0084s)

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team
Edited by Heinrich Lee Yu

Merge request reports

Loading