Improve performance of user contribution graph query
What does this MR do and why?
Some contribution graphs are very slow. Specially for users that have many years of contributions. This is mainly due to the inefficient filtering on created_at
.
Sample URLs:
This MR improves this by doing 3 things:
-
Simplifies the query so that we
GROUP BY date
outside theUNION
and return the totals instead of aggregating in Ruby. This also removes extra columns we're selecting that are never used. This results in the query returning at most 366 rows instead of 1 row per target type, action, and project. -
Avoids the use of the
projects
table to get theproject_id
since we can useproject_features.project_id
. This change moves the scopes to theProjectFeature
model so we can use it without joining withProject
. -
Improves the index by adding
action
,target_type
, andcreated_at
.Before, it was doing something like this:
-> Index Scan using index_events_on_author_id_and_project_id on public.events (cost=0.58..3.60 rows=1 width=12) (actual time=0.376..1.098 rows=51 loops=120) Index Cond: ((events.author_id = 64248) AND (events.project_id = projects.id)) Filter: ((events.created_at >= '2020-11-22 00:00:00+00'::timestamp with time zone) AND (events.created_at <= '2021-11-22 23:59:59.999999+00'::timestamp with time zone) AND (events.action = 5)) Rows Removed by Filter: 599 Buffers: shared hit=73456 I/O Timings: read=0.000 write=0.000
With this new index:
-> Index Only Scan using index_events_on_author_id_and_project_id_action_target_type_cre on public.events (cost=0.70..2.27 rows=1 width=12) (actual time=0.010..0.045 rows=51 loops=120) Index Cond: ((events.author_id = 64248) AND (events.project_id = project_features.project_id) AND (events.action = 5) AND (events.target_type IS NULL) AND (events.created_at >= '2020-11-22 00:00:00+00'::timestamp with time zone) AND (events.created_at <= '2021-11-22 23:59:59.999999+00'::timestamp with time zone)) Heap Fetches: 61 Buffers: shared hit=2190 I/O Timings: read=0.000 write=0.000
-
Old query: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/7355/commands/26211
-
New query: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/7318/commands/26102
Migration output
== 20211122033501 ImproveIndexOnEventsForCalendar: migrating ==================
== 20211122033501 ImproveIndexOnEventsForCalendar: migrated (0.0026s) =========
== 20211122033501 ImproveIndexOnEventsForCalendar: reverting ==================
== 20211122033501 ImproveIndexOnEventsForCalendar: reverted (0.0023s) =========
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
-
I have evaluated the MR acceptance checklist for this MR.