Skip to content

Optimise events count sql and pagination

Alexandru Croitor requested to merge attempt_to_optimise_events_count_sql into master

What does this MR do and why?

When doing count on the events no need to preload the extra associations as that does not affect the counts but makes query slower as it needs to load way more data than needed.

Database

Project events count

-- Fingerprint: 30aaf3b9a4d909b9
SELECT
    COUNT(DISTINCT "events"."id")
FROM
    "events"
    INNER JOIN "projects" ON "projects"."id" = "events"."project_id"
    LEFT OUTER JOIN "project_features" ON "project_features"."project_id" = "projects"."id"
    LEFT OUTER JOIN "project_import_data" ON "project_import_data"."project_id" = "projects"."id"
    LEFT OUTER JOIN "namespaces" ON "namespaces"."id" = "projects"."namespace_id"
    LEFT OUTER JOIN "users" ON "users"."id" = "events"."author_id"
WHERE
    "events"."project_id" = 278964
    AND (EXISTS (
            SELECT
                1
            FROM
                "project_authorizations"
            WHERE
                "project_authorizations"."user_id" = 3483274
                AND (project_authorizations.project_id = projects.id))
            OR projects.visibility_level IN (10, 20));
                                          QUERY PLAN
------------------------------------------------------------------------------------------------------
 Aggregate  (cost=2792276.54..2792276.55 rows=1 width=8) (actual time=51665.298..51665.314 rows=1 loops=1)
   ->  Nested Loop  (cost=1.44..2786623.21 rows=2261332 width=8) (actual time=5.692..50900.694 rows=2528197 loops=1)
         ->  Nested Loop Left Join  (cost=0.86..10.51 rows=1 width=4) (actual time=0.867..0.907 rows=1 loops=1)
               Join Filter: (project_import_data.project_id = projects.id)
               ->  Index Scan using projects_pkey on projects  (cost=0.44..7.05 rows=1 width=8) (actual time=0.771..0.796 rows=1 loops=1)
                     Index Cond: (id = 278964)
                     Filter: ((alternatives: SubPlan 1 or hashed SubPlan 2) OR (visibility_level = ANY ('{10,20}'::integer[])))
                     SubPlan 1
                       ->  Index Only Scan using project_authorizations_pkey on project_authorizations  (cost=0.57..3.59 rows=1 width=0) (actual time=0.531..0.533 rows=1 loops=1)
                             Index Cond: ((user_id = 3483274) AND (project_id = projects.id))
                             Heap Fetches: 0
                     SubPlan 2
                       ->  Index Only Scan using project_authorizations_pkey on project_authorizations project_authorizations_1  (cost=0.57..1621.30 rows=6203 width=4) (never executed)
                             Index Cond: (user_id = 3483274)
                             Heap Fetches: 0
               ->  Index Only Scan using index_project_import_data_on_project_id on project_import_data  (cost=0.42..3.44 rows=1 width=4) (actual time=0.094..0.094 rows=0 loops=1)
                     Index Cond: (project_id = 278964)
                     Heap Fetches: 0
         ->  Index Scan using index_events_on_project_id_and_id on events  (cost=0.58..2763999.38 rows=2261332 width=16) (actual time=4.823..50376.182 rows=2528197 loops=1)
               Index Cond: (project_id = 278964)
 Planning Time: 21.679 ms
 Execution Time: 51665.784 ms
(22 rows)
Slightly modified above query

explain analyze
SELECT
    COUNT(*)
FROM
    "events"
    INNER JOIN "projects" ON "projects"."id" = "events"."project_id"
WHERE
    "events"."project_id" = 278964
    AND (EXISTS (
            SELECT
                1
            FROM
                "project_authorizations"
            WHERE
                "project_authorizations"."user_id" = 3483274
                AND (project_authorizations.project_id = projects.id))
            OR projects.visibility_level IN (10, 20));
                                          QUERY PLAN
------------------------------------------------------------------------------------------------------
 Aggregate  (cost=83220.13..83220.14 rows=1 width=8) (actual time=1277.742..1277.746 rows=1 loops=1)
   ->  Nested Loop  (cost=1.02..77566.80 rows=2261332 width=0) (actual time=0.147..1030.170 rows=2528197 loops=1)
         ->  Index Scan using projects_pkey on projects  (cost=0.44..7.05 rows=1 width=4) (actual time=0.077..0.082 rows=1 loops=1)
               Index Cond: (id = 278964)
               Filter: ((alternatives: SubPlan 1 or hashed SubPlan 2) OR (visibility_level = ANY ('{10,20}'::integer[])))
               SubPlan 1
                 ->  Index Only Scan using project_authorizations_pkey on project_authorizations  (cost=0.57..3.59 rows=1 width=0) (actual time=0.043..0.044 rows=1 loops=1)
                       Index Cond: ((user_id = 3483274) AND (project_id = projects.id))
                       Heap Fetches: 0
               SubPlan 2
                 ->  Index Only Scan using project_authorizations_pkey on project_authorizations project_authorizations_1  (cost=0.57..1621.30 rows=6203 width=4) (never executed)
                       Index Cond: (user_id = 3483274)
                       Heap Fetches: 0
         ->  Index Only Scan using index_events_on_project_id_and_id on events  (cost=0.58..54946.43 rows=2261332 width=4) (actual time=0.070..621.520 rows=2528197 loops=1)
               Index Cond: (project_id = 278964)
               Heap Fetches: 18112
 Planning Time: 8.807 ms
 Execution Time: 1277.977 ms
(18 rows)

Screenshots or screen recordings

These are strongly recommended to assist reviewers and reduce the time to merge your change.

How to set up and validate locally

Numbered steps to set up and validate the change are strongly suggested.

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Alexandru Croitor

Merge request reports

Loading