Clusters::Applications::Ingress move distinct_by_user batch counting
What does this MR do?
Move ::Clusters::Applications::Ingress.where(time_period).distinct_by_user,
to batch distinct count
The purpose of the change
This changes are a performance improvement of count queries and a refactoring. We prefer to use batch counting and optimize indexes if is the case in order to have the queries under 1 sec and avoid the timeouts.
Proposed solution
- Refactor the counter to use batch counting
- Ensure that the resulted queries are < 1 sec in database-lab
- No need for optimization
time_period = {}
clusters_applications_cert_managers: distinct_count(::Clusters::Applications::Ingress.where(time_period).available.joins(:cluster), 'clusters.user_id'),
time_period = { created_at: 28.days.ago..Time.current }
clusters_applications_cert_managers: distinct_count(::Clusters::Applications::Ingress.where(time_period).available.joins(:cluster), 'clusters.user_id'),
Queries
MIN with no period
SELECT MIN(clusters.user_id)
FROM "clusters_applications_ingress"
INNER JOIN "clusters" ON "clusters"."id" = "clusters_applications_ingress"."cluster_id"
WHERE "clusters_applications_ingress"."status" IN (3,5)
https://explain.depesz.com/s/Fvjh
Time: 265.366 msFilter: (clusters_applications_ingress.status = ANY ('{3,5}'::integer[]))
MAX with no period
SELECT MAX(clusters.user_id)
FROM "clusters_applications_ingress"
INNER JOIN "clusters" ON "clusters"."id" = "clusters_applications_ingress"."cluster_id"
WHERE "clusters_applications_ingress"."status" IN (3,5)
https://explain.depesz.com/s/aqfm
Time: 23.725 msFilter: (clusters_applications_ingress.status = ANY ('{3,5}'::integer[]))
COUNT with no period
SELECT COUNT(DISTINCT clusters.user_id)
FROM "clusters_applications_ingress"
INNER JOIN "clusters" ON "clusters"."id" = "clusters_applications_ingress"."cluster_id"
WHERE "clusters_applications_ingress"."status" IN (3,5)
AND "clusters"."user_id" BETWEEN 0 AND 9999
https://explain.depesz.com/s/rSZb
Time: 2.417 msFilter: Filter: (clusters_applications_ingress.status = ANY ('{3,5}'::integer[]))
MIN with period
SELECT MIN(clusters.user_id)
FROM "clusters_applications_ingress"
INNER JOIN "clusters" ON "clusters"."id" = "clusters_applications_ingress"."cluster_id"
WHERE "clusters_applications_ingress"."created_at" BETWEEN '2020-03-10 07:15:09.560591' AND '2020-04-07 07:15:09.560752'
AND "clusters_applications_ingress"."status" IN (3,5)
https://explain.depesz.com/s/ZWel
Time: 5.637 msFilter: ((clusters_applications_ingress.created_at >= '2020-03-10 07:15:09.560591+00'::timestamp with time zone) AND (clusters_applications_ingress.created_at <= '2020-04-07 07:15:09.560752+00'::timestamp with time zone) AND (clusters_applications_ingress.status = ANY ('{3,5}'::integer[])))
MAX with period
SELECT MAX(clusters.user_id)
FROM "clusters_applications_ingress"
INNER JOIN "clusters" ON "clusters"."id" = "clusters_applications_ingress"."cluster_id"
WHERE "clusters_applications_ingress"."created_at" BETWEEN '2020-03-10 07:15:09.560591' AND '2020-04-07 07:15:09.560752'
AND "clusters_applications_ingress"."status" IN (3,5)
https://explain.depesz.com/s/jQ5L
Time: 3.865 msFilter: ((clusters_applications_ingress.created_at >= '2020-03-10 07:15:09.560591+00'::timestamp with time zone) AND (clusters_applications_ingress.created_at <= '2020-04-07 07:15:09.560752+00'::timestamp with time zone) AND (clusters_applications_ingress.status = ANY ('{3,5}'::integer[])))
COUNT with period
SELECT COUNT(DISTINCT clusters.user_id)
FROM "clusters_applications_ingress"
INNER JOIN "clusters" ON "clusters"."id" = "clusters_applications_ingress"."cluster_id"
WHERE "clusters_applications_ingress"."created_at" BETWEEN '2020-03-10 07:15:09.560591' AND '2020-04-07 07:15:09.560752'
AND "clusters_applications_ingress"."status" IN (3, 5)
AND "clusters"."user_id" BETWEEN 0 AND 9999
https://explain.depesz.com/s/8P01k
Time: 1.356 msFilter: ((clusters_applications_ingress.created_at >= '2020-03-10 07:15:09.560591+00'::timestamp with time zone) AND (clusters_applications_ingress.created_at <= '2020-04-07 07:15:09.560752+00'::timestamp with time zone) AND (clusters_applications_ingress.status = ANY ('{3,5}'::integer[])))
Part of #212962 (closed)