Changes candidate table to Keyset Pagination
What does this MR do and why?
Follow up from !108242 (merged), changes page pagination to keyset pagination
Screenshots
Before | after |
---|---|
Database
Fetching second page, with query, ordering by metric, limit 30:
SELECT
ml_candidates.*,
latest.value AS metric_value
FROM
"ml_candidates"
INNER JOIN ( SELECT DISTINCT ON (candidate_id, name)
*
FROM
"ml_candidate_metrics"
WHERE
"ml_candidate_metrics"."name" = 'accuracy'
ORDER BY
candidate_id,
name,
id DESC) latest ON latest.candidate_id = ml_candidates.id
WHERE
"ml_candidates"."experiment_id" = 13
AND (ml_candidates.name LIKE '%andidate%')
AND ('7.1' IS NULL
AND "latest"."value" IS NULL
AND "ml_candidates"."id" < 125
OR '7.1' IS NOT NULL
AND "latest"."value" IS NULL
OR '7.1' IS NOT NULL
AND "latest"."value" < '7.1'
OR '7.1' IS NOT NULL
AND "latest"."value" = '7.1'
AND "ml_candidates"."id" < 125)
ORDER BY
"latest"."value" DESC,
"ml_candidates"."id" DESC
LIMIT 31
Execution Plan https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/14866/commands/51923
Limit (cost=12.28..12.29 rows=1 width=114) (actual time=2.656..2.659 rows=0 loops=1)
Buffers: shared hit=9 read=1
I/O Timings: read=2.584 write=0.000
-> Sort (cost=12.28..12.29 rows=1 width=114) (actual time=2.655..2.657 rows=0 loops=1)
Sort Key: latest.value DESC, ml_candidates.id DESC
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=9 read=1
I/O Timings: read=2.584 write=0.000
-> Nested Loop (cost=9.21..12.27 rows=1 width=114) (actual time=2.621..2.623 rows=0 loops=1)
Buffers: shared hit=3 read=1
I/O Timings: read=2.584 write=0.000
-> Index Scan using index_ml_candidates_on_experiment_id_and_iid on public.ml_candidates (cost=0.14..3.16 rows=1 width=106) (actual time=2.620..2.620 rows=0 loops=1)
Index Cond: (ml_candidates.experiment_id = 13)
Filter: (ml_candidates.name ~~ '%andidate%'::text)
Rows Removed by Filter: 0
Buffers: shared hit=3 read=1
I/O Timings: read=2.584 write=0.000
-> Subquery Scan on latest (cost=9.07..9.10 rows=1 width=16) (actual time=0.000..0.000 rows=0 loops=0)
Filter: ((latest.value IS NULL) OR (latest.value < '7.1'::double precision) OR (latest.value = '7.1'::double precision))
Rows Removed by Filter: 0
I/O Timings: read=0.000 write=0.000
-> Unique (cost=9.07..9.08 rows=1 width=90) (actual time=0.000..0.000 rows=0 loops=0)
I/O Timings: read=0.000 write=0.000
-> Sort (cost=9.07..9.08 rows=1 width=90) (actual time=0.000..0.000 rows=0 loops=0)
Sort Key: ml_candidate_metrics.candidate_id, ml_candidate_metrics.id DESC
I/O Timings: read=0.000 write=0.000
-> Seq Scan on public.ml_candidate_metrics (cost=0.00..9.06 rows=1 width=90) (actual time=0.000..0.000 rows=0 loops=0)
Filter: (ml_candidate_metrics.name = 'accuracy'::text)
Rows Removed by Filter: 0
I/O Timings: read=0.000 write=0.000
How to set up and validate locally
-
Enable the feature flag
echo "Feature.enable(:ml_experiment_tracking)" | bundle exec rails c
-
Create an Experiment and 10 candidates
# Added variables user_id = 1 project_id = 1 exp = Ml::Experiment.create!(name: "Awesome Gitlab Experiment", user_id: user_id, project_id: project_id) 100.times.each { |i| exp.candidates.create!(user_id: user_id, start_time: 0, name: "candidate_#{i}") } exp.candidates.each_with_index { |c, i| c.metrics.create!(name: "auc", value: i*0.1 , tracked_at: Time.zone.now, step: 1)} exp.candidates.each_with_index { |c, i| c.metrics.create!(name: "accuracy", value: i* 0.1+0.1 , tracked_at: Time.zone.now, step: 1)} exp.candidates.each { |c| c.params.create!(name: "algorithm", value: ["LogisticRegression", "DecisionTree"].sample )}
-
Navigate to
<your project>/-/ml/experiments
and click on "Awesome Gitlab Experiment" -
Verify that pagination uses cursor now
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
-
I have evaluated the MR acceptance checklist for this MR.
Edited by Eduardo Bonet