When sorting by merged_at only use merge_request_metrics
What does this MR do?
Resolves #276387 (closed)
This MR attempts extend the keyset pagination implementation for GraphQL to allow custom ordering where the primary_table.id
is not part of the sorting. The custom ordering must be "stable" (tie breaker).
The change introduces a new class called OrderList
which knows about the OrderItems
. Iy can tell and validate if we have "stable" sorting present.
Example Query:
query test {
project(fullPath: "gnuwget/wget2") {
name
webUrl
mergeRequests(sort: MERGED_AT_DESC) {
nodes {
iid
}
}
}
}
Problem:
Unfortunately, when using GraphQL to fetch merge requests, the query after the change is still ordering by merge_requests.id
, and that's because of lib/gitlab/graphql/pagination/keyset/connection.rb:170
attempting to add a tie-breaking column:
if list&.last&.attribute_name != items.primary_key
items.order(arel_table[items.primary_key].desc) # rubocop: disable CodeReuse/ActiveRecord
end
attribute_name
in our case is merge_request_metrics.merged_at
which is not the primary key and so the additional ordering is added. This makes sense in general, but not here, because we want to avoid sorting by columns in different tables, and we already tie-break by sorting on merge_request_metrics.merge_request_id DESC
, so adding a primary key is unnecessary.
Note that it would be ok to include in the sorting the primary key of merge_request_metrics
, but that's not what the pagination code is doing, because the arel_table
is merge_requests
there. I am not sure how to proceed in a way that is robust and generic.
See the query:
SELECT
"merge_requests".*,
"merge_request_metrics"."merged_at" AS "merge_request_metrics.merged_at"
FROM
"merge_requests"
INNER JOIN "merge_request_metrics" ON "merge_request_metrics"."merge_request_id" = "merge_requests"."id"
WHERE
"merge_requests"."target_project_id" = 3
AND "merge_requests"."target_project_id" = "merge_request_metrics"."target_project_id"
AND "merge_request_metrics"."merged_at" >= '2019-10-07 22:00:00'
AND "merge_request_metrics"."merged_at" <= '2020-10-06 22:00:00'
AND "merge_requests"."target_project_id" = "merge_request_metrics"."target_project_id"
ORDER BY
merge_request_metrics.merged_at DESC NULLS LAST,
merge_request_metrics.merge_request_id DESC,
"merge_requests"."id" DESC
LIMIT 20;
The query above is inefficient, because it needs to load all metrics rows in order to apply sorting by the merge_requests.id
column.
Query after MR changes:
SELECT
"merge_requests".*,
"merge_request_metrics"."merged_at" AS "merge_request_metrics.merged_at",
"merge_request_metrics"."id" AS "merge_request_metrics.id"
FROM
"merge_requests"
INNER JOIN "merge_request_metrics" ON "merge_request_metrics"."merge_request_id" = "merge_requests"."id"
WHERE
"merge_requests"."target_project_id" = 1
AND "merge_requests"."target_project_id" = "merge_request_metrics"."target_project_id"
AND "merge_request_metrics"."merged_at" >= '2019-10-07 22:00:00'
AND "merge_request_metrics"."merged_at" <= '2020-10-06 22:00:00'
AND "merge_requests"."target_project_id" = "merge_request_metrics"."target_project_id"
ORDER BY
merge_request_metrics.merged_at DESC NULLS LAST,
"merge_request_metrics"."id" DESC
LIMIT 20
-> ORDER BY "merge_requests"."id" DESC
is gone.
This query is efficient: it fetches 20 rows from the DB since the order columns are part of the index.
Does this MR meet the acceptance criteria?
Conformity
-
Changelog entry -
Documentation (if required) -
Code review guidelines -
Merge request performance guidelines -
Style guides -
Database guides -
Separation of EE specific content
Availability and Testing
-
Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process. -
Tested in all supported browsers -
Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
-
Label as security and @ mention @gitlab-com/gl-security/appsec
-
The MR includes necessary changes to maintain consistency between UI, API, email, or other methods -
Security reports checked/validated by a reviewer from the AppSec team