Let tie breaker order follow global sort direction
What does this MR do?
We use a "tie breaker" for consistently ordering the results in the API. This is particularly relevant for cases where the order-direction is based on a non-unique column, for example name
(for projects). Let's assume we want to order by name asc
, then the tie breaker id desc
is added to the query. This results in a query with ORDER BY name ASC, id DESC
, such that the order is well defined.
Now, additionally to order_by
options, we typically also allow to specify sort=asc|desc
to define the direction. However, the tie breaker order is always the same: id DESC
. This results in "mixed" directions like in the example above, where we order by name ASC
and id DESC
.
Consider we only had a "standard" index on name ASC, id ASC
. We can't use it for name ASC, id DESC
:
ORDER BY clause | can use standard index? |
---|---|
name ASC, id ASC |
|
name ASC, id DESC |
|
name DESC, id ASC |
|
name DESC, id DESC |
In order to support these combinations, we have to have the following indexes:
Index | Forward scan | Backwards scan |
---|---|---|
name ASC, id ASC |
name ASC, id ASC |
name DESC, id DESC |
name ASC, id DESC |
name ASC, id DESC |
name DESC, id ASC |
With the tie breaker always defined as id DESC
, we only need two scan types here - unfortunately each requires their own index.
With this change, we let the tie breaker follow the actual sorting direction. That is, we only have two combinations now:
name ASC, id ASC
name DESC, id DESC
Both are supported by a standard index (see above). We don't need another index on name ASC, id DESC
.
Change in behavior
There is a change in behavior here: Let's use order_by=name
as an example. This only applies for sort=asc
: If there are records in a page with a duplicate name
, those records are now going to show up in the reverse direction than before. The same applies to all other order_by
options across the API.
There are no CI tests that specifically check for this behavior. I have not found any documentation that talks about it, too.
Relates to: #195881 (closed)
Next step
The next step is to validate that the targeted indexes are not needed anymore. We can do this by checking the index statistics on the database hosts:
gitlabhq_production=# select * from pg_stat_user_indexes where indexrelname ~* 'vis20.*desc';
relid | indexrelid | schemaname | relname | indexrelname | idx_scan | idx_tup_read | idx_tup_fetch
-------+------------+------------+----------+---------------------------------------------------+----------+--------------+---------------
33706 | 364202457 | public | projects | index_projects_api_vis20_created_at_id_desc | 260 | 35072 | 34712
33706 | 364202526 | public | projects | index_projects_api_vis20_last_activity_at_id_desc | 1684 | 11953789 | 11380596
33706 | 364202598 | public | projects | index_projects_api_vis20_updated_at_id_desc | 214 | 132258 | 132254
33706 | 364202675 | public | projects | index_projects_api_vis20_name_id_desc | 198 | 235620 | 235481
33706 | 364202768 | public | projects | index_projects_api_vis20_path_id_desc | 241 | 256361 | 256297
(5 rows)
Those stats should not increase anymore after the change got deployed, assuming the indexes are only picked up for API-related queries.
Does this MR meet the acceptance criteria?
Conformity
-
Changelog entry -
Documentation (if required) -
Code review guidelines -
Merge request performance guidelines -
Style guides -
Database guides -
Separation of EE specific content
Availability and Testing
The behavior we are changing here is not covered by existing specs. This MR adds specs for the helper but doesn't come with request spec changes.
-
Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process.