Skip to content

Add id_before, id_after filter param to projects API

Andreas Brandl requested to merge ab/projects-id-filter into master

What does this MR do?

This adds a id_before and id_after filter to the projects API. It is an extract from !19623 (merged) and a preparation to support keyset pagination on id.

Performance-wise, this translates into filters on projects.id > ? and projects.id < ?.

The standard example for GET /api/v4/projects?id_after=X is:

SELECT
    "projects".*
FROM
    "projects"
WHERE
    "projects"."visibility_level" = 20
    AND (id > 2348234)
    AND "projects"."pending_delete" = FALSE
ORDER BY
    "projects"."created_at" DESC,
    "projects"."id" DESC
LIMIT 20 OFFSET 0

These plans are surprisingly bad and we might want to fix this as this is our default filters/order criteria for GET /api/v4/projects. I opened a separate issue for this in #36135 (closed).

The point here is that the additional filter doesn't make it worse.

Let's look at this (rather unfortunate) example:

gitlabhq_production=# explain analyze select * from projects where id > 1 and name ~* 'idontexist' limit 20;

Plan: https://explain.depesz.com/s/3ZTv

So we basically always have id indexed as it's the primary key. We only rarely have composite indexes with id as a prefix, which would be ideal for these types of queries. In my testing, Postgres did a good job of choosing the right index strategy based on selectivity.

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

In the worst case, we'll see a combination of filters that yields a bad query plan. I can imagine this would be something like using an index for id > ? first and then ultimately scanning the whole index/table to evaluate other conditions. I haven't been able to produce a request like this in my testing though, as Postgres did make good index choices based on statistics.

Edited by 🤖 GitLab Bot 🤖

Merge request reports

Loading