Improve label search with issuable optimization
What does this MR do?
This MR addresses the label finder problem which was discovered when the optimized_issuable_label_filter
feature flag was enabled on PRD (#259719 (closed))
The original query is quite slow (cached execution): https://explain.depesz.com/s/QsW3
Note: the change is behind a feature flag. Later we'd like to enable it on production for a short period of time to measure the performance.
New query:
- Skips the
ORDER BY
, we don't care about the order. - Uses
UNION ALL
since having duplicated results is acceptable (unlikely though) - Matches the index on
group_id
andproject_id
.- For this I had to replace an existing index which was "broken".
SELECT "labels"."title",
"labels"."id"
FROM (
(SELECT "labels".*
FROM "labels"
WHERE "labels"."project_id" IS NULL
AND "labels"."title" IN ('bug')
AND "labels"."group_id" IN
(WITH RECURSIVE "base_and_descendants" AS (
(SELECT "namespaces".*
FROM "namespaces"
WHERE "namespaces"."type" = 'Group'
AND "namespaces"."id" = 9970)
UNION
(SELECT "namespaces".*
FROM "namespaces",
"base_and_descendants"
WHERE "namespaces"."type" = 'Group'
AND "namespaces"."parent_id" = "base_and_descendants"."id")) SELECT "id"
FROM "base_and_descendants" AS "namespaces"))
UNION ALL
(SELECT "labels".*
FROM "labels"
WHERE "labels"."group_id" is NULL
AND "labels"."title" IN ('bug')
AND "labels"."project_id" IN
(SELECT "projects"."id"
FROM "projects"
WHERE "projects"."namespace_id" IN
(WITH RECURSIVE "base_and_descendants" AS (
(SELECT "namespaces".*
FROM "namespaces"
WHERE "namespaces"."type" = 'Group'
AND "namespaces"."id" = 9970)
UNION
(SELECT "namespaces".*
FROM "namespaces",
"base_and_descendants"
WHERE "namespaces"."type" = 'Group'
AND "namespaces"."parent_id" = "base_and_descendants"."id")) SELECT "id"
FROM "base_and_descendants" AS "namespaces")))) labels
- Plan when 2 labels were given: https://explain.depesz.com/s/1aWh
- Plan when 1 label was given: https://explain.depesz.com/s/MOQG
The uncached measurements were <1s on db-lab.
Index replacement
I noticed that we have a weird index definition for group_id, title
:
"index_labels_on_group_id_and_title" btree (group_id, title) WHERE project_id = NULL::integer
Inspecting the index, it seems that it's empty:
Schema | Name | Type | Owner | Table | Size | Description
--------+------------------------------------+-------+--------+--------+------------+-------------
public | index_labels_on_group_id_and_title | index | gitlab | labels | 8192 bytes |
(1 row)
Verifying that the index is not in use:
schemaname | tablename | indexname | num_rows | table_size | index_size | unique | number_of_scans | tuples_read | tuples_fetched
------------+-----------+------------------------------------+------------+------------+------------+--------+-----------------+-------------+----------------
public | labels | index_labels_on_group_id_and_title | 1.2925e+07 | 1276 MB | 8192 bytes | N | 0 | 0 | 0
(1 row)
Up:
== 20201120125953 ReplaceUnusedLabelsIndex: migrating =========================
-- transaction_open?()
-> 0.0000s
-- index_exists?(:labels, [:group_id, :title], {:where=>"project_id IS NULL", :name=>"index_on_labels_group_id_and_title_with_null_project_id", :algorithm=>:concurrently})
-> 0.0027s
-- add_index(:labels, [:group_id, :title], {:where=>"project_id IS NULL", :name=>"index_on_labels_group_id_and_title_with_null_project_id", :algorithm=>:concurrently})
-> 0.0062s
-- transaction_open?()
-> 0.0000s
-- indexes(:labels)
-> 0.0032s
-- remove_index(:labels, {:algorithm=>:concurrently, :name=>"index_labels_on_group_id_and_title"})
-> 0.0019s
== 20201120125953 ReplaceUnusedLabelsIndex: migrated (0.0149s) ================
Down:
== 20201120125953 ReplaceUnusedLabelsIndex: reverting =========================
-- transaction_open?()
-> 0.0000s
-- index_exists?(:labels, [:group_id, :title], {:where=>"project_id = NULL::integer", :name=>"index_labels_on_group_id_and_title", :algorithm=>:concurrently})
-> 0.0036s
-- execute("SET statement_timeout TO 0")
-> 0.0006s
-- add_index(:labels, [:group_id, :title], {:where=>"project_id = NULL::integer", :name=>"index_labels_on_group_id_and_title", :algorithm=>:concurrently})
-> 0.0153s
-- execute("RESET ALL")
-> 0.0006s
-- transaction_open?()
-> 0.0000s
-- indexes(:labels)
-> 0.0029s
-- remove_index(:labels, {:algorithm=>:concurrently, :name=>"index_on_labels_group_id_and_title_with_null_project_id"})
-> 0.0015s
== 20201120125953 ReplaceUnusedLabelsIndex: reverted (0.0259s) ================
Does this MR meet the acceptance criteria?
Conformity
-
Changelog entry -
Documentation (if required) -
Code review guidelines -
Merge request performance guidelines -
Style guides -
Database guides -
Separation of EE specific content
Availability and Testing
-
Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process. -
Tested in all supported browsers -
Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
-
Label as security and @ mention @gitlab-com/gl-security/appsec
-
The MR includes necessary changes to maintain consistency between UI, API, email, or other methods -
Security reports checked/validated by a reviewer from the AppSec team