Use full-text search with in param
What does this MR do and why?
This allows the use of full-text search when using the in
param to search only specific fields.
To do this, we have to build the tsquery ourselves instead of doing websearch_to_tsquery
.
And since we're now building the tsquery ourselves, I added some code to enable prefix matching and workaround for searching numbers to handle #375337 (closed) and most of #368843 (closed)
Quoted search
Old query
SELECT "issues".*
FROM "issues"
INNER JOIN "issue_search_data" ON "issue_search_data"."issue_id" = "issues"."id"
WHERE "issues"."author_id" NOT IN (SELECT "banned_users"."user_id" FROM "banned_users")
AND "issues"."project_id" = 278964
AND ("issues"."state_id" IN (1))
AND "issues"."issue_type" IN (0, 1, 2, 4)
AND "issue_search_data"."search_vector" @@ websearch_to_tsquery('english', 'some "search term"')
AND (issue_search_data.project_id = issues.project_id)
ORDER BY "issues"."created_at" DESC, "issues"."id" DESC LIMIT 21
https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/12743/commands/44815
New query
SELECT "issues".*
FROM "issues"
INNER JOIN "issue_search_data" ON "issue_search_data"."issue_id" = "issues"."id"
WHERE "issues"."author_id" NOT IN (SELECT "banned_users"."user_id" FROM "banned_users")
AND "issues"."project_id" = 278964
AND ("issues"."state_id" IN (1))
AND "issues"."issue_type" IN (0, 1, 2, 4)
AND "issue_search_data"."search_vector" @@ to_tsquery('english', 'some:* & search <-> term')
AND (issue_search_data.project_id = issues.project_id)
ORDER BY "issues"."created_at" DESC, "issues"."id" DESC LIMIT 21
https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/12743/commands/44816
Performance should be very similar because they end up being very similar tsqueries anyway.
in
param
With Old query (uses non-fulltext search)
SELECT "issues".*
FROM "issues"
WHERE "issues"."author_id" NOT IN (SELECT "banned_users"."user_id" FROM "banned_users")
AND "issues"."project_id" = 278964
AND "issues"."issue_type" IN (0, 1, 2, 3)
AND "issues"."title" ILIKE '%some%' AND "issues"."title" ILIKE '%search term%'
ORDER BY "issues"."created_at" DESC, "issues"."id" DESC LIMIT 20 OFFSET 0
https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/12743/commands/44818
New query
SELECT "issues".*
FROM "issues"
INNER JOIN "issue_search_data" ON "issue_search_data"."issue_id" = "issues"."id"
WHERE "issues"."author_id" NOT IN (SELECT "banned_users"."user_id" FROM "banned_users")
AND "issues"."project_id" = 278964
AND "issues"."issue_type" IN (0, 1, 2, 4)
AND "issue_search_data"."search_vector" @@ to_tsquery('english', 'some:*A & search:A <-> term:A')
AND (issue_search_data.project_id = issues.project_id)
ORDER BY "issues"."created_at" DESC, "issues"."id" DESC LIMIT 20 OFFSET 0
https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/12743/commands/44820
This is much faster now since we can use full text search
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
-
I have evaluated the MR acceptance checklist for this MR.