Skip to content

Use full-text search with in param

Heinrich Lee Yu requested to merge fulltext-support-for-in into master

What does this MR do and why?

This allows the use of full-text search when using the in param to search only specific fields.

To do this, we have to build the tsquery ourselves instead of doing websearch_to_tsquery.

And since we're now building the tsquery ourselves, I added some code to enable prefix matching and workaround for searching numbers to handle #375337 (closed) and most of #368843 (closed)

Quoted search

Old query

SELECT "issues".*
FROM "issues"
  INNER JOIN "issue_search_data" ON "issue_search_data"."issue_id" = "issues"."id"
WHERE "issues"."author_id" NOT IN (SELECT "banned_users"."user_id" FROM "banned_users")
  AND "issues"."project_id" = 278964
  AND ("issues"."state_id" IN (1))
  AND "issues"."issue_type" IN (0, 1, 2, 4)
  AND "issue_search_data"."search_vector" @@ websearch_to_tsquery('english', 'some "search term"')
  AND (issue_search_data.project_id = issues.project_id)
ORDER BY "issues"."created_at" DESC, "issues"."id" DESC LIMIT 21

https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/12743/commands/44815

New query

SELECT "issues".*
FROM "issues"
  INNER JOIN "issue_search_data" ON "issue_search_data"."issue_id" = "issues"."id"
WHERE "issues"."author_id" NOT IN (SELECT "banned_users"."user_id" FROM "banned_users")
  AND "issues"."project_id" = 278964
  AND ("issues"."state_id" IN (1))
  AND "issues"."issue_type" IN (0, 1, 2, 4)
  AND "issue_search_data"."search_vector" @@ to_tsquery('english', 'some:* & search <-> term')
  AND (issue_search_data.project_id = issues.project_id)
ORDER BY "issues"."created_at" DESC, "issues"."id" DESC LIMIT 21

https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/12743/commands/44816

Performance should be very similar because they end up being very similar tsqueries anyway.

With in param

Old query (uses non-fulltext search)

SELECT "issues".*
FROM "issues"
WHERE "issues"."author_id" NOT IN (SELECT "banned_users"."user_id" FROM "banned_users")
  AND "issues"."project_id" = 278964
  AND "issues"."issue_type" IN (0, 1, 2, 3)
  AND "issues"."title" ILIKE '%some%' AND "issues"."title" ILIKE '%search term%'
ORDER BY "issues"."created_at" DESC, "issues"."id" DESC LIMIT 20 OFFSET 0

https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/12743/commands/44818

New query

SELECT "issues".*
FROM "issues"
  INNER JOIN "issue_search_data" ON "issue_search_data"."issue_id" = "issues"."id"
WHERE "issues"."author_id" NOT IN (SELECT "banned_users"."user_id" FROM "banned_users")
  AND "issues"."project_id" = 278964
  AND "issues"."issue_type" IN (0, 1, 2, 4)
  AND "issue_search_data"."search_vector" @@ to_tsquery('english', 'some:*A & search:A <-> term:A')
  AND (issue_search_data.project_id = issues.project_id)
ORDER BY "issues"."created_at" DESC, "issues"."id" DESC LIMIT 20 OFFSET 0

https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/12743/commands/44820

This is much faster now since we can use full text search

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Heinrich Lee Yu

Merge request reports

Loading