Skip to content

Allow secondary emails in user search

Catalin Irimie requested to merge cat-user-search-secondary-emails into master

What does this MR do?

Allows searching users by email using secondary emails as well - implements #26110 (closed).

User searching by email only happened with an equality check for the primary email. We already had a search_with_secondary_emails helper however inefficient because it was using an IN condition.

Since there is a unique constraint on secondary emails as well, that can be updated with a strict equality check and make the query perform similarly to the one without secondary emails check.

That allows us to default to searching through secondary emails as well (when using the API, project member list etc).

Uses the :user_search_secondary_email feature flag, rollout issue in #282137 (closed).

Query plans

Examples using my (primary) email address:

Before

Depesz and raw SQL

Expand for plan & raw SQL
SELECT
    "users".*
FROM
    "users"
WHERE (("users"."name" ILIKE '%cirimie@gitlab.com%'
        OR "users"."username" ILIKE '%cirimie@gitlab.com%')
    OR "users"."email" = 'cirimie@gitlab.com')
ORDER BY
    CASE WHEN users.name = 'cirimie@gitlab.com' THEN
        0
    WHEN users.username = 'cirimie@gitlab.com' THEN
        1
    WHEN users.email = 'cirimie@gitlab.com' THEN
        2
    ELSE
        3
    END,
    "users"."name" ASC
Limit  (cost=3236.63..3236.68 rows=20 width=1266) (actual time=83.767..83.767 rows=1 loops=1)
  Buffers: shared hit=2586 read=145 written=9
  I/O Timings: read=5.174 write=0.365
  ->  Sort  (cost=3236.63..3240.22 rows=1436 width=1266) (actual time=83.765..83.766 rows=1 loops=1)
        Sort Key: (CASE WHEN ((name)::text = 'cirimie@gitlab.com'::text) THEN 0 WHEN ((username)::text = 'cirimie@gitlab.com'::text) THEN 1 WHEN ((email)::text = 'cirimie@gitlab.com'::text) THEN 2 ELSE 3 END), name
        Sort Method: quicksort  Memory: 27kB
        Buffers: shared hit=2586 read=145 written=9
        I/O Timings: read=5.174 write=0.365
        ->  Bitmap Heap Scan on users  (cost=807.41..3198.41 rows=1436 width=1266) (actual time=83.735..83.735 rows=1 loops=1)
              Recheck Cond: (((name)::text ~~* '%cirimie@gitlab.com%'::text) OR ((username)::text ~~* '%cirimie@gitlab.com%'::text) OR ((email)::text = 'cirimie@gitlab.com'::text))
              Heap Blocks: exact=1
              Buffers: shared hit=2580 read=145 written=9
              I/O Timings: read=5.174 write=0.365
              ->  BitmapOr  (cost=807.41..807.41 rows=1436 width=0) (actual time=83.708..83.708 rows=0 loops=1)
                    Buffers: shared hit=2579 read=145 written=9
                    I/O Timings: read=5.174 write=0.365
                    ->  Bitmap Index Scan on index_users_on_name_trigram  (cost=0.00..426.75 rows=700 width=0) (actual time=45.048..45.048 rows=0 loops=1)
                          Index Cond: ((name)::text ~~* '%cirimie@gitlab.com%'::text)
                          Buffers: shared hit=1453 read=89 written=8
                          I/O Timings: read=1.598 write=0.328
                    ->  Bitmap Index Scan on index_users_on_username_trigram  (cost=0.00..377.52 rows=736 width=0) (actual time=38.499..38.499 rows=0 loops=1)
                          Index Cond: ((username)::text ~~* '%cirimie@gitlab.com%'::text)
                          Buffers: shared hit=1123 read=55 written=1
                          I/O Timings: read=3.560 write=0.037
                    ->  Bitmap Index Scan on users_email_key  (cost=0.00..2.07 rows=1 width=0) (actual time=0.159..0.159 rows=1 loops=1)
                          Index Cond: ((email)::text = 'cirimie@gitlab.com'::text)
                          Buffers: shared hit=3 read=1
                          I/O Timings: read=0.015
Planning Time: 15.935 ms
Execution Time: 84.005 ms

After

Depesz and raw SQL

Expand for plan & raw SQL
SELECT
    "users".*
FROM
    "users"
WHERE ((("users"."name" ILIKE '%cirimie@gitlab.com%'
            OR "users"."username" ILIKE '%cirimie@gitlab.com%')
        OR "users"."email" = 'cirimie@gitlab.com')
    OR "users"."id" = (
        SELECT
            "emails"."user_id"
        FROM
            "emails"
        WHERE
            "emails"."email" = 'cirimie@gitlab.com'))
ORDER BY
    CASE WHEN users.name = 'cirimie@gitlab.com' THEN
        0
    WHEN users.username = 'cirimie@gitlab.com' THEN
        1
    WHEN users.email = 'cirimie@gitlab.com' THEN
        2
    ELSE
        3
    END,
    "users"."name" ASC
Limit  (cost=3075.22..3075.27 rows=20 width=1266) (actual time=65.071..65.071 rows=1 loops=1)
  Buffers: shared hit=2543 read=76
  I/O Timings: read=7.962
  InitPlan 1 (returns $0)
    ->  Index Scan using emails_email_key on emails  (cost=0.42..3.44 rows=1 width=4) (actual time=0.061..0.061 rows=0 loops=1)
          Index Cond: ((email)::text = 'cirimie@gitlab.com'::text)
          Buffers: shared hit=3
  ->  Sort  (cost=3071.78..3075.38 rows=1437 width=1266) (actual time=65.069..65.070 rows=1 loops=1)
        Sort Key: (CASE WHEN ((users.name)::text = 'cirimie@gitlab.com'::text) THEN 0 WHEN ((users.username)::text = 'cirimie@gitlab.com'::text) THEN 1 WHEN ((users.email)::text = 'cirimie@gitlab.com'::text) THEN 2 ELSE 3 END), users.name
        Sort Method: quicksort  Memory: 27kB
        Buffers: shared hit=2543 read=76
        I/O Timings: read=7.962
        ->  Bitmap Heap Scan on users  (cost=637.21..3033.54 rows=1437 width=1266) (actual time=65.034..65.035 rows=1 loops=1)
              Recheck Cond: (((name)::text ~~* '%cirimie@gitlab.com%'::text) OR ((username)::text ~~* '%cirimie@gitlab.com%'::text) OR ((email)::text = 'cirimie@gitlab.com'::text) OR (id = $0))
              Heap Blocks: exact=1
              Buffers: shared hit=2537 read=76
              I/O Timings: read=7.962
              ->  BitmapOr  (cost=637.21..637.21 rows=1437 width=0) (actual time=64.996..64.996 rows=0 loops=1)
                    Buffers: shared hit=2536 read=76
                    I/O Timings: read=7.962
                    ->  Bitmap Index Scan on index_users_on_name_trigram  (cost=0.00..420.75 rows=700 width=0) (actual time=42.937..42.937 rows=0 loops=1)
                          Index Cond: ((name)::text ~~* '%cirimie@gitlab.com%'::text)
                          Buffers: shared hit=1494 read=44
                          I/O Timings: read=6.909
                    ->  Bitmap Index Scan on index_users_on_username_trigram  (cost=0.00..211.02 rows=736 width=0) (actual time=21.882..21.882 rows=0 loops=1)
                          Index Cond: ((username)::text ~~* '%cirimie@gitlab.com%'::text)
                          Buffers: shared hit=1036 read=31
                          I/O Timings: read=1.040
                    ->  Bitmap Index Scan on users_email_key  (cost=0.00..2.07 rows=1 width=0) (actual time=0.106..0.106 rows=1 loops=1)
                          Index Cond: ((email)::text = 'cirimie@gitlab.com'::text)
                          Buffers: shared hit=3 read=1
                          I/O Timings: read=0.012
                    ->  Bitmap Index Scan on users_pkey  (cost=0.00..1.94 rows=1 width=0) (actual time=0.067..0.067 rows=0 loops=1)
                          Index Cond: (id = $0)
                          Buffers: shared hit=3
Planning Time: 15.475 ms
Execution Time: 65.250 ms

Without the IN to EQ conversion

Depesz and raw SQL

Expand for plan & raw SQL
SELECT
    "users".*
FROM
    "users"
WHERE ((("users"."name" ILIKE '%cirimie@gitlab.com%'
            OR "users"."username" ILIKE '%cirimie@gitlab.com%')
        OR "users"."email" = 'cirimie@gitlab.com')
    OR "users"."id" IN (
        SELECT
            "emails"."user_id"
        FROM
            "emails"
        WHERE
            "emails"."email" = 'cirimie@gitlab.com'))
ORDER BY
    CASE WHEN users.name = 'cirimie@gitlab.com' THEN
        0
    WHEN users.username = 'cirimie@gitlab.com' THEN
        1
    WHEN users.email = 'cirimie@gitlab.com' THEN
        2
    ELSE
        3
    END,
    "users"."name" ASC
Limit  (cost=1884193.14..1884195.47 rows=20 width=1266) (actual time=6649.961..6649.963 rows=1 loops=1)
  Buffers: shared hit=144594 read=1694 written=58
  I/O Timings: read=129.816 write=2.880
  ->  Gather Merge  (cost=1884193.14..2241904.11 rows=3065880 width=1266) (actual time=6649.959..7213.782 rows=1 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        Buffers: shared hit=437345 read=5103 written=181
        I/O Timings: read=402.003 write=6.289
        ->  Sort  (cost=1883193.12..1887025.47 rows=1532940 width=1266) (actual time=6643.448..6643.449 rows=0 loops=3)
              Sort Key: (CASE WHEN ((users.name)::text = 'cirimie@gitlab.com'::text) THEN 0 WHEN ((users.username)::text = 'cirimie@gitlab.com'::text) THEN 1 WHEN ((users.email)::text = 'cirimie@gitlab.com'::text) THEN 2 ELSE 3 END), users.name
              Sort Method: quicksort  Memory: 27kB
              Worker 0:  Sort Method: quicksort  Memory: 25kB
              Worker 1:  Sort Method: quicksort  Memory: 25kB
              Buffers: shared hit=437345 read=5103 written=181
              I/O Timings: read=402.003 write=6.289
              ->  Parallel Seq Scan on users  (cost=3.44..1842402.13 rows=1532940 width=1266) (actual time=6643.258..6643.374 rows=0 loops=3)
                    Filter: (((name)::text ~~* '%cirimie@gitlab.com%'::text) OR ((username)::text ~~* '%cirimie@gitlab.com%'::text) OR ((email)::text = 'cirimie@gitlab.com'::text) OR (hashed SubPlan 1))
                    Rows Removed by Filter: 2452186
                    Buffers: shared hit=437307 read=5103 written=181
                    I/O Timings: read=402.003 write=6.289
                    SubPlan 1
                      ->  Index Scan using emails_email_key on emails  (cost=0.42..3.44 rows=1 width=4) (actual time=0.107..0.108 rows=0 loops=3)
                            Index Cond: ((email)::text = 'cirimie@gitlab.com'::text)
                            Buffers: shared hit=11
Planning Time: 3.718 ms
Execution Time: 7213.945 ms

7 seconds since it wouldn't be able to use Bitmap scans anymore

Screenshots (strongly suggested)

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team
Edited by Catalin Irimie

Merge request reports

Loading