Skip to content

Use new Sanitizable concern in User

What does this MR do?

this merge request leverages the new Sanitizable concern introduced in Add ability to easily sanitize attributes and replaces adhoc sanitizing in different parts of the codebase. the intention is to help prevent xss attacks in the event of a by-pass in the frontend sanitizer due to a configuration issue or a vulnerability in the sanitizer. this approach is commonly referred to as defense-in-depth.

Implications

this means that in addition to the existing behaviour escaped html entities (e.g. &) will no longer be permitted and will result in a validation error. the reasons behind this are discussed in Add ability to easily sanitize attributes.

Investigation

linkedin

usernames can only contain alphanumeric characters .

Your custom URL can be 3-100 characters long. Don't use spaces, symbols, or special characters

https://www.linkedin.com/help/linkedin/answer/87/customize-your-public-profile-url

analysis

46_330 non-alphanumeric twitter usernames on gitlab.com.

select count(*) from users where (users.linkedin <> '') and users.linkedin not like '%[a-zA-Z0-9 .-]%';
 Aggregate  (cost=2342841.30..2342841.31 rows=1 width=8) (actual time=4778.026..4802.490 rows=1 loops=1)
   Buffers: shared hit=36362 read=534443
   I/O Timings: read=11079.894 write=0.000
   ->  Gather  (cost=2342841.09..2342841.30 rows=2 width=8) (actual time=4777.849..4802.439 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         Buffers: shared hit=36362 read=534443
         I/O Timings: read=11079.894 write=0.000
         ->  Aggregate  (cost=2341841.09..2341841.10 rows=1 width=8) (actual time=4771.986..4771.990 rows=1 loops=3)
               Buffers: shared hit=36362 read=534443
               I/O Timings: read=11079.894 write=0.000
               ->  Parallel Seq Scan on public.users  (cost=0.00..2341699.08 rows=56802 width=0) (actual time=1.646..4762.105 rows=46330 loops=3)
                     Filter: (((users.linkedin)::text <> ''::text) AND ((users.linkedin)::text !~~ '%[a-zA-Z0-9 .-]%'::text))
                     Rows Removed by Filter: 3072957
                     Buffers: shared hit=36362 read=534443
                     I/O Timings: read=11079.894 write=0.000

skype

new usernames appear to be computer generated (e.g. live:.cid.0000000000000000) but were user-defined in the past. when they were user defined some specicial characters were allowed but none of were html entities.

A Skype username cannot be shorter than six characters or longer than 32. It can contain both letters and numbers, but must start with a letter; accented characters are not allowed. The only punctuation marks you can use are commas, dashes, periods and underscores.

https://itstillworks.com/skype-username-tips-22188.html

analysis

29_962 non-alphanumeric twitter usernames on gitlab.com.

select count(*) from users where (users.skype <> '') and users.skype not like '%[a-zA-Z0-9 .-]%';
 Aggregate  (cost=2342791.40..2342791.41 rows=1 width=8) (actual time=5193.205..5220.942 rows=1 loops=1)
   Buffers: shared hit=36319 read=534545
   I/O Timings: read=12142.436 write=0.000
   ->  Gather  (cost=2342791.19..2342791.40 rows=2 width=8) (actual time=5192.683..5220.923 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         Buffers: shared hit=36319 read=534545
         I/O Timings: read=12142.436 write=0.000
         ->  Aggregate  (cost=2341791.19..2341791.20 rows=1 width=8) (actual time=5186.033..5186.035 rows=1 loops=3)
               Buffers: shared hit=36319 read=534545
               I/O Timings: read=12142.436 write=0.000
               ->  Parallel Seq Scan on public.users  (cost=0.00..2341699.08 rows=36842 width=0) (actual time=1.525..5178.115 rows=29962 loops=3)
                     Filter: (((users.skype)::text <> ''::text) AND ((users.skype)::text !~~ '%[a-zA-Z0-9 .-]%'::text))
                     Rows Removed by Filter: 3089325
                     Buffers: shared hit=36319 read=534545
                     I/O Timings: read=12142.436 write=0.000

twitter

usernames can only contain alphanumeric characters and underscores.

A username can only contain alphanumeric characters (letters A-Z, numbers 0-9) with the exception of underscores, as noted above. Check to make sure your desired username doesn't contain any symbols, dashes, or spaces.

https://help.twitter.com/en/managing-your-account/twitter-username-rules

analysis

50_633 non-alphanumeric twitter usernames on gitlab.com.

select count(*) from users where (users.twitter <> '') and users.twitter not like '%[a-zA-Z0-9 .-]%';
 Aggregate  (cost=2342855.95..2342855.96 rows=1 width=8) (actual time=4822.207..4850.010 rows=1 loops=1)
   Buffers: shared hit=36170 read=534635 dirtied=9588 written=9573
   I/O Timings: read=10925.721 write=201.099
   ->  Gather  (cost=2342855.74..2342855.95 rows=2 width=8) (actual time=4821.739..4850.000 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         Buffers: shared hit=36170 read=534635 dirtied=9588 written=9573
         I/O Timings: read=10925.721 write=201.099
         ->  Aggregate  (cost=2341855.74..2341855.75 rows=1 width=8) (actual time=4815.522..4815.524 rows=1 loops=3)
               Buffers: shared hit=36170 read=534635 dirtied=9588 written=9573
               I/O Timings: read=10925.721 write=201.099
               ->  Parallel Seq Scan on public.users  (cost=0.00..2341699.08 rows=62663 width=0) (actual time=0.956..4805.501 rows=50633 loops=3)
                     Filter: (((users.twitter)::text <> ''::text) AND ((users.twitter)::text !~~ '%[a-zA-Z0-9 .-]%'::text))
                     Rows Removed by Filter: 3068654
                     Buffers: shared hit=36170 read=534635 dirtied=9588 written=9573
                     I/O Timings: read=10925.721 write=201.099

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

Does this MR contain changes to processing or storing of credentials or tokens, authorization and authentication methods or other items described in the security review guidelines? If not, then delete this Security section.

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team
Edited by Philip Cunningham

Merge request reports

Loading