Improve validation of GitLab Pages custom domain names
Summary
On our system, we noticed an increasing number of dead Sidekiq jobs. The majority of these jobs are responsible for the domain name verification process that is necessary for the Let's Encrypt and ACME TLS support.
When investigating further, we found that the issue originates from users adding random test domains to their Pages deployment and enabling automated ACME TLS support. The system currently accepts these invalid domain names and attempts to verify them, leading to repeated failures and an accumulation of dead Sidekiq jobs (for example, one failure every 15 minutes per domain).
We have to identify these jobs manually via the Rails console and delete them, which is a time-consuming process.
Steps to reproduce
- Go to the domain settings of a GitLab Pages site: http://gdk.test:3000/flightjs/Flight/pages/domains/new
- In the input field Domain, enter "example" (instead of a valid domain name with a top-level domain, such as "example.com").
- Click on the Save button.
- Observe that the system accepts this invalid domain name without a warning.
- Notice that the background worker fails when it tries to verify "example", as it is not a valid domain name, leading to dead Sidekiq jobs.
Example Project
What is the current bug behavior?
The system accepts invalid domain names and attempts to verify them, leading to repeated failures and dead Sidekiq jobs.
What is the expected correct behavior?
The system should validate domain names upfront and reject those that are not valid domain names with a top-level domain (for example, accept "example.com" but reject "example").
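To make the expected behavior concrete, here is a minimal sketch of a syntactic upfront check. It is not GitLab's actual validator: the helper name `is_plausible_custom_domain` and the exact rules (at least two labels, each label 1 to 63 letters, digits, or hyphens with no leading or trailing hyphen, at most 253 characters overall, no all-numeric last label) are assumptions chosen for illustration. It does not check the last label against the list of real top-level domains, and it does not perform any DNS lookup.

```python
import re

# One DNS label: 1-63 characters, letters/digits/hyphens,
# with no leading or trailing hyphen.
LABEL_RE = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def is_plausible_custom_domain(name: str) -> bool:
    """Return True if `name` looks like a domain name with a TLD.

    Hypothetical helper for illustration only; the rules are assumptions,
    not the validation GitLab implements.
    """
    name = name.strip()
    if name.endswith("."):  # tolerate a single trailing root dot
        name = name[:-1]
    if not name or len(name) > 253:
        return False
    labels = name.split(".")
    if len(labels) < 2:  # a bare label such as "example" has no TLD
        return False
    if not all(LABEL_RE.fullmatch(label) for label in labels):
        return False
    # An all-numeric last label (e.g. an IPv4 address) is not a TLD.
    return not labels[-1].isdigit()
```

With these rules, "example" from the reproduction steps would be rejected before any verification job is scheduled, while "example.com" would pass. A check like this only filters out obviously malformed input; a well-formed domain that the user does not control would still fail verification later.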
Relevant logs and/or screenshots
Output of checks
Results of GitLab environment info
System information
System:
Proxy: no
Current User: client-siemens
Using RVM: no
Ruby Version: 3.2.3
Gem Version: 3.5.11
Bundler Version: 2.5.11
Rake Version: 13.0.6
Redis Version: 7.0.14
Sidekiq Version: 7.1.6
Go Version: go1.22.3 darwin/arm64

GitLab information
Version: 17.1.0-pre
Revision: d6355c929d9
Directory: /Users/client-siemens/Development/gitlab-development-kit/gitlab
DB Adapter: PostgreSQL
DB Version: 14.9
URL: http://gdk.test:3000
HTTP Clone URL: http://gdk.test:3000/some-group/some-project.git
SSH Clone URL: ssh://git@gdk.test:2222/some-group/some-project.git
Elasticsearch: no
Geo: no
Using LDAP: no
Using Omniauth: yes
Omniauth Providers: google_oauth2

GitLab Shell
Version: 14.35.0
Repository storages:
- default: unix:/Users/client-siemens/Development/gitlab-development-kit/praefect.socket
GitLab Shell path: /Users/client-siemens/Development/gitlab-development-kit/gitlab-shell

Gitaly
- default Address: unix:/Users/client-siemens/Development/gitlab-development-kit/praefect.socket
- default Version: 17.0.0-rc2-318-g676cff8cd
- default Git Version: 2.45.1
Results of GitLab application Check
Checking GitLab subtasks ...
Checking GitLab Shell ...
GitLab Shell: ... GitLab Shell version >= 14.35.0 ? ... OK (14.35.0)
Running /Users/client-siemens/Development/gitlab-development-kit/gitlab-shell/bin/check
{"content_length_bytes":90,"correlation_id":"01J06PDJNG3S7AK994K3JRNHC8","duration_ms":2853,"level":"info","method":"GET","msg":"Finished HTTP request","status":200,"time":"2024-06-12T17:02:41Z","url":"http://gdk.test:3000/api/v4/internal/check"}
Internal API available: OK
Redis available via internal API: OK
gitlab-shell self-check successful
Checking GitLab Shell ... Finished
Checking Gitaly ...
Gitaly: ... default ... OK
Checking Gitaly ... Finished
Checking Sidekiq ...
Sidekiq: ... Running? ... yes
Number of Sidekiq processes (cluster/worker) ... 1/1
Checking Sidekiq ... Finished
Checking Incoming Email ...
Incoming Email: ... Reply by email is disabled in config/gitlab.yml
Checking Incoming Email ... Finished
Checking LDAP ...
LDAP: ... LDAP is disabled in config/gitlab.yml
Checking LDAP ... Finished
Checking GitLab App ...
Database config exists? ... yes
Tables are truncated? ... yes
All migrations up? ... yes
Database contains orphaned GroupMembers? ... no
GitLab config exists? ... yes
GitLab config up to date? ... yes
Cable config exists? ... yes
Resque config exists? ... yes
Log directory writable? ... yes
Tmp directory writable? ... yes
Uploads directory exists? ... yes
Uploads directory has correct permissions? ... no
Try fixing it:
sudo chmod 700 /Users/client-siemens/Development/gitlab-development-kit/gitlab/public/uploads
For more information see:
doc/install/installation.md in section "GitLab"
Please fix the error above and rerun the checks.
Uploads directory tmp has correct permissions? ... yes
Systemd unit files or init script exist? ... no
Try fixing it:
Install the Service
For more information see:
doc/install/installation.md in section "Install the Service"
Please fix the error above and rerun the checks.
Systemd unit files or init script up-to-date? ... can't check because of previous errors
Projects have namespace: ...
22/1 ... yes 24/2 ... yes 24/3 ... yes 27/4 ... yes 29/5 ... yes 31/6 ... yes 33/7 ... yes 35/8 ... yes 58/9 ... yes 10/10 ... yes
49/11 ... yes 12/12 ... yes 13/13 ... yes 43/14 ... yes 9/15 ... yes 56/16 ... yes 6/17 ... yes 17/18 ... yes 1/19 ... yes 1/20 ... yes
1/21 ... yes 1/22 ... yes 1/23 ... yes 1/24 ... yes 29/25 ... yes 104/26 ... yes 106/27 ... yes 109/28 ... yes 113/29 ... yes 115/30 ... yes
117/31 ... yes 119/32 ... yes 1/33 ... yes 125/34 ... yes 128/35 ... yes 127/36 ... yes 1/37 ... yes 132/38 ... yes
Redis version >= 6.2.14? ... yes
Ruby version >= 3.0.6 ? ... yes (3.2.3)
Git user has default SSH configuration? ... no
Try fixing it:
mkdir ~/gitlab-check-backup-1718211778
sudo mv /Users/client-siemens/.ssh/config ~/gitlab-check-backup-1718211778
sudo mv /Users/client-siemens/.ssh/gitpod ~/gitlab-check-backup-1718211778
sudo mv /Users/client-siemens/.ssh/id_ed25519 ~/gitlab-check-backup-1718211778
sudo mv /Users/client-siemens/.ssh/id_ed25519.pub ~/gitlab-check-backup-1718211778
sudo mv /Users/client-siemens/.ssh/known_hosts.old ~/gitlab-check-backup-1718211778
For more information see:
doc/user/ssh.md#overriding-ssh-settings-on-the-gitlab-server
Please fix the error above and rerun the checks.
Active users: ... 68
Is authorized keys file accessible? ... yes
GitLab configured to store new projects in hashed storage? ... yes
All projects are in hashed storage? ... yes
Elasticsearch version 7.x-8.x or OpenSearch version 1.x ... skipped (Advanced Search is disabled)
All migrations must be finished before doing a major upgrade ... skipped (Advanced Search is disabled)
Checking GitLab App ... Finished
Checking GitLab subtasks ... Finished
Possible fixes
Two alternatives:
- Create a custom Rake task, run as a cron job, that deletes invalid domains once a day (e.g. those with too many failed attempts).
- Create an upstream patch to either:
  - halt the issue process after too many failed attempts, or
  - provide better upfront validation => GitLab pages: Stricter validation of custom pag... (!156189 - closed) ON HOLD for now as discussed here
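
The selection logic behind the cleanup alternative ("too many attempts") could look roughly like the sketch below. Everything in it is an assumption made for illustration: the `DomainRecord` fields, the helper `select_stale_domains`, and the default threshold of 96 failed attempts, which corresponds to one day of failures at the rate of one per 15 minutes mentioned in the summary. It only decides which domains qualify; it does not touch GitLab's database or the Sidekiq dead set.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List


@dataclass
class DomainRecord:
    """Hypothetical view of a Pages custom domain; not GitLab's model."""
    name: str
    verified: bool
    failed_attempts: int
    created_at: datetime


def select_stale_domains(
    records: Iterable[DomainRecord],
    now: datetime,
    max_failed_attempts: int = 96,  # assumed: 4 failures/hour * 24 hours
    min_age: timedelta = timedelta(days=1),
) -> List[str]:
    """Return names of unverified domains that failed too often.

    A domain qualifies only if it is unverified, has reached the failure
    threshold, and is at least `min_age` old, so freshly added domains
    are left alone.
    """
    return [
        r.name
        for r in records
        if not r.verified
        and r.failed_attempts >= max_failed_attempts
        and now - r.created_at >= min_age
    ]
```

Requiring a minimum age in addition to the failure count is a deliberate safety margin in this sketch: it avoids removing a domain whose owner is still in the middle of setting up DNS records.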