Add throttle definition for unauthenticated HTTP Git operations
What does this MR do and why?
What? We propose to define an additional throttle definition, `throttle_unauthenticated_git_http`, for all unauthenticated Git requests. Furthermore, we propose to exclude Git-related HTTP requests from the throttle definition `throttle_unauthenticated_web`. This allows GitLab instance admins to adjust the throttling parameters in a more fine-grained manner.
We also considered defining an IP address allowlist, e.g. containing the IP addresses of the CI runners. Addresses on this allowlist would be excluded from the throttle definition `throttle_unauthenticated_web`. We discarded this approach because it limits the flexibility of our CI runners, and because our own users could still potentially DDoS our instance.
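To make the proposed split concrete, the following is a minimal sketch of how an unauthenticated request could be assigned to one of the two throttle definitions based on its path. The function name `throttle_for_path` and the path patterns are assumptions for illustration (they cover the smart HTTP endpoints `info/refs`, `git-upload-pack`, and `git-receive-pack` on paths ending in `.git`); the actual matcher in GitLab may differ.

```shell
#!/bin/bash
# Hypothetical sketch: pick the throttle definition for an unauthenticated request.
# The patterns below are assumptions, not GitLab's real matching rules.

throttle_for_path() {
  # Drop the query string so only the path is matched
  local path="${1%%\?*}"
  case "$path" in
    *.git/info/refs|*.git/git-upload-pack|*.git/git-receive-pack)
      echo "throttle_unauthenticated_git_http" ;;
    *)
      echo "throttle_unauthenticated_web" ;;
  esac
}

# Usage:
#   throttle_for_path "/namespace/project-repo.git/info/refs?service=git-upload-pack"
```

With this kind of split, Git traffic and regular web traffic are counted against separate limits, so a burst of parallel clones no longer consumes the budget for ordinary unauthenticated web requests.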
Why?
We (the Siemens-internal GitLab team managing a self-managed GitLab instance) have been observing spikes in 429 HTTP errors in our monitoring system. Upon investigation, we found that these errors are triggered by the `GitLab::RackAttack` throttle definition `throttle_unauthenticated_web`, specifically when Git-related HTTP requests are sent to the GitLab backend, such as `GET "/namespace/project-repo.git/info/refs?service=git-upload-pack"`.
Further investigation revealed a common scenario contributing to the spike:
- The CI pipeline of a project needs to access several Git repositories of (internal) projects hosted on the self-managed GitLab instance, e.g. for scanning purposes or other reasons
- When cloning the project repo (i.e. in the CI pipeline), the auth credentials (basic auth) are also integrated in the `git` command, e.g. `git clone https://username:token@self-managed-gitlab/repository_url.git`
- The `git clone` command issues a series of web requests to the GitLab backend
- Out of these requests, the first request does not include the auth credentials because this is how Git clients work; see the first info box in the GitLab documentation, the discussion thread in the previous issue, and here.
- This means the first request is considered an unauthenticated web request (normal unauthenticated web traffic) and is eventually throttled by the mentioned `GitLab::RackAttack` throttle definition `throttle_unauthenticated_web` (when a large number of projects are cloned in parallel)
- During this throttling period, other unauthenticated web requests (and the retries) are also throttled and accumulate, which leads to the spike and to a degraded experience. 💥
- But still, the Git requests are considered unauthenticated web requests and are therefore throttled by the mentioned `GitLab::RackAttack` throttle definition `throttle_unauthenticated_web`
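The first, credential-less request described in the list above can be approximated with `curl`. This is only a sketch: the helper names `strip_credentials`, `info_refs_url`, and `probe_unauthenticated` are hypothetical, and the host is the placeholder from the example above. The status code returned depends on the project's visibility and on the configured rate limits.

```shell
#!/bin/bash
# Sketch: imitate the first request of `git clone`, which carries no credentials.

# Remove "user:token@" from a repository URL.
strip_credentials() {
  echo "$1" | sed -E 's#^(https?://)[^/@]*@#\1#'
}

# Build the ref advertisement URL that is requested first.
info_refs_url() {
  echo "$(strip_credentials "$1")/info/refs?service=git-upload-pack"
}

# Print only the HTTP status code of the unauthenticated probe.
probe_unauthenticated() {
  curl --silent --output /dev/null --write-out '%{http_code}\n' "$(info_refs_url "$1")"
}

# Usage (placeholder host, requires network access to the instance):
#   probe_unauthenticated "https://username:token@self-managed-gitlab/repository_url.git"
```

Running the real `git clone` with `GIT_CURL_VERBOSE=1` set should show the same sequence of requests as issued by the Git client itself.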
We tried to mitigate this issue by increasing the rate limits, but we continue to experience spikes in 429 HTTP errors as our user base and usage grow.
Why should the GitLab team integrate this MR? The motivation behind this MR is to:
- Address these 429 HTTP error spikes caused by unauthenticated Git-related web requests (the first unauthenticated request of HTTP Git operations)
- Ensure that our GitLab instance remains reliable and performant even as our user base and usage continue to expand
- Maintain overall user experience and high quality of the self-managed GitLab service.
- Potentially remove the info box "By default, all Git operations are first tried unauthenticated. Because of this, HTTP Git operations may trigger the rate limits configured for unauthenticated requests." in the GitLab rate limits documentation.
MR acceptance checklist
Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
MR Checklist (@gerardo-navarro)
- Changelog entry added, if necessary
- Documentation created/updated via this MR
- Documentation reviewed by technical writer or follow-up review issue created
- Tests added for this feature/bug
- Tested in all supported browsers
- Conforms to the code review guidelines
- Conforms to the merge request performance guidelines
- Conforms to the style guides
- Conforms to the javascript style guides
- Conforms to the database guides
Screenshots or screen recordings
The following screencasts illustrate the new behavior introduced by this MR. In both screencasts, the command `bash ./git_clone_in_parallel.sh` is used to clone three local Git repositories in parallel.
Open the bash script `git_clone_in_parallel.sh`
#!/bin/bash
# Repositories are numbered 2..num_clones, so three repositories are cloned
num_clones=4
clone_output_dir_prefix="cloned-repos"

echo "Removing old cloned repositories"
rm -rf "$clone_output_dir_prefix"
echo "Removed old cloned repositories"

# Start one background clone per repository
for i in $(seq 2 "$num_clones"); do
  # Repository to clone (local GDK test credentials)
  repo="http://root:5iveL!fe12345@gdk.test:3000/root/test-space-unauthenticated-web-$i.git"
  # Target directory for this clone
  clone_output_dir="$clone_output_dir_prefix/test-space-unauthenticated-web-$i"
  # Clone in a background subshell so all clones run in parallel
  (echo "Starting cloning $clone_output_dir" && git clone "$repo" "$clone_output_dir" && echo "Finished cloning $clone_output_dir") &
done

# Wait for all background jobs to finish
wait
Before (branch master) | After (this MR branch) |
---|---|
MR Throttle unauthenticated Git HTTP requests / Screencast of existing behavior on branch master: https://www.loom.com/share/9365997ff833491aa60952b65686a5c6 | MR Throttle unauthenticated Git HTTP requests / Screencast of new behavior on MR branch: https://www.loom.com/share/ea9050e74fea4492a2d574db1170f9d9 |
This MR adds the new throttle rate limit settings to the admin network settings; see the screenshot below.
How to set up and validate locally
- Migrate the database: `rails db:migrate`
- In the admin network settings, enable the unauthenticated Git HTTP request rate limit and set the rate limit settings accordingly (you can also increase the period in seconds parameter to have more time to clone the Git repos within the throttling period)
- Do not forget to click the button Save changes
- Create three new blank projects (with README) through the web UI: http://gdk.test:3000/projects/new ; we will `git clone` the projects' Git repositories in parallel
- Try cloning the three Git repositories in parallel, i.e. `git clone http://gdk.test:3000/xxxx`; NOTE: if you have quick hands 😄, you can do this manually; alternatively, you can use the script from the screencast section
- One `git clone` command should lead to a 429 HTTP error 💥
- Now, disable the unauthenticated Git HTTP request rate limit in the admin network settings
- Wait or restart the server to reset the throttling cache
- Again, try cloning the three Git repositories in parallel => it should now be successful 🚀
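As a lighter-weight alternative to full clones, the validation steps above could be approximated by sending the unauthenticated ref advertisement request for each project in parallel and tallying the returned status codes. This is a sketch under assumptions: `probe_in_parallel` and `tally_statuses` are hypothetical helper names, and the project paths in the usage comment are placeholders for the three projects created above.

```shell
#!/bin/bash
# Sketch: send unauthenticated Git HTTP requests in parallel and count status codes.

# Tally status codes read from stdin (one per line) as "<count> <code>".
tally_statuses() {
  sort | uniq -c | awk '{ print $1, $2 }'
}

# Request the ref advertisement of each given project in the background.
probe_in_parallel() {
  local base_url="$1"; shift
  local project
  for project in "$@"; do
    curl --silent --output /dev/null --write-out '%{http_code}\n' \
      "$base_url/$project.git/info/refs?service=git-upload-pack" &
  done
  # Wait for all background requests to finish
  wait
}

# Usage (placeholder project paths, requires a running GDK):
#   probe_in_parallel "http://gdk.test:3000" root/project-a root/project-b root/project-c | tally_statuses
```

With the rate limit enabled and set low enough, at least one 429 should appear in the tally; after disabling the limit and resetting the throttling cache, it should not.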