Increase MaxIdleConnsPerHost in http.Transport from 2 to 100
While investigating gitlab-com/gl-infra/production#1999 (closed), we found that Go's http.Client pools connections internally through its http.Transport.
Summary:
If we look at production, gitlab-pages is opening tens of connections to gitlab-api per second.
The hypothesis for why this happens is that we have lots of concurrent requests going through the transport. This transport has a few parameters that control its behaviour. The one we are concerned with in this case is MaxIdleConnsPerHost.
MaxIdleConnsPerHost defines the threshold for burst capacity: it is the maximum number of idle (keep-alive) connections the transport keeps per host. Concurrent requests up to MaxIdleConnsPerHost will get connections that are pooled and reused.
If the number of concurrent requests exceeds it, each of those "extra" requests will get its own connection established on demand ("burst capacity"). When such a request is done, the idle pool for that host is already full, so the connection does not go back into the pool and is not reused. Instead it is closed.
In other words: if we have more than MaxIdleConnsPerHost concurrent requests, we will constantly be opening and closing connections.
The default value for MaxIdleConnsPerHost is 2. This patch increases the parameter to 100.
We can make it user-tweakable in the future, but this should be high enough for quite a while.
Impact:
This connection churn creates other issues. In our case it exhausted ports on the NAT, leading to SYN drops and eventually an outage of the service (gitlab-com/gl-infra/production#1999 (closed)).
cc @grzesiek