Add support for `v2beta2` of the Horizontal Pod Autoscaler Spec
Summary
When using the Horizontal Pod Autoscaler to automatically scale the number of pods in a GitLab Helm installation (in this case, GitLab.com), we often see erratic scaling behavior caused by the spiky CPU profile we observe. A typical pattern is a scale-up event, a scale-down event 10 minutes later, and then another scale-up event 30 minutes after that.
Currently all parts of the chart create `HorizontalPodAutoscaler` objects with `v1` of the spec (which makes sense, as it is the latest stable version of the spec). The exception is ingress-nginx, which uses the new beta specification, `v2beta2`.
The new beta specification supports several features we might find useful:
- It supports scaling on multiple metrics, which might be useful for components that are constrained by both memory and CPU
- It allows you to scale on custom metrics
- It allows greater control over scaling behavior: how quickly to scale up and down, and over what period of time metrics are evaluated to determine whether scaling is needed. This should greatly reduce the number of scaling events we see, and give us a more predictable count of pods/nodes in use at any point in time.
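As an illustration of the features listed above, here is a minimal sketch of a `v2beta2` HorizontalPodAutoscaler that scales on both CPU and memory and uses the `behavior` field to scale up quickly but scale down slowly. The object name, target Deployment, replica bounds, and thresholds are all hypothetical placeholders, not values taken from the chart.

```yaml
# Sketch only: names and numbers are illustrative assumptions.
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: webservice-example          # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webservice-example        # hypothetical target
  minReplicas: 2
  maxReplicas: 10
  metrics:
    # Multiple metrics: the HPA uses whichever yields the higher replica count.
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 75
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
  behavior:
    scaleUp:
      # React immediately to spikes; allow doubling every 15 seconds.
      stabilizationWindowSeconds: 0
      policies:
        - type: Percent
          value: 100
          periodSeconds: 15
    scaleDown:
      # Use the highest recommendation from the last 5 minutes,
      # and remove at most one pod per minute.
      stabilizationWindowSeconds: 300
      policies:
        - type: Pods
          value: 1
          periodSeconds: 60
```

The `scaleDown.stabilizationWindowSeconds` setting is the part most relevant to the flapping described in the summary, since it keeps a short dip in CPU from immediately triggering a scale-down.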
While it's understandable to hesitate before implementing a beta specification, it's worth noting that this specification has been around for years (though it has been continually extended), and is actually scheduled to graduate to GA in the next release or so.
https://github.com/kubernetes/enhancements/issues/2702
Steps to reproduce
- Install the GitLab Helm chart in a large-scale environment, with the current HPAs enabled
- Push a large amount of traffic to the installation, and frequently deploy/update the environment
Configuration used
- Default helm values
Current behavior
- Depending on traffic and HPA settings, you get frequent scaling in both directions over a very short period of time (especially with rapid deployments of GitLab)
Expected behavior
- With the settings from the `behavior` field in the `v2beta2` HPA spec exposed as Helm values, an operator can tune them to make scaling up as fast as needed to accommodate rapid traffic spikes, but slow down scale-down events in order to reduce pod/cluster thrash
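One possible shape for such values is sketched below. The key path `gitlab.webservice.hpa.behavior` is purely hypothetical: it is an assumption about how the chart could expose the field, not an existing chart option, and the numbers are placeholders an operator would tune.

```yaml
# Hypothetical values layout; these keys are an assumption, not existing chart options.
gitlab:
  webservice:
    hpa:
      behavior:
        scaleUp:
          # No delay on scale-up, to absorb rapid traffic spikes.
          stabilizationWindowSeconds: 0
        scaleDown:
          # Wait 10 minutes of sustained low usage before scaling down,
          # then remove at most 10% of pods per minute.
          stabilizationWindowSeconds: 600
          policies:
            - type: Percent
              value: 10
              periodSeconds: 60
```

In this sketch the chart would pass the `behavior` map through unchanged to the rendered `v2beta2` HorizontalPodAutoscaler, so operators could use any field the Kubernetes spec supports.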
Versions
- Chart: d23729b7
- Platform:
  - Cloud: (GKE | AKS | EKS | ?)
  - Self-hosted: (OpenShift | Minikube | Rancher RKE | ?)
- Kubernetes: (`kubectl version`)
  - Client: 1.20
  - Server: 1.20
- Helm: (`helm version`)
  - Client: 3.2.4
  - Server: 3.2.4
Relevant logs
(Please provide any relevant log snippets you have collected, using code blocks (```) to format)