What you will need
First of all, you need to be connected to a Kubernetes cluster running version 1.13+ that exposes the autoscaling/v2beta2 API. (This is important: without it, your HPA will not be able to read custom metrics.)
Also make sure your cluster has enough nodes to scale onto.
The Prometheus Operator stores both Kubernetes metrics and your own custom metrics. To install Prometheus in our cluster we used the stable/prometheus-operator Helm chart:
helm install prometheus-operator -f prometheus-operator-value.yaml stable/prometheus-operator
This deploys Prometheus, Grafana, Alertmanager, etc.
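As a reference, a minimal prometheus-operator-value.yaml could look like the sketch below. The two `...NilUsesHelmValues` flags are real chart values that let Prometheus pick up ServiceMonitors and PrometheusRules created outside the Helm release, which we will need later for the RabbitMQ exporter; the Grafana password is purely illustrative.

```yaml
# prometheus-operator-value.yaml (illustrative sketch)
prometheus:
  prometheusSpec:
    # Let Prometheus discover ServiceMonitors that are not part of this Helm release
    serviceMonitorSelectorNilUsesHelmValues: false
    # Same for PrometheusRules
    ruleSelectorNilUsesHelmValues: false
grafana:
  adminPassword: change-me   # placeholder, set your own
```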
You can now access the Prometheus dashboard with the following command:
kubectl port-forward svc/prometheus-operator-prometheus 8002:9090
And for Grafana (use a different local port if the Prometheus port-forward is still running):
kubectl port-forward svc/prometheus-operator-grafana 8002:80
To allow Kubernetes to read metrics from Prometheus, an adapter is needed. We used the stable/prometheus-adapter Helm chart to install it in our cluster:
helm install -f prometheus-adapter-value.yaml prometheus-adapter stable/prometheus-adapter
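A minimal prometheus-adapter-value.yaml might look like the sketch below. It points the adapter at the Prometheus service installed by the operator chart and declares one custom rule; the service URL and the metric name (`rabbitmq_queue_messages_ready`, exposed by the RabbitMQ exporter we deploy later) are assumptions to adapt to your setup.

```yaml
# prometheus-adapter-value.yaml (illustrative sketch)
prometheus:
  # The Prometheus service created by the prometheus-operator chart
  url: http://prometheus-operator-prometheus.default.svc
  port: 9090
rules:
  custom:
    # Expose RabbitMQ queue-length series as a custom metric
    - seriesQuery: 'rabbitmq_queue_messages_ready{namespace!="",service!=""}'
      resources:
        overrides:
          namespace: {resource: "namespace"}
          service: {resource: "service"}
      name:
        matches: "^(.*)$"
        as: "${1}"
      metricsQuery: 'sum(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)'
```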
Once installed, you can use the following command to see all the metrics that are now exposed to Kubernetes:
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/"
or, piped through jq for readability:
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/" | jq
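The response is a standard API discovery document; your custom metrics appear as entries under `resources`. A truncated example, with an illustrative metric name:

```json
{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "custom.metrics.k8s.io/v1beta1",
  "resources": [
    {
      "name": "services/rabbitmq_queue_messages_ready",
      "singularName": "",
      "namespaced": true,
      "kind": "MetricValueList",
      "verbs": ["get"]
    }
  ]
}
```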
To scrape data from our RabbitMQ deployment and make it available to Prometheus, we need to deploy an exporter pod that will do that for us.
We used a Prometheus exporter for RabbitMQ.
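As a sketch, assuming the community kbudde/rabbitmq-exporter image and a RabbitMQ management API reachable at rabbitmq:15672 (all names here are illustrative), the exporter plus the ServiceMonitor that tells Prometheus to scrape it could look like:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rabbitmq-exporter
  labels:
    app: rabbitmq-exporter
spec:
  replicas: 1
  selector:
    matchLabels:
      app: rabbitmq-exporter
  template:
    metadata:
      labels:
        app: rabbitmq-exporter
    spec:
      containers:
        - name: exporter
          image: kbudde/rabbitmq-exporter:latest   # community exporter image
          env:
            - name: RABBIT_URL
              value: "http://rabbitmq:15672"   # your RabbitMQ management endpoint
          ports:
            - name: metrics
              containerPort: 9419
---
apiVersion: v1
kind: Service
metadata:
  name: rabbitmq-exporter
  labels:
    app: rabbitmq-exporter
spec:
  selector:
    app: rabbitmq-exporter
  ports:
    - name: metrics
      port: 9419
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: rabbitmq-exporter
  labels:
    release: prometheus-operator   # must match the operator's serviceMonitorSelector
spec:
  selector:
    matchLabels:
      app: rabbitmq-exporter
  endpoints:
    - port: metrics
```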
Now that you have configured Prometheus and your exporter, you should be able to see data in the Kubernetes metrics API.
Now, to extract specific information from a metric, we need to query Prometheus. To do so we create a PrometheusRule.
This configuration will expose a new metric for the HPA to consume; here it will be the number of messages in a specific RabbitMQ queue.
This is its syntax:
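A minimal sketch of such a PrometheusRule, assuming a queue named "jobs" and the rabbitmq_queue_messages_ready metric from the RabbitMQ exporter (both names are illustrative):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: rabbitmq-queue-rules
  labels:
    release: prometheus-operator   # must match the operator's ruleSelector
spec:
  groups:
    - name: rabbitmq.rules
      rules:
        # Recording rule: number of messages ready in the "jobs" queue,
        # keeping namespace/service labels so the adapter can map the metric to a resource
        - record: rabbitmq_jobs_queue_messages_ready
          expr: sum(rabbitmq_queue_messages_ready{queue="jobs"}) by (namespace, service)
```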
Now you can configure your HPA (Horizontal Pod Autoscaler) to use the custom metric.
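A minimal autoscaling/v2beta2 sketch, assuming a consumer Deployment named queue-consumer, an exporter Service named rabbitmq-exporter, and a custom metric named rabbitmq_jobs_queue_messages_ready (all illustrative):

```yaml
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: queue-consumer
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: queue-consumer
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Object
      object:
        metric:
          name: rabbitmq_jobs_queue_messages_ready
        describedObject:
          apiVersion: v1
          kind: Service
          name: rabbitmq-exporter   # the object the metric is attached to
        target:
          type: Value
          value: "100"   # scale up when more than ~100 messages are waiting
```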
Done! You should now be able to see your metric when describing your HPA (kubectl describe hpa).
Taking some time to figure out the right KPI for scaling will make the difference between handling a surge in traffic and failing to.
Refining an autoscaling rule, or HPA, is an unavoidable step for any resilient Kubernetes architecture. The example here scales on the size of a RabbitMQ queue, but that only works if consumers process messages at a roughly constant rate: if some messages take 1 hour to process while others take 2 seconds, scaling on queue length will not be reliable.
You can make sure your infrastructure is robust by using chaos engineering.