Custom Metric Scaling

Autoscaling in Kubernetes based on a readiness probe isn’t directly supported. However, you can achieve autoscaling using custom metrics, which can be related to your readiness probe. Here’s an approach you can take:

  1. Create a readiness probe in your application

    Add a readiness probe to your application code to determine if it’s ready to serve traffic. For example, you can use an HTTP endpoint that returns a 200 status code when ready and a non-200 status code when not ready.

  2. Expose custom metrics

    Modify your application to expose custom metrics related to the readiness probe, such as the total number of ready instances or percentage of ready instances. You can use a library like Prometheus client to expose these metrics in a format that’s easy to scrape.

  3. Deploy Prometheus to scrape custom metrics

    Deploy Prometheus in your Kubernetes cluster and configure it to scrape your application’s custom metrics. Follow the Prometheus documentation for details.

  4. Deploy kube-state-metrics (if not already deployed)

    Deploy kube-state-metrics to expose Kubernetes objects’ metrics to Prometheus. Follow the instructions on the GitHub repository.

  5. Deploy Prometheus Adapter

    Deploy the Prometheus Adapter to expose custom metrics from Prometheus as external metrics for the Kubernetes API. Follow the instructions on the GitHub repository.

  6. Configure the Horizontal Pod Autoscaler (HPA) to use custom metrics

    Create an HPA configuration that uses custom metrics related to your readiness probe. For example:

    .. code-block:: yaml
    
       apiVersion: autoscaling/v2beta2
       kind: HorizontalPodAutoscaler
       metadata:
         name: my-app-hpa
       spec:
         scaleTargetRef:
           apiVersion: apps/v1
           kind: Deployment
           name: my-app
         minReplicas: 1
         maxReplicas: 10
         metrics:
         - type: Pods
           pods:
             metric:
               name: my_readiness_metric
             target:
               type: AverageValue
               averageValue: 100
    

    Replace my_readiness_metric with your custom metric name. Adjust minReplicas, maxReplicas, and averageValue according to your needs.

  7. Apply the HPA configuration

    Run kubectl apply -f hpa.yaml to apply the HPA configuration.

This setup will make Kubernetes autoscale your application based on the custom metric derived from your readiness probe. Note that this is an indirect method, and you might need to adjust the scaling thresholds according to your specific use case.