Custom Metric Scaling

Autoscaling in Kubernetes based on a readiness probe isn’t directly supported. However, you can achieve autoscaling using custom metrics, which can be related to your readiness probe. Here’s an approach you can take:

Create a readiness probe in your application

Add a readiness probe to your application code to determine if it’s ready to serve traffic. For example, you can use an HTTP endpoint that returns a 200 status code when ready and a non-200 status code when not ready.
Expose custom metrics

Modify your application to expose custom metrics related to the readiness probe, such as the total number of ready instances or percentage of ready instances. You can use a library like Prometheus client to expose these metrics in a format that’s easy to scrape.
Deploy Prometheus to scrape custom metrics

Deploy Prometheus in your Kubernetes cluster and configure it to scrape your application’s custom metrics. Follow the Prometheus documentation for details.
Deploy kube-state-metrics (if not already deployed)

Deploy kube-state-metrics to expose Kubernetes objects’ metrics to Prometheus. Follow the instructions on the GitHub repository.
Deploy Prometheus Adapter

Deploy the Prometheus Adapter to expose custom metrics from Prometheus as external metrics for the Kubernetes API. Follow the instructions on the GitHub repository.

Configure the Horizontal Pod Autoscaler (HPA) to use custom metrics

Create an HPA configuration that uses custom metrics related to your readiness probe. For example:

.. code-block:: yaml

   apiVersion: autoscaling/v2beta2
   kind: HorizontalPodAutoscaler
   metadata:
     name: my-app-hpa
   spec:
     scaleTargetRef:
       apiVersion: apps/v1
       kind: Deployment
       name: my-app
     minReplicas: 1
     maxReplicas: 10
     metrics:
     - type: Pods
       pods:
         metric:
           name: my_readiness_metric
         target:
           type: AverageValue
           averageValue: 100

Replace my_readiness_metric with your custom metric name. Adjust minReplicas, maxReplicas, and averageValue according to your needs.

Apply the HPA configuration

Run kubectl apply -f hpa.yaml to apply the HPA configuration.

This setup will make Kubernetes autoscale your application based on the custom metric derived from your readiness probe. Note that this is an indirect method, and you might need to adjust the scaling thresholds according to your specific use case.