Autoscaling Configuration

Autoscaling ensures that your applications have the right number of pods to handle the workload efficiently. This section guides you through configuring the autoscaling settings for each environment.

Accessing Autoscaling Settings

To configure autoscaling, select the environment you want to adjust and click the edit icon. In the Autoscaling section, you can enable autoscaling or adjust its parameters.

Configuration Options

  • Enable Horizontal Pod Autoscaling (HPA): Toggle this option to enable or disable HPA. HPA automatically adjusts the number of pods in a deployment, replication controller, or replica set based on observed CPU and memory utilization.

  • Minimum Number of Replicas: Specify the minimum number of pod replicas. This field is optional and defaults to 1 if not set. Setting a minimum ensures that your application keeps enough pods running to stay available, even when load is low.

  • Maximum Number of Replicas: Define the maximum number of pod replicas. This field is required. It caps scaling so that a sustained spike in load cannot lead to resource exhaustion.

  • Target CPU Utilization Percentage: Enter the target CPU utilization percentage that triggers scaling actions. If you’re unsure of what value to set, 70% is a recommended starting point. This setting helps balance over-provisioning (wasting resources) against under-provisioning (insufficient resources to handle the load).

  • Target Memory Utilization Percentage: Similar to CPU utilization, this optional setting specifies the target memory utilization percentage for scaling. A 70% target is generally a good starting point, ensuring efficient resource use without compromising performance.
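
To make the interaction between the replica limits and the utilization targets more concrete, the following Python sketch approximates how upstream Kubernetes HPA derives a replica count: it scales the current count by the ratio of observed to target utilization, rounds up, and clamps the result to the configured minimum and maximum. The function names and the example numbers are hypothetical, and the 10% tolerance is the upstream Kubernetes default, which this platform may or may not use.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Approximate the replica count HPA would request for a single metric.

    Assumes the upstream Kubernetes formula:
    ceil(current_replicas * current_utilization / target_utilization).
    """
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: leave the replica count unchanged.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Never go below the minimum or above the maximum number of replicas.
    return max(min_replicas, min(max_replicas, desired))


def desired_for_metrics(current_replicas, utilizations, targets,
                        min_replicas=1, max_replicas=10):
    """With several metrics (CPU and memory), the largest proposal wins."""
    return max(
        desired_replicas(current_replicas, utilizations[name], targets[name],
                         min_replicas, max_replicas)
        for name in targets
    )


if __name__ == "__main__":
    targets = {"cpu": 70, "memory": 70}
    observed = {"cpu": 90, "memory": 50}
    print(desired_for_metrics(4, observed, targets))
```

In this example, 4 pods running at 90% CPU against a 70% target produce a proposal of 6 pods, while memory at 50% alone would propose 3. The higher proposal is used, so a single hot metric is enough to scale up.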

Recommendations

  • Testing: After configuring autoscaling, monitor your environment to ensure it scales as expected under different loads. Adjust the settings as necessary based on performance and resource utilization.

  • Evaluation Period: Allow time for the system to evaluate whether scaling is needed. HPA checks metrics periodically rather than continuously, and it delays some adjustments, scale-downs in particular, until it confirms that the change in load is not temporary.
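
The waiting behavior described above can be illustrated with a small Python sketch. It assumes upstream Kubernetes defaults, which this platform may override: scale-up recommendations are applied right away, while a scale-down is only applied once every recommendation within the last 300 seconds agrees on the lower count. The class name `ScaleDownStabilizer` and its methods are hypothetical and exist only for illustration.

```python
from collections import deque


class ScaleDownStabilizer:
    """Simplified model of a scale-down stabilization window."""

    def __init__(self, window_seconds=300):
        self.window_seconds = window_seconds
        self._history = deque()  # (timestamp, recommended replicas)

    def recommend(self, now, current_replicas, desired):
        """Return the replica count to apply at time `now` (in seconds)."""
        self._history.append((now, desired))
        # Forget recommendations that are older than the window.
        while self._history and self._history[0][0] < now - self.window_seconds:
            self._history.popleft()
        if desired >= current_replicas:
            # Scale-up (or no change) is applied without waiting.
            return desired
        # Scale-down: use the highest recommendation seen within the window,
        # so a brief dip in load does not remove pods.
        stabilized = max(replicas for _, replicas in self._history)
        return min(current_replicas, stabilized)


if __name__ == "__main__":
    stabilizer = ScaleDownStabilizer()
    for now, desired in [(0, 6), (60, 3), (301, 3)]:
        print(now, stabilizer.recommend(now, 6, desired))
```

Under these assumptions, a drop in load at the 60-second mark does not reduce the pod count until the earlier, higher recommendation has aged out of the window, which is why a newly configured environment can appear slow to scale down during testing.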

Saving Changes

After making the necessary edits, click the “Save” button, and a Pull Request will be created for you to approve. It’s advisable to review the environment’s status and functionality after editing to ensure all services are operating as expected.