Compute Copilot Container Rightsizing
About Container Rightsizing
Container rightsizing leverages historical data to generate resource consumption recommendations aligned with the selected policy. When you enable this feature, the system automatically adjusts container resource requests, applying the new recommendations through the Vertical Pod Autoscaler (VPA). This approach ensures that deployments are not directly modified, enabling more efficient and dynamic resource allocation while maintaining the integrity of existing configurations.
Compute Copilot Container Rightsizing is currently in Early Access (EA). Please contact customer success if you are interested in participating in the EA program for Container Rightsizing.
How Does It Work?
Compute Copilot Container Rightsizing is a data-driven continuous resource optimization platform for workloads in EKS clusters. It supports automatically setting CPU and memory requests on a variety of Kubernetes workloads:
- Deployments
- StatefulSets
- DaemonSets
- CronJobs
The rightsizing process starts by collecting data about the real-world resource consumption of your Kubernetes workloads. This is done by the nOps Container Insights Agent, which is a core part of the nOps Kubernetes Agent stack. This data is ingested into a statistical data analytics pipeline that analyzes the resource consumption of your workloads over the last 30 days at a one-minute resolution. Based on this analysis, optimized recommendations are generated at four different levels. Each level is tailored to a specific use case.
The recommendation levels provided are:
- Maximum savings - most aggressive recommendations for maximum savings
- Balanced savings - a balance of savings and performance, biased toward savings
- Balanced performance - a balance of savings and performance, biased toward performance
- Maximum performance - maximum headroom for maximum workload performance and reliability
In addition to generating these recommendations, the pipeline also analyzes the characteristics of your workloads' resource usage in order to suggest an appropriate recommendation level. That is, it examines whether your application's resource usage has significant peaks or bursts, or whether it is more steady state. Using this analysis, it can suggest "maximum savings" for a steady-state workload, for example, or "maximum performance" for a workload that sees significant peaks in its resource demand. The recommendation at the suggested level is shown on the Container Rightsizing dashboard. To see which level was suggested, look for the annotations on the recommended CPU and memory settings in the dashboard.
Recommendations are updated on an hourly basis for all workloads.
Policies
Recommendation levels are the foundation of our recommendation policies feature. Recommendation policies let you combine desired recommendation levels for CPU and memory with headroom settings to tailor the behavior of automatic recommendations to the needs of your system. User-created policies are on our roadmap, but for now we provide the following pre-made policies:
- Maximum Savings
- CPU Recommendation Level: Maximum Savings
- Memory Recommendation Level: Maximum Savings
- Optimize resource utilization and minimize costs. This policy sets resource requests to the minimum required for your containers to function correctly, based on their observed usage patterns.
- High availability
- CPU Recommendation Level: Maximum Performance
- Memory Recommendation Level: Maximum Performance
- Prioritizes resource allocation to ensure consistent uptime and resilience. Provides excess capacity to handle traffic spikes and maintain performance under heavy load conditions, optimizing for reliability over cost savings.
- Dynamic
- CPU Recommendation Level: automatically selected
- Memory Recommendation Level: automatically selected
- Dynamically adjusts both the selected recommendation level and the resource requests based on observed demand. This helps ensure that your containers have the resources they need while avoiding over-provisioning or under-provisioning.
Default Setting: If no policy is selected, the Maximum Savings policy is applied by default.
nOps VPA
The nOps Vertical Pod Autoscaler (VPA) is the agent deployed into your cluster to automatically apply CPU and memory request recommendations to your selected workloads. The nOps VPA modifies the resources of the workloads you select for automatic rightsizing at the pod/container level, not at the controller level (Deployment, DaemonSet, etc.). It automatically updates the CPU and memory requests of those workloads on an hourly basis. A history of the recommendations applied by the nOps VPA can be accessed through the "History" button for each workload on the Container Rightsizing dashboard.
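As an illustrative way to observe this behavior with standard kubectl (the deployment name my-app and the pod name below are placeholders, not part of the product), you can compare the requests declared on the controller with those on a running pod:

# Requests declared in the Deployment spec; these are left unchanged by the nOps VPA
kubectl get deployment my-app -o jsonpath='{.spec.template.spec.containers[0].resources.requests}'
# Requests on a running pod; these reflect the applied recommendation
kubectl get pod <my-app-pod-name> -o jsonpath='{.spec.containers[0].resources.requests}'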
How to Enable Container Rightsizing
Prerequisites
- You must be logged in to your nOps account.
- Your AWS account must be configured in your nOps account.
- Your EKS cluster must be onboarded according to the instructions on the Onboarding EKS help page.
- After copying the custom command to install the Helm chart for the nOps Kubernetes Agent, but before running it, you must add the following parameter to the command to enable the Container Rightsizing VPA:
--set containerRightsizing.enabled=true
- If you have already onboarded your cluster and installed the nOps Kubernetes Agent, you can re-run the helm upgrade command from the cluster configuration pane with the above parameter added to the command (see the example sketch after this list).
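For illustration only, here is a sketch of what the amended command might look like. The release name, chart reference, namespace, and any other flags are placeholders; the actual values come from the custom command generated in the nOps console, and the --set flag is the only documented addition:

# Placeholder values: copy the real command from your nOps cluster configuration pane
helm upgrade --install <release-name> <nops-agent-chart> \
  --namespace <namespace> \
  --set containerRightsizing.enabled=true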
How to Enable Automatic Container Rightsizing
Step 1: Access the Compute Copilot Container Rightsizing Tab
Step 2: Enable Container Rightsizing On a Specific Container
How to Disable Container Rightsizing
Once container rightsizing is disabled, the Vertical Pod Autoscaler (VPA) is removed, and the pod will revert to using the resource requests defined in its controller kind (e.g., Deployment, DaemonSet, etc.). This means that the container will no longer receive automated adjustments based on historical data and will instead rely on the initial configuration set within the controller.
Frequently Asked Questions (FAQ)
Does it overwrite my original workload resources? No, container rightsizing does not overwrite your workload. The Vertical Pod Autoscaler (VPA) updates the container at the pod level, so the original workload configurations, such as those defined in your Deployment or DaemonSet, remain unchanged.
Is there downtime?
No, there is no downtime.
While recommendation updates from the nOps VPA will cause pod restarts, the process is handled in a rolling fashion equivalent to a kubectl rollout restart.
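For reference, a minimal sketch of the equivalent manual operation on a hypothetical deployment named my-app (the nOps VPA performs the rolling update for you; these commands are only for comparison):

kubectl rollout restart deployment/my-app   # triggers a rolling restart of the pods
kubectl rollout status deployment/my-app    # waits until the new pods are ready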
Should I enable container rightsizing for containers in the kube-system or default namespaces? It is not recommended to enable container rightsizing for containers in the kube-system or default namespaces. These namespaces typically contain Kubernetes-specific workloads that are critical to the cluster's functionality. Modifying the resource requests and limits of these workloads could lead to unforeseen issues or disrupt essential services. It’s best to limit container rightsizing to application-specific namespaces.
How does container rightsizing handle limits? Container rightsizing will respect the limits that you set. If our data analytics makes a recommendation that exceeds the currently configured CPU or memory limit, the requests will be set to the limit values. Container Rightsizing will not change the limits on any of your workloads, and will not set requests higher than limits.
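As a hypothetical illustration (the values below are made up, not produced by the product): if a container has a 512Mi memory limit and the analysis recommends a higher memory request, the applied request is capped at the limit:

resources:
  limits:
    cpu: 500m
    memory: 512Mi    # limits are never modified by Container Rightsizing
  requests:
    cpu: 350m        # applied recommendation, below the CPU limit
    memory: 512Mi    # a recommendation of e.g. 700Mi would be capped at the 512Mi limit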
Caveats
- Showback Delay: Certain parts of the current container rightsizing data analytics pipeline are based on showback and the AWS Cost and Usage Report (CUR). This means that there may be a delay of up to 48 hours before new workloads initially show up in the Container Rightsizing dashboard. Hourly recommendation updates are not subject to the 48-hour delay. The product roadmap currently includes plans to reduce or eliminate this delay.
- HPA Not Supported: It is not recommended to use container rightsizing with the HPA. The HPA can misinterpret lower resource requests as usage spikes, as it calculates utilization as Usage / Requests. This may lead to unintended scaling behavior: for example, a pod using 200m of CPU shows 40% utilization against a 500m request but 80% against a 250m request, which can push it past a typical HPA target and trigger a scale-out. It is best to use the VPA independently for optimal performance.
- Official VPA Not Supported: The nOps VPA uses the same CRDs as the official Kubernetes VPA, so installing both the official VPA and the nOps VPA can result in workload instability. A quick way to check whether the official VPA is already installed is sketched after this list.
- Single Replica Workloads: By default, single-replica workloads will not receive resource recommendations, even if they are enabled in the UI. Applying recommendations requires a pod restart, which would cause downtime for that workload. If you are interested in automatically rightsizing single-replica workloads that are downtime tolerant, contact customer success to discuss your requirements.
- CronJobs: Our Vertical Pod Autoscaler (VPA) solution fully supports CronJobs, enabling automated rightsizing for resources associated with periodic workloads. To ensure compatibility, the following requirement must be met: in the CronJob specification (spec.jobTemplate.spec.template.metadata.labels), there must be an app label with a value matching the name of the CronJob. For example, if the CronJob is named demo-cronjob, the app label should be set as follows:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: demo-cronjob
spec:
  schedule: "0 * * * *"            # example schedule, added so the manifest is valid; use your own
  jobTemplate:
    spec:
      template:
        metadata:
          labels:
            app: demo-cronjob      # must match the CronJob name
        spec:
          containers:
            - name: demo-container
              image: demo-image
          restartPolicy: OnFailure # required for Job pod templates; Never is the other valid value
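Related to the Official VPA Not Supported caveat above, a minimal way to check whether VerticalPodAutoscaler CRDs are already present in the cluster, using standard kubectl:

# Lists any VerticalPodAutoscaler CRDs in the cluster; if this returns results,
# the official VPA (or another VPA implementation) may already be installed
kubectl get crds | grep -i verticalpodautoscaler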