
Onboarding your EKS clusters to Compute Copilot for EKS Karpenter

Why use Compute Copilot EKS?

Learn more about how Compute Copilot for EKS can put your EKS cost optimization on auto-pilot here.

Prerequisites:

  1. You must be logged in to your nOps account.
  2. Your AWS account must be connected to your nOps account.
  3. You must have a Kubernetes cluster with Karpenter installed, version 0.33 or higher.
  4. GP3 storage class must be configured in your Kubernetes cluster for Container Insights and Container Rightsizing.
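One way to satisfy prerequisite 4, assuming the AWS EBS CSI driver is installed in the cluster, is a StorageClass like the following sketch (the name and the default-class annotation are illustrative, not required by nOps):

```yaml
# Illustrative gp3 StorageClass; assumes the AWS EBS CSI driver is installed.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3
  annotations:
    # Optional: make gp3 the cluster default
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
volumeBindingMode: WaitForFirstConsumer
```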

About Compute Copilot for EKS

Compute Copilot for EKS is a powerful service designed to automatically optimize compute-based workloads, reducing AWS EKS costs by intelligently adjusting your cluster in three key dimensions:

  • Price Efficiency
  • Container Efficiency
  • Node Efficiency

For a detailed breakdown of each dimension and how Compute Copilot enhances them, see the EKS Insights Dashboard Documentation. This documentation explains all the metrics displayed on the EKS page, providing deeper insights into your cluster’s efficiency and potential savings.

Steps to Configure Your EKS Cluster for BC+, Container Rightsizing, and Container Insights

IAM Roles and Permissions Setup

Purpose

To enable our agent stack to function within the EKS clusters, we need to create IAM roles with the following permissions:

  • Store container insights metrics in an S3 bucket for Container Efficiency: The S3 bucket will reside in the customer's AWS account, so the IAM role must be configured to allow access to that specific bucket.

  • Subscribe to an SQS queue for real-time data fetching for the Cluster Dashboard: The SQS queue will reside in the nOps AWS account, so the IAM role needs to be granted permission to access that queue across accounts.
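The two permissions above could be expressed roughly as the IAM policy sketch below. The bucket name, queue ARN, account ID, and region are placeholders, not the actual values nOps provisions; the setup tooling generates the real policy for you.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ContainerInsightsBucketAccess",
      "Effect": "Allow",
      "Action": ["s3:PutObject", "s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::EXAMPLE-container-insights-bucket",
        "arn:aws:s3:::EXAMPLE-container-insights-bucket/*"
      ]
    },
    {
      "Sid": "CrossAccountQueueAccess",
      "Effect": "Allow",
      "Action": ["sqs:ReceiveMessage", "sqs:DeleteMessage", "sqs:GetQueueAttributes"],
      "Resource": "arn:aws:sqs:us-west-2:111111111111:EXAMPLE-nops-queue"
    }
  ]
}
```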

Choosing Between Terraform and CloudFormation

To set up IAM roles and permissions, you can use either Terraform or CloudFormation. The key differences between these options are:

  • Automatic Updates: CloudFormation supports automatic updates when a new version is available, while Terraform requires manual updates.
  • Multi-Region Deployment: CloudFormation allows you to define AWS regions where your clusters exist and applies the setup across all specified regions. With Terraform, you must run the setup separately for each region.

Choose the tool that best fits your operational needs and update strategy.

CloudFormation Setup Steps

  1. Navigate to EKS from the Compute Copilot top menu.

  2. Click on the EKS cluster you want to configure.

  3. Go to the Cluster Configuration section.

  4. Generate an API key from the API Key section and save it for later use.

  5. Click the Setup button for CloudFormation and proceed.

  6. You will be redirected to the CloudFormation stack setup.

  7. Fill in the Input Variables, paying special attention to the Token field, which is not pre-filled. The template accepts the following parameters:

    • IncludeRegions: Comma-separated list of AWS regions where the solution should operate. Defaults to the region where the stack is created if left blank.
    • RoleName: IAM role name to attach the read policy. Created during onboarding for each AWS account into nOps.
    • CreateIAMUser: Boolean (true/false) specifying whether to create an IAM user. This is required (true) if there is no IAM OIDC provider. Default is false.
    • Environment: Specifies the nOps environment where the solution will run. Allowed values: PROD, UAT. Default: PROD.
    • Token: The nOps API token required for authentication. This is sensitive information and will not be logged.
    • AutoUpdate: Determines whether the stack should automatically update when a new version is released. Allowed values: true, false. Default: true.
  8. Enter the saved API key in the Token field.

  9. Run the CloudFormation stack and return to the nOps platform.

  10. On successful execution, Version and Status should display as Configured.

CloudFormation Update Steps

When the CloudFormation stack requires an update, the IAM Roles and Permissions Setup section in the UI will display the Status as "Outdated". To update it, ensure you are authenticated in the correct AWS account, then click "Update". This action will open the AWS Console on the CloudFormation page, where you can proceed with the update.

If the UI displays the Status as "Outdated" and the Version as "N/A", this indicates that your CloudFormation stack is running a version prior to onboarding confirmation support. In this case, you must manually update the stack from the AWS Console by following these steps:

  1. Navigate to the CloudFormation page in the AWS Console.

  2. Locate and click on the nops-container-cost-setup-${account_number} stack.

  3. Click Update.

  4. Select Replace existing template.

  5. Use the following S3 URL for the new template:

    https://nops-rules-lambda-sources.s3.us-west-2.amazonaws.com/container_cost/versions/0.2.0/nops-kubernetes-agent-setup.yml
  6. Update the IncludeRegions parameter if needed.

  7. Provide a nOps Token/API Key (a new API key can be generated in the Configuration tab, where the Helm upgrade command is available).

  8. Click Submit to apply the stack update.
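The console steps above have a rough AWS CLI equivalent, sketched below. The stack name, region, and `--capabilities` flag are assumptions based on the stack naming shown in step 2; only the template URL comes from step 5.

```shell
# Manual stack update via the AWS CLI (sketch of the console steps above).
# Stack name and region are illustrative; substitute your account number,
# regions, and a freshly generated nOps API key.
aws cloudformation update-stack \
  --stack-name "nops-container-cost-setup-123456789012" \
  --template-url "https://nops-rules-lambda-sources.s3.us-west-2.amazonaws.com/container_cost/versions/0.2.0/nops-kubernetes-agent-setup.yml" \
  --parameters ParameterKey=IncludeRegions,ParameterValue="us-west-2" \
               ParameterKey=Token,ParameterValue="$NOPS_API_KEY" \
  --capabilities CAPABILITY_NAMED_IAM
```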

Terraform Setup Steps

  1. Navigate to EKS from the Compute Copilot top menu.

  2. Click on the EKS cluster you want to cost-optimize.

  3. Go to the Cluster Configuration section.

  4. Click the Setup button for Terraform and proceed.

  5. Generate an API key and save it for later use.

  6. Copy and paste the Terraform module call into your Terraform configuration.

  7. Update the Input Variables if necessary. These variables have default values and do not need to be set unless you wish to override them. This module has no required variables.

    • cluster_names: A list of EKS cluster names targeted for deploying resources. Leave it empty to create roles for all EKS clusters in the region. Default is an empty list ([]).
    • create_bucket: Boolean (true/false) indicating whether to create an S3 bucket. If the bucket already exists or is located in another region, set it to false. Default is true.
    • create_iam_user: Boolean (true/false) specifying whether to create an IAM user. This is required (true) if there is no IAM OIDC provider. Default is false.
    • environment: Specifies the nOps environment where the solution will operate. Allowed values: PROD, UAT. Default is PROD.
    • role_name: The IAM role name to attach the read policy. If left empty, it will be automatically fetched. Default is an empty string ("").
  8. Initialize and apply your Terraform configuration:

terraform init
terraform plan -out=plan
terraform apply plan
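The module call from step 6 might look like the sketch below. The module source is a placeholder; copy the exact snippet, including the real source address, from the Cluster Configuration page.

```hcl
# Sketch of the module call from step 6. The source address is a placeholder —
# use the exact module call shown on the Cluster Configuration page.
module "nops_compute_copilot" {
  source = "<module source from the nOps Configuration page>"

  # All variables are optional; the values below mirror the documented defaults,
  # except cluster_names, shown here with an illustrative cluster.
  cluster_names   = ["my-eks-cluster"] # empty list = all EKS clusters in the region
  create_bucket   = true               # false if the bucket exists or is in another region
  create_iam_user = false              # true only if there is no IAM OIDC provider
  environment     = "PROD"             # PROD or UAT
  role_name       = ""                 # auto-fetched when left empty
}
```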

Install the nOps Agent Stack

Once the Terraform or CloudFormation setup is done, proceed with installing the nOps Agent Stack within the cluster to begin data collection.

  1. On the same Cluster Configuration page, generate a new API key for the nOps Agent Stack.
  2. Copy the custom command and run it in your command line.
  3. Click Test Connectivity to confirm connectivity with the nOps Agent Stack.

How to Enable Container Rightsizing

After completing the IAM Roles and Permissions Setup and Installing the nOps Agent Stack, you can start configuring container rightsizing to optimize resource usage and reduce costs.

The nOps Container Insights Agent, a core part of the nOps Kubernetes Agent Stack, will now start collecting data on the actual resource consumption of your Kubernetes workloads. This data is used to generate rightsizing recommendations, helping you optimize CPU and memory allocations for better efficiency.

For detailed steps on enabling container rightsizing, refer to the Container Rightsizing Documentation.

Steps to Configure Your EKS Cluster for Price Efficiency

After completing the Terraform or CloudFormation setup and installing the nOps Agent Stack in your cluster, you can proceed with the final configuration for Price Efficiency.

Create EC2NodeClass

An EC2NodeClass can be created in two ways:

  1. Auto Configuration

    • For the selected EKS cluster, select Create a NodeClass on the Configuration page.

    • Assign a unique name to it.

    • Choose the AMI Family from the dropdown menu.

    • Add Subnet IDs manually or with Search by tags.

    • Add Security Group IDs manually or with Search by tags.

    • Select the IAM Role to be used by the EC2NodeClass. If the desired role does not appear in the list, you can enter the role name manually.

    • Optionally, configure Metadata Options and user data to run commands after a node starts.

    • Optionally, create a Device Mapping by providing the necessary details.

    • Click Create Automatically and the NodeClass will be created.

      note

      You can create multiple NodeClasses.

  2. Manual Configuration

    • For the selected EKS cluster, select Create an EC2NodeClass.

    • Insert the YAML code and validate it. Make sure you specify a unique name in the metadata.name property of your resource.

    • Now select Create manually.

      note

      In the NodeClasses list you can find resources created without nOps, if there are any. These are shown without the nOps icon and are available only in YAML format in View Mode. When creating NodePools, you can select any NodeClass as a reference, whether it was created via nOps or directly in the cluster by you or your team.
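For the manual path, a minimal EC2NodeClass might look like the following. This is an illustrative sketch using the standard Karpenter v1beta1 schema (supported from Karpenter 0.33); the name, IAM role, and discovery tags are placeholders for your own values.

```yaml
# Minimal illustrative EC2NodeClass (Karpenter v1beta1 schema).
# Name, role, and discovery tags are placeholders.
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: my-nodeclass                      # must be unique
spec:
  amiFamily: AL2
  role: KarpenterNodeRole-my-cluster      # node IAM role, not an instance profile
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: my-cluster
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: my-cluster
```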

Create NodePool

A NodePool can be created in three ways:

  1. Import Your NodePool

    Import your Node Pools, and nOps will handle the rest. Once you import, you can make any changes to the Node Pools in your code repository and they will automatically sync with nOps. This eliminates manual overhead and facilitates centralized management, ensuring your configurations are always in sync and optimized to prioritize performance.

  • To import Node Pools, simply click the Import Nodepool button on the Cluster Configuration page. This will instantly create an nOps copy linked to the original version and send it for deployment into your cluster.

  2. Auto Configuration

    • For the selected EKS cluster, select create NodePool.

    • Assign a unique NodePool name.

    • Select the created EC2NodeClass to pull configuration from.

    • Select Availability Zones.

    • Select Capacity Type: Spot, On-Demand, or both (selecting both is recommended).

    • Optionally, set a Max Limit for vCPUs and Memory. Setting both fields to 0 instructs Karpenter to apply no limits when allocating resources.

    • Create Taints and Labels if required [for specific provisioner service].

    • In the instance selection section, you can filter instance types by Architecture, Generation, Accelerator details, Networking, Storage, and more. After filtering, click Select All Eligible Instance Families for auto-selection.

    • Set the weight (optional).

    • Now select Create Automatically.

      note

      You can create multiple NodePools, but each NodePool references exactly one NodeClass.

  3. Manual Configuration

    • For the selected EKS cluster, select create NodePool.
    • Insert the YAML code and validate it. Make sure you specify a unique name in the metadata.name property of your resource, and add the correct reference to the desired NodeClass in spec.template.spec.nodeClassRef.
    • Now select Create manually.
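For the manual path, a minimal NodePool might look like the following sketch (standard Karpenter v1beta1 schema; the names, limits, and weight are placeholders). Note the unique metadata.name and the nodeClassRef pointing at an existing EC2NodeClass.

```yaml
# Minimal illustrative NodePool (Karpenter v1beta1 schema); placeholder values.
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: my-nodepool                       # must be unique
spec:
  template:
    spec:
      nodeClassRef:
        name: my-nodeclass                # must match an existing EC2NodeClass
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]   # both capacity types, as recommended
  limits:
    cpu: "1000"                           # cap on total vCPUs this NodePool may provision
    memory: 1000Gi
  weight: 10                              # optional priority among NodePools
```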

As soon as cluster status displays Configured, Compute Copilot for EKS will start its magic to generate savings on the connected EKS cluster.

FAQ

  1. Is Karpenter mandatory to install EKS Compute Copilot?

    • No, EKS Compute Copilot now supports both Karpenter and Cluster Autoscaler. However, we recommend migrating from Cluster Autoscaler to Karpenter for better node efficiency and automation, since Karpenter dynamically adjusts node sizes based on workload demands.
    • Note that Karpenter or Cluster Autoscaler are only required for Price Efficiency. Neither is required for Container Rightsizing or Container Insights.
  2. Do EKS Compute Copilot Karpenter NodePools take precedence over my own Karpenter NodePools?

    • Yes. Once you onboard to Compute Copilot, the NodePools already existing in the cluster are generally no longer used, although you do not need to delete them; they simply sit idle while Compute Copilot NodePools are present. Note: this may not always hold, depending on the configuration of the existing NodePools.
  3. Can we import the YAML for existing node class and NodePools to configure Compute Copilot EKS NodePools?

    • Yes, you can still use your existing YAML code. However, we now offer a new import feature that allows you to import your existing node classes and node pools directly. The Create NodePool section of this documentation explains how to import node classes and node pools, making the configuration process easier.
  4. Does nOps provide DevOps support to customers migrating from cluster autoscaler to Karpenter?

    • nOps has experienced engineers who provide free support to all clients for Karpenter migration, including as much on-call support as you need to review your Karpenter settings. We also hold monthly “Karpenter Office Hours” with our existing customers to build a community where we share our experience, customers can ask questions in an open forum, and attendees see hands-on demonstrations of basic Karpenter implementation practices. However, nOps is not a services company: we do not take responsibility for your migration project, provide dedicated DevOps resources, or offer managed services. Our support is limited to guidance and reviews. Note: If you want a recording of the most recent “Karpenter Office Hours”, please email us at ‘support@nops.io’.
  5. Does nOps Compute Copilot for EKS support configuration of Multiple NodePools while setting up nOps Karpenter provisioner?

    • Yes, absolutely. Configuring multiple NodePools is a common customer request, and we have made it easy to do. You have two options: configure them in the UI using Auto Configuration, or import an existing YAML template using Manual Configuration.
  6. I have multiple applications running on the clusters which have different instance requirements, can Compute Copilot support different instance types in such cases?

    • Yes, Compute Copilot does allow you to pick & choose only those instance types that you’d want nOps Karpenter provisioner to provision.
  7. Is it possible to set minimum threshold values for metrics like CPU and Memory while configuring my Karpenter provisioner in Compute Copilot?

    • Yes, we allow users to set minimum CPU and Memory values.
  8. Can I put my Stateful workloads on EKS Compute Copilot?

    • EKS Compute Copilot does not come with any limitation on Stateful workloads. However, we do not recommend putting Stateful Workloads on Spot Instances if they are running mission critical operations.
  9. How can I identify resources managed by Compute Copilot EKS?

    • Compute Copilot EKS adds a common tag to all resources it manages. This is accomplished by automatically adding a tag in the spec.tags section of AWSNodeTemplates. The tag is nops:nks:enabled=true, and it appears on all EC2 instances managed by Compute Copilot EKS.
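Given that tag, one way to list the managed instances is a standard describe-instances filter, sketched below (the region is illustrative, and the command requires AWS credentials):

```shell
# List EC2 instances carrying the Compute Copilot management tag.
# Region is illustrative; requires AWS credentials with ec2:DescribeInstances.
aws ec2 describe-instances \
  --region us-west-2 \
  --filters "Name=tag:nops:nks:enabled,Values=true" \
  --query "Reservations[].Instances[].InstanceId"
```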
  10. What is the architecture of the CloudFormation Stack?

    • The stack automates the process of creating and managing IAM roles for EKS clusters, ensuring the proper roles are associated with the service account used by the nOps Agent Stack.

    (CloudFormation architecture diagram)

    note
    • Optional IAM User Support: The template can also handle situations where an IAM user might be needed, making it suitable for environments lacking OIDC identity providers.
    • Onboarding Confirmation: The Lambdas send a request to nOps to confirm CloudFormation onboarding.
  11. What is the architecture of the nOps Terraform module?

    • The Terraform module is hosted in the public Terraform Registry, allowing customers to use it as a source in their own Terraform configurations.

    For each AWS account and region where clusters exist, customers apply the Terraform module. This process:

    • Creates an S3 bucket and IAM roles for each cluster, enabling the agent to export data.
    • Creates a cross-account role for the backend, allowing it to copy data into the platform.

    Additionally, nOps APIs ensure that:

    • Each cluster has the necessary IAM roles for agent installation.
    • The S3 bucket is registered in the backend table, triggering the data copy workflows.