# Add Node Pools to an AWS EKS Cluster

Different ML workloads need different compute resources. Sometimes 2 CPUs are enough, but other times you need 2 GPUs. With Kubernetes, you can create multiple node pools, each containing a different type of instance/machine. By adding auto-scaling, you can ensure nodes are only live while they are being used.

This is especially useful when setting up a Workers cluster for running your cnvrg jobs. In this guide, we will explain how to add extra node pools to an existing AWS EKS cluster.

In this guide, you will learn how to:

  • Create new node pools in your existing EKS cluster using eksctl.

# Prerequisites: prepare your local environment

Before you can complete the installation, you must install and prepare the following dependencies on your local machine:

  • eksctl – the official CLI for Amazon EKS, used here to create the node pools.
  • kubectl – the Kubernetes CLI, used here to configure the cluster autoscaler.
  • AWS credentials with permission to manage the EKS cluster, configured so that eksctl can use them.
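Before continuing, you can optionally run a few standard checks to confirm the tools are available and your credentials resolve (the aws call assumes you manage your credentials with the AWS CLI):

```bash
# Confirm eksctl and kubectl are installed and on your PATH
eksctl version
kubectl version --client

# Optional: confirm your AWS credentials resolve to the expected account
# (assumes the AWS CLI is installed and configured)
aws sts get-caller-identity
```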

# Create the YAML Recipe for the Node Pools

We will use eksctl to add the node pools to the cluster in AWS. To use eksctl, you will need to create a YAML file that provides the necessary configuration for the cluster.

In the file, you must set:

  • name: the name of the cluster you are adding the node pools to.
  • region: the name of the Amazon region in which the cluster resides.

You can, of course, modify the details of the node pools to match your own requirements. The example file below contains two node pools: a CPU example and a GPU example. The file is commented with details about each entry and how it can be customized. You can add or remove nodeGroup entries to construct all the node pools you require.

Copy the following text into a new file called nodes.yaml and then edit it accordingly:

---
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: <cluster-name> # This should be the name of the existing cluster.
  region: <amazon-region> # This should be the Amazon region the cluster is in.
nodeGroups:
  - name: cpu-pool # Customizable. The name of the node pool.
    instanceType: m5.xlarge # Customizable. The instance type for the node pool.
    volumeSize: 100 # Required. The disk size (in GiB) for the nodes.
    minSize: 1 # Customizable. The minimum number of nodes to auto scale down to.
    maxSize: 5 # Customizable. The maximum number of nodes to auto scale up to.
    desiredCapacity: 2 # Customizable. The initial number of nodes to have live.
    privateNetworking: true # Required.
    iam:
      withAddonPolicies:
        autoScaler: true # Required.
        imageBuilder: true # Required.
    tags:
      k8s.io/cluster-autoscaler/enabled: 'true' # Required.
  - name: gpu-pool # Customizable. The name of the node pool.
    instanceType: g3s.xlarge # Customizable. The instance type for the node pool.
    volumeSize: 100 # Required. The disk size (in GiB) for the nodes.
    minSize: 0 # Customizable. The minimum number of nodes to auto scale down to. Scaling to 0 requires the autoscaler setup described below.
    maxSize: 5 # Customizable. The maximum number of nodes to auto scale up to.
    desiredCapacity: 0 # Customizable. The initial number of nodes to have live. Must not be lower than minSize.
    iam:
      withAddonPolicies:
        autoScaler: true # Required.
        imageBuilder: true # Required.
    privateNetworking: true # Required.
    taints:
      nvidia.com/gpu: "present:NoSchedule" # Required.
    labels:
      accelerator: 'nvidia' # Required.
    tags:
      k8s.io/cluster-autoscaler/enabled: 'true' # Required.
      k8s.io/cluster-autoscaler/node-template/taint/nvidia.com/gpu: 'present:NoSchedule' # Required.
      k8s.io/cluster-autoscaler/node-template/label/accelerator: 'nvidia' # Required.
      

# Use custom labels

You can label a node group and use that label as part of a compute template to dictate which jobs can run on its nodes.

The label must be added when creating the node group.

The general rule is as follows: for every label you set on the node group, add the matching autoscaler node-template tag:

labels:
  key: "value"
tags:
  k8s.io/cluster-autoscaler/node-template/label/key: 'value'  

For example, if you wanted to label a node group with compute_type:

labels:
  compute_type: 'training'

You must add the corresponding autoscaler tag:

tags:
  k8s.io/cluster-autoscaler/node-template/label/compute_type: 'training'

You could then create a compute template and, in its Labels field, write compute_type='training'.
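To put this together, here is a minimal sketch of a node group that uses the hypothetical compute_type label. The pool name, instance type, and sizes are illustrative only and follow the same pattern as the nodes.yaml example above:

```yaml
nodeGroups:
  - name: training-pool # Illustrative name.
    instanceType: m5.2xlarge # Illustrative instance type.
    volumeSize: 100
    minSize: 1
    maxSize: 5
    desiredCapacity: 1
    privateNetworking: true
    iam:
      withAddonPolicies:
        autoScaler: true
        imageBuilder: true
    labels:
      compute_type: 'training' # The custom label applied to every node in the pool.
    tags:
      k8s.io/cluster-autoscaler/enabled: 'true'
      k8s.io/cluster-autoscaler/node-template/label/compute_type: 'training' # The matching autoscaler tag.
```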

# Create the Node Pools using the YAML file

To add the node pools to your cluster based on your prepared nodes.yaml file, use the following command:

eksctl create nodegroup --config-file=nodes.yaml
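Once the command completes, you can confirm that the node groups were created and their nodes have joined the cluster (standard eksctl and kubectl checks; replace <cluster-name> and <amazon-region> with your own values):

```bash
# List the node groups eksctl manages for the cluster
eksctl get nodegroup --cluster <cluster-name> --region <amazon-region>

# List the nodes that have joined the cluster
kubectl get nodes
```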

# Enable Autoscaling to 0 Nodes

By default, EKS autoscaling does not allow node pools to scale down to 0 nodes. This means you must always have at least one live node and are always paying for compute. It is, however, possible to enable support for scaling a node pool down to 0 nodes, so that you can set minSize: 0 in your YAML file.

To deploy the cluster autoscaler so that it supports node pools with minSize: 0, run the following command:

kubectl apply -f https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml
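You can check that the cluster autoscaler deployment was created and its pod is running (the deployment name and app label below come from the example manifest applied above):

```bash
# Confirm the cluster-autoscaler deployment exists in kube-system
kubectl -n kube-system get deployment cluster-autoscaler

# Confirm its pod is running (label as defined in the example manifest)
kubectl -n kube-system get pods -l app=cluster-autoscaler
```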

# Optimize the Autoscaler

We will now make a few slight changes to the autoscaler's configuration to optimize it for use with cnvrg.

# Disable safe-to-evict

Disable safe-to-evict with the following command, so that the cluster autoscaler does not evict its own pods:

kubectl -n kube-system annotate deployment.apps/cluster-autoscaler cluster-autoscaler.kubernetes.io/safe-to-evict="false"
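If you want to verify the annotation, you can print the deployment's annotations with a standard kubectl query:

```bash
# Show all annotations on the cluster-autoscaler deployment;
# cluster-autoscaler.kubernetes.io/safe-to-evict should be "false"
kubectl -n kube-system get deployment cluster-autoscaler -o jsonpath='{.metadata.annotations}'
```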

# Change the behavior of the autoscaler

We will now update the configuration of the autoscaler. Use the following command to open the configuration in your terminal editor:

kubectl -n kube-system edit deployment.apps/cluster-autoscaler

Now:

  1. Add the following two flags to the list under - command:
     - --balance-similar-node-groups
     - --skip-nodes-with-system-pods=false

  2. (Optional) You can add the following flag to change how long a node must be unneeded before it is scaled down. Set it to the desired number of minutes; the default is 10:
     - --scale-down-unneeded-time=15m

  3. Update the line - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/<YOUR CLUSTER NAME>, replacing <YOUR CLUSTER NAME> with the name of your cluster.
  4. Save and exit the editor (for example, by typing :wq in vi).

TIP

The final configuration should look similar to this. It may also include the --scale-down-unneeded-time flag if you added it:

spec:
      containers:
      - command:
        - ./cluster-autoscaler
        - --v=4
        - --stderrthreshold=info
        - --cloud-provider=aws
        - --skip-nodes-with-local-storage=false
        - --expander=least-waste
        - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/<YOUR CLUSTER NAME>
        - --balance-similar-node-groups
        - --skip-nodes-with-system-pods=false
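After saving your changes, you can watch the new configuration roll out and check the autoscaler's logs (standard kubectl commands; the app=cluster-autoscaler label is the one used by the example manifest):

```bash
# Wait for the updated autoscaler deployment to finish rolling out
kubectl -n kube-system rollout status deployment/cluster-autoscaler

# Tail the autoscaler logs to confirm it starts cleanly with the new flags
kubectl -n kube-system logs -l app=cluster-autoscaler --tail=20
```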

# Match the autoscaler's version with your Kubernetes version

You must ensure the version of the autoscaler matches the version of Kubernetes your cluster is running.

First, check which Kubernetes version your cluster is running. You can find this with the kubectl version command. It will return the client version and the server version. Check the output to find the server's major and minor version. The output will look like this:

Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.5", GitCommit:"20c265fef0741dd71a66480e35bd69f18351daea", GitTreeState:"clean", BuildDate:"2019-10-15T19:16:51Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.11", GitCommit:"ec831747a3a5896dbdf53f259eafea2a2595217c", GitTreeState:"clean", BuildDate:"2020-05-29T19:56:10Z", GoVersion:"go1.12.17", Compiler:"gc", Platform:"linux/amd64"}

In this example, the cluster is running Kubernetes 1.15.

Now, go to the cluster-autoscaler releases page (https://github.com/kubernetes/autoscaler/releases) and find the latest release for your major and minor version of Kubernetes. In this example, it is 1.15.7.

Finally, run the following command, customized with the latest release of the autoscaler for your Kubernetes version:

kubectl -n kube-system set image deployment.apps/cluster-autoscaler cluster-autoscaler=us.gcr.io/k8s-artifacts-prod/autoscaling/cluster-autoscaler:vX.Y.Z

For our example, it would be:

kubectl -n kube-system set image deployment.apps/cluster-autoscaler cluster-autoscaler=us.gcr.io/k8s-artifacts-prod/autoscaling/cluster-autoscaler:v1.15.7

Examples for other versions of Kubernetes:

kubectl -n kube-system set image deployment.apps/cluster-autoscaler cluster-autoscaler=us.gcr.io/k8s-artifacts-prod/autoscaling/cluster-autoscaler:v1.16.6
kubectl -n kube-system set image deployment.apps/cluster-autoscaler cluster-autoscaler=us.gcr.io/k8s-artifacts-prod/autoscaling/cluster-autoscaler:v1.17.3
kubectl -n kube-system set image deployment.apps/cluster-autoscaler cluster-autoscaler=us.gcr.io/k8s-artifacts-prod/autoscaling/cluster-autoscaler:v1.18.2
kubectl -n kube-system set image deployment.apps/cluster-autoscaler cluster-autoscaler=us.gcr.io/k8s-artifacts-prod/autoscaling/cluster-autoscaler:v1.19.0

WARNING

The above example commands may not reflect the latest releases. Check the cluster-autoscaler releases page (https://github.com/kubernetes/autoscaler/releases) to find the latest release for your version of Kubernetes.
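After updating the image, you can confirm the deployment is now using the version you set (a standard kubectl query):

```bash
# Print the image currently configured on the cluster-autoscaler deployment
kubectl -n kube-system get deployment cluster-autoscaler -o jsonpath='{.spec.template.spec.containers[0].image}'
```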

# Conclusion

The node pools have now been added to your cluster. You can add more by following the same steps. If you have already deployed cnvrg to the cluster, or added the cluster as a compute resource inside cnvrg, no further setup is required and the node pools will be immediately usable by cnvrg.
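If you would like a final check, listing the nodes with their labels will show the pool-specific labels you configured (for example, the accelerator label from the GPU pool above, once a GPU node is live):

```bash
# List nodes with all their labels to confirm pool-specific labels are applied
kubectl get nodes --show-labels
```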
