Welcome to the Create an EKS Cluster guide. This how-to guide will take you through the steps of creating an AWS Elastic Kubernetes Service (EKS) cluster with an appropriate configuration for deploying an MLOps platform such as Kubeflow.
Requirements:
- Local machine with Ubuntu 22.04 or later
- An AWS account (How to create an AWS account)
Steps
- Install and set up AWS CLI
- Install eksctl
- Install kubectl
- Deploy EKS cluster
- Clean up resources (if needed)
Install and set up AWS CLI
First, install AWS CLI on your local machine and then set it up. You can use any of the authentication methods available for AWS CLI.
For example, you can use IAM user credentials. If you decide to follow this path, the IAM user you create should have these minimum IAM policies for eksctl to work. For more details, see the eksctl documentation.
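Once the CLI is set up, a quick check like the one below can confirm that it is installed and that the credentials are picked up. This is a minimal sketch: the helper name check_aws_cli is hypothetical, while aws --version and aws sts get-caller-identity are standard AWS CLI commands.

```shell
# Hypothetical helper (not part of the AWS CLI): checks that the CLI is
# installed and that the configured credentials are accepted by AWS.
check_aws_cli() {
  if ! command -v aws >/dev/null 2>&1; then
    echo "aws CLI not found on PATH" >&2
    return 1
  fi
  aws --version
  # Prints the account ID and ARN of the identity the CLI is using
  aws sts get-caller-identity
}
```

If you use IAM user credentials, run aws configure first to store them, then call check_aws_cli.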
SSH Key pair
This is not a hard requirement, but you’ll usually want to SSH into the instances created by this tutorial, for example for debugging. For this, you will need a key pair in the region where the instances will be created. Here is the official documentation for this. You can either create a new key pair or import one you already have.
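Both options can also be carried out from the AWS CLI. The sketch below assumes the CLI is already configured. The function names are hypothetical, and the key pair name, region and file path are values you supply.

```shell
# Hypothetical helpers wrapping standard AWS CLI calls.

# Create a new key pair and save the private key locally.
# Arguments: key pair name, region.
create_key_pair() {
  key_name="$1"
  region="$2"
  aws ec2 create-key-pair \
    --region "$region" \
    --key-name "$key_name" \
    --query 'KeyMaterial' \
    --output text > "${key_name}.pem" || return 1
  # Restrict permissions so that ssh accepts the private key file
  chmod 400 "${key_name}.pem"
}

# Import an existing public key as an EC2 key pair.
# Arguments: key pair name, region, path to the public key file.
import_key_pair() {
  key_name="$1"
  region="$2"
  public_key="$3"
  aws ec2 import-key-pair \
    --region "$region" \
    --key-name "$key_name" \
    --public-key-material "fileb://${public_key}"
}
```

For example, import_key_pair my-key-pair eu-central-1 ~/.ssh/id_rsa.pub would register an existing public key under the name my-key-pair, assuming that file exists on your machine.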
Install eksctl
Install eksctl. See here for instructions.
Install kubectl
Install kubectl. See here for instructions. You should have no problem following the guide with any version of kubectl, but note that we are using version 1.24.x, since the latest Kubernetes version supported by Kubeflow is 1.24.
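After installing both tools, you can confirm they are on your PATH and see which versions you have. This is a small sketch with a hypothetical helper name; eksctl version and kubectl version --client are the tools' own commands.

```shell
# Hypothetical helper: print the installed client versions of both tools.
check_tools() {
  for tool in eksctl kubectl; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool not found on PATH" >&2
      return 1
    fi
  done
  eksctl version
  # --client skips contacting a cluster, which does not exist yet
  kubectl version --client
}
```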
Deploy EKS cluster
To keep the process of creating an EKS cluster as simple as possible, we will use a YAML file.
Do not forget that this deployment will incur charges for every hour the cluster is running.
Clone the following repository containing the EKS YAML file.
git clone https://github.com/canonical/kubeflow-examples.git
cd kubeflow-examples/eks-cluster-setup
Before proceeding with deployment, you may need to configure some of the fields in the cluster.yaml file in the above directory. This file was created with minimum requirements in mind for deploying Charmed Kubeflow/MLFlow.
- region: The cluster will be deployed by default to the eu-central-1 region. Feel free to edit metadata.region and availabilityZones according to your needs.
- ssh key: As mentioned above, edit the managedNodeGroups[0].ssh.publicKeyName field with your key pair name in order to be able to SSH into the new EC2 instances.
- instance type: The cluster will be deployed with EC2 instances of type t2.2xlarge for worker nodes, according to the managedNodeGroups[0].instanceType field. This type should be sufficient for an MLOps platform, but it has been observed that for MLFlow and Kubeflow integration, t3.2xlarge is required due to its higher network capabilities. See here for more details on instance types.
- k8s version: This cluster will use Kubernetes version 1.24 by default. Make sure to edit this according to your needs. For Charmed Kubeflow, see the supported versions and use the latest supported version available for the bundle you're deploying.
- worker nodes: This cluster will have 2 worker nodes. Feel free to edit maxSize and minSize under managedNodeGroups[0] according to your needs.
- volume size: Each worker node will have a gp2/gp3 disk of size 100 GB. Feel free to edit managedNodeGroups[0].volumeSize.
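To show where these fields sit in an eksctl config file, the sketch below writes an illustrative file named cluster-example.yaml. It is not the cluster.yaml from the repository. The cluster name, node group name, availability zones and key pair name are placeholders, and the add-on and IAM sections are left out. The other values mirror the defaults described above.

```shell
# Writes an illustrative eksctl config; NOT the repository's cluster.yaml.
# Names marked "placeholder" are assumptions made for this example.
cat > cluster-example.yaml <<'EOF'
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: example-cluster        # placeholder
  region: eu-central-1
  version: "1.24"              # Kubernetes version

availabilityZones:             # placeholders; must belong to metadata.region
  - eu-central-1a
  - eu-central-1b

managedNodeGroups:
  - name: example-nodes        # placeholder
    instanceType: t2.2xlarge
    minSize: 2
    maxSize: 2
    volumeSize: 100            # disk size per node
    ssh:
      allow: true
      publicKeyName: my-key-pair   # placeholder: your EC2 key pair name
EOF
```

Comparing this layout with the real cluster.yaml should make it easier to find the lines you want to change before deploying.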
Now deploy the cluster with the following command:
eksctl create cluster -f cluster.yaml
This will take some time (approximately 20 minutes). The command will create a Kubernetes cluster of version 1.24 with two worker nodes, where each node will have 100 GB of disk space. It will also create the Amazon EBS CSI driver IAM role, which EC2 instances will use to manage EBS volumes, and will add the Amazon EBS CSI add-on to the cluster. Lastly, it will create a storage class in your cluster, which is needed for deploying an MLOps platform.
Verify kubectl access to cluster
You can check your access to the cluster by running the command below, which should return a list of two nodes.
kubectl get nodes
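For a slightly more thorough check, the sketch below waits until the nodes are ready and then lists the storage classes. The helper name verify_cluster and the five-minute timeout are assumptions; the kubectl subcommands are standard.

```shell
# Hypothetical helper: waits for the worker nodes, then lists storage classes.
verify_cluster() {
  # Blocks until every node reports Ready, or fails after five minutes
  kubectl wait --for=condition=Ready nodes --all --timeout=300s || return 1
  kubectl get nodes -o wide
  # The storage class created during deployment should appear here
  kubectl get storageclass
}
```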
Troubleshoot kubectl
If the eksctl create cluster command completed without errors but kubectl doesn’t return the expected nodes, your kubeconfig file may not be up to date; see here for instructions on updating it. Normally you shouldn’t have to do this, since eksctl takes care of it.
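One common way to refresh the kubeconfig entry is the AWS CLI's update-kubeconfig command, sketched below. The linked instructions may describe a different method. The helper name is hypothetical, and the cluster name and region must match metadata.name and metadata.region in your cluster.yaml.

```shell
# Hypothetical helper: regenerate the kubeconfig entry for a cluster.
# Arguments: cluster name, region.
refresh_kubeconfig() {
  cluster_name="$1"
  region="$2"
  aws eks update-kubeconfig --region "$region" --name "$cluster_name" || return 1
  # Show which context kubectl is using now
  kubectl config current-context
}
```

After calling it, kubectl get nodes should list the worker nodes again.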
Clean up resources
If you no longer need the created EKS cluster, refer here for deletion instructions. Keep in mind that this procedure does not always delete the volumes that were created during the cluster deployment. If any remain and they do not contain data you would like to keep, delete them manually.
Last updated 9 months ago.