Create an EKS cluster for use with an MLOps platform
Welcome to the Create an EKS Cluster guide. This how-to guide takes you through the steps of creating an Amazon Elastic Kubernetes Service (EKS) cluster with an appropriate configuration for deploying an MLOps platform such as Kubeflow.
- Local machine with Ubuntu 22.04 or later
- An AWS account (How to create an AWS account)
- Install and set up AWS CLI
- Deploy EKS cluster
- Clean up resources (if needed)
For example, you can use IAM user credentials. If you decide to follow this path, the IAM user created should have these minimum IAM policies for eksctl to work. For more details on this, see the eksctl documentation.
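Once you have credentials, a quick sanity check along these lines confirms the AWS CLI is configured correctly (these are standard AWS CLI commands; the output shown depends on your account):

```shell
# Configure the AWS CLI with your IAM user's credentials (interactive
# prompts for access key, secret key, default region, and output format).
aws configure

# Verify the credentials work: this prints the account ID and the ARN
# of the IAM identity the CLI is now using.
aws sts get-caller-identity
```

If `get-caller-identity` fails, fix your credentials before proceeding, as eksctl will use the same configuration.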
SSH Key pair
This is not a hard requirement, but you’ll usually want to SSH into the instances created by this tutorial for debugging and similar tasks. For this, you will need a key pair in the region where the instances will be created. Here is the official documentation for this. You can either create a new key pair or import one you already have.
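Both options can be done from the AWS CLI. A sketch, assuming the `eu-central-1` region used later in this guide and an example key name `kubeflow-keypair` (pick your own name):

```shell
# Option 1: create a new key pair in the target region and save the
# private key locally.
aws ec2 create-key-pair \
  --key-name kubeflow-keypair \
  --region eu-central-1 \
  --query 'KeyMaterial' \
  --output text > kubeflow-keypair.pem
chmod 400 kubeflow-keypair.pem

# Option 2: import a public key you already have.
aws ec2 import-key-pair \
  --key-name kubeflow-keypair \
  --region eu-central-1 \
  --public-key-material fileb://~/.ssh/id_rsa.pub
```

Whichever name you choose here is the one to put in the `cluster.yaml` SSH field later.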
- eksctl, see here for instructions.
- kubectl, see here for instructions. You should have no problem following the guide with any version of kubectl, but note that we are using version 1.24.x, since the latest Kubernetes version supported by Kubeflow is 1.24.
For the purpose of making the process of creating an EKS cluster as simple as possible, we will use a ready-made eksctl configuration file.
Do not forget that this deployment will incur charges for every hour the cluster is running.
Clone the following repository containing the EKS cluster configuration file:
git clone https://github.com/canonical/kubeflow-examples.git
cd kubeflow-examples/eks-cluster-setup
Before proceeding with deployment, you may need to configure some of the fields in the cluster.yaml file in the above directory. This file was created with minimum requirements in mind for deploying Charmed Kubeflow/MLFlow.
- region: The cluster will be deployed by default to the eu-central-1 region. Feel free to edit availabilityZones according to your needs.
- ssh key: As mentioned above, edit the managedNodeGroups.ssh.publicKeyName field with your key pair name in order to be able to SSH into the new EC2 instances.
- instance type: The cluster will be deployed with EC2 instances of type t2.2xlarge for worker nodes, according to the managedNodeGroups.instanceType field. This type should be sufficient for an MLOps platform, but it has been observed that in the case of MLFlow and Kubeflow integration, t3.2xlarge is required due to its higher network capabilities. See here for more details on instance types.
- k8s version: This cluster will use Kubernetes version 1.24. This field was set according to Charmed Kubeflow requirements, but feel free to edit it if you're not deploying Kubeflow.
- worker nodes: This cluster will have 2 worker nodes. Feel free to edit managedNodeGroups according to your needs.
- volume size: Each worker node will have a gp2/gp3 disk of size 100GB. Feel free to edit the volume size according to your needs.
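Putting the fields above together, a configuration along these lines illustrates where each setting lives (field names follow the eksctl ClusterConfig schema; the cluster name and zones here are examples, and the actual file in the repository may differ):

```yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: kubeflow-cluster      # example name; use the one from the repository
  region: eu-central-1
  version: "1.24"

availabilityZones: ["eu-central-1a", "eu-central-1b"]

managedNodeGroups:
  - name: workers
    instanceType: t2.2xlarge  # use t3.2xlarge for MLFlow + Kubeflow integration
    desiredCapacity: 2        # number of worker nodes
    volumeSize: 100           # GB per node
    ssh:
      publicKeyName: kubeflow-keypair   # your EC2 key pair name
```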
Now deploy the cluster with the following command:
eksctl create cluster -f cluster.yaml
This will take some time (approximately 20 minutes). The command will create a Kubernetes cluster of version 1.24 with two worker nodes, where each node has 100GB of disk space. It will also create the Amazon EBS CSI driver IAM role, which EC2 instances will use to manage EBS volumes, and will add the Amazon EBS CSI add-on to the cluster. Lastly, it will create a storage class in your cluster, which is needed for deploying an MLOps platform.
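You can confirm the add-on and storage class were created with commands like these (the cluster name is an example; use the name from your cluster.yaml):

```shell
# List add-ons installed on the cluster; the EBS CSI add-on should appear.
eksctl get addon --cluster kubeflow-cluster --region eu-central-1

# List storage classes available in the cluster.
kubectl get storageclass
```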
Verify kubectl access to cluster
You may check your access to the cluster by running the command below, which should return a list of two nodes.
kubectl get nodes
In case the eksctl create cluster command completed successfully without errors but kubectl doesn’t return the expected nodes, there is a chance that your kubeconfig file is not up to date, so see here for instructions on updating it. Normally you shouldn’t have to, since eksctl takes care of this.
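If you do need to refresh the kubeconfig manually, the standard AWS CLI command is (cluster name and region are examples; match your cluster.yaml):

```shell
# Write or update the kubeconfig entry for the cluster so kubectl can reach it.
aws eks update-kubeconfig --name kubeflow-cluster --region eu-central-1
```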
When you no longer need the cluster, see here for deletion instructions. Keep in mind that the above procedure does not always delete the volumes that were created during the cluster deployment, so if any are present and do not contain data you would like to keep, delete them manually.
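A cleanup sketch, assuming you still have the cluster.yaml used for creation (the volume ID in the last command is a placeholder, substitute a real ID from the listing):

```shell
# Delete the cluster and the resources eksctl created for it.
eksctl delete cluster -f cluster.yaml

# Look for leftover EBS volumes; "available" status means a volume is
# no longer attached to any instance.
aws ec2 describe-volumes \
  --region eu-central-1 \
  --filters Name=status,Values=available \
  --query 'Volumes[].{ID:VolumeId,Size:Size}'

# Delete a leftover volume by ID if it holds nothing you want to keep.
aws ec2 delete-volume --volume-id vol-0123456789abcdef0 --region eu-central-1
```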