Deploy Charmed Kubeflow to EKS
Welcome to the Deploy Charmed Kubeflow to EKS guide. This how-to guide takes you through the steps of deploying Kubeflow to an AWS Elastic Kubernetes Service (EKS) cluster. From an architectural point of view, we will spin up an EKS cluster on AWS using eksctl on our local machine. Then, with kubectl and juju, still on our local machine, we will interact with the cluster to deploy Kubeflow there.
Requirements:
- Local machine with Ubuntu 22.04 or later
- An AWS account (How to create an AWS account)
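Since every step in this guide runs eksctl, kubectl, or juju from the local machine, a quick sanity check that they are on the PATH can save time later. The helper below is a minimal sketch; the function name missing_tools is hypothetical and not part of any of these tools. Note that juju is only installed in a later step, so it is expected to be reported as missing until then.

```shell
#!/bin/sh
# Hypothetical helper: report which of the given tools are missing from PATH.
# Prints one missing tool name per line and returns 1 if any are missing.
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            rc=1
        fi
    done
    return "$rc"
}

# Example usage with the three tools this guide relies on:
# missing_tools eksctl kubectl juju || echo "install the tools listed above first"
```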
Deploy EKS cluster
See here for a complete guide on how to do exactly that.
Set up Juju
Set up juju on your local machine to access the remote Kubernetes cloud.
- Install juju. We use the 2.9/stable channel:
sudo snap install juju --classic --channel=2.9/stable
- Add your EKS cluster as a cloud to Juju (the cloud name kubeflow is only an example; any name works).
juju add-k8s kubeflow --client
- Bootstrap a Juju controller (the controller name kubeflow-controller is only an example; any name works).
juju bootstrap --no-gui kubeflow kubeflow-controller
- Add a Juju model (the name kubeflow is mandatory here).
juju add-model kubeflow
- Verify that the kubeflow namespace exists:
kubectl get ns
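The namespace check in the last step can also be scripted instead of read by eye. The following is a minimal sketch, assuming kubectl already points at the EKS cluster; namespace_exists is a hypothetical helper name. It relies on `kubectl get ns -o name`, which prints one line per namespace in the form namespace/NAME.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if the given namespace exists, 1 otherwise.
# `kubectl get ns -o name` prints lines such as "namespace/kubeflow";
# grep -x matches the whole line and -F treats the pattern as a fixed string.
namespace_exists() {
    kubectl get ns -o name | grep -qxF "namespace/$1"
}

# Example usage after `juju add-model kubeflow`:
# namespace_exists kubeflow && echo "kubeflow namespace is present"
```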
Deploy Kubeflow bundle
- Deploy the Charmed Kubeflow bundle with the following command. Note that we are using edge instead of stable due to a known issue with the mysql-k8s charm on EKS (and Charmed Kubernetes). The issue has been fixed, but the fixed version hasn’t been published to stable yet.
juju deploy kubeflow --channel=1.7/edge --trust
- Wait until all charms are in the green/active state. You can check the state of the charms with the following command. If you face any issues, refer to the Known issues section below. Keep in mind that oidc-gatekeeper will not have an Active status until we configure it as shown in the next steps.
juju status --watch 5s --relations
- Make the Kubeflow dashboard accessible by configuring its public URL to be the same as the LoadBalancer’s DNS record.
PUBLIC_URL="http://$(kubectl -n kubeflow get svc istio-ingressgateway-workload -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')"
echo "PUBLIC_URL: $PUBLIC_URL"
juju config dex-auth public-url="$PUBLIC_URL"
juju config oidc-gatekeeper public-url="$PUBLIC_URL"
- Configure Dex-auth credentials. Feel free to use a different (more secure!) password if you wish.
juju config dex-auth static-username=user@example.com
juju config dex-auth static-password=user
- Navigate to the PUBLIC_URL printed above to access the Kubeflow dashboard. You should first see the Dex login screen. Once logged in with the credentials set above, you should see the Kubeflow “Welcome” page.
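One thing to watch for in the public URL step: the hostname field of a LoadBalancer service stays empty until the cloud provider has provisioned the load balancer, in which case PUBLIC_URL would end up as a bare "http://". The sketch below guards against that. The function names public_url_from_hostname and ingress_hostname are hypothetical; the kubectl query is the same one used in the step above.

```shell
#!/bin/sh
# Hypothetical helper: build the public URL from a hostname.
# Fails if the hostname is empty, e.g. while the load balancer
# is still being provisioned.
public_url_from_hostname() {
    if [ -z "$1" ]; then
        echo "LoadBalancer hostname not available yet" >&2
        return 1
    fi
    printf 'http://%s\n' "$1"
}

# Hypothetical helper: query the ingress hostname with the same
# kubectl command as in the guide.
ingress_hostname() {
    kubectl -n kubeflow get svc istio-ingressgateway-workload \
        -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'
}

# Example usage:
# PUBLIC_URL="$(public_url_from_hostname "$(ingress_hostname)")" || exit 1
# juju config dex-auth public-url="$PUBLIC_URL"
# juju config oidc-gatekeeper public-url="$PUBLIC_URL"
```

If the guard fails, waiting a short while and retrying is usually enough, since the hostname appears once provisioning has finished.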
Known issues
kfp-api: Workload failed health check
An issue you might have is the kfp-api component being stuck with a status of maintenance and a message “Workload failed health check”. You can verify the workload state by running the following:
juju ssh kfp-api/0 "PEBBLE_SOCKET=/charm/containers/ml-pipeline-api-server/pebble.socket /charm/bin/pebble services"
Service                 Startup  Current   Since
ml-pipeline-api-server  enabled  inactive  -
If the service is inactive, as shown above, you might need to start it manually:
juju ssh kfp-api/0 "PEBBLE_SOCKET=/charm/containers/ml-pipeline-api-server/pebble.socket /charm/bin/pebble replan"
This is a known issue; see the kfp-api GitHub issue for more info.
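The check and the manual restart can be combined so that replan only runs when a service is actually inactive. This is a minimal sketch, assuming the pebble services output has the four-column layout shown above (Service, Startup, Current, Since); inactive_services is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `pebble services` output on stdin and print the
# names of services whose "Current" column (third field) is "inactive".
# The header row (first line) is skipped.
inactive_services() {
    awk 'NR > 1 && $3 == "inactive" { print $1 }'
}

# Example usage, reusing the commands from this section:
# PEBBLE="PEBBLE_SOCKET=/charm/containers/ml-pipeline-api-server/pebble.socket /charm/bin/pebble"
# if [ -n "$(juju ssh kfp-api/0 "$PEBBLE services" | inactive_services)" ]; then
#     juju ssh kfp-api/0 "$PEBBLE replan"
# fi
```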
If you run into other issues that have not yet been observed on EKS, report them here.
Clean up resources
For EKS clean-up, refer to the guide mentioned here. To clean up Juju, run the following:
juju unregister kubeflow-controller
juju remove-cloud kubeflow
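If you chose different controller or cloud names during setup, the clean-up commands need the same names. The sketch below wraps the two commands in a function that takes the names as arguments and can print the commands instead of running them. The function name cleanup_juju and the DRY_RUN variable are hypothetical conventions, not Juju features.

```shell
#!/bin/sh
# Hypothetical helper: run the Juju clean-up commands for the given
# controller and cloud names. With DRY_RUN=1 the commands are only printed.
cleanup_juju() {
    controller="$1"
    cloud="$2"
    for cmd in "juju unregister $controller" "juju remove-cloud $cloud"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            printf '%s\n' "$cmd"
        else
            # Unquoted on purpose so the string splits into command and arguments.
            $cmd
        fi
    done
}

# Example usage with the names used in this guide:
# DRY_RUN=1 cleanup_juju kubeflow-controller kubeflow   # preview only
# cleanup_juju kubeflow-controller kubeflow             # actually clean up
```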