Explore basic operations

It’s always fun to break stuff, right? In this tutorial, we’re going to intentionally break Charmed Kubeflow in a few simple ways. Then we’ll show you how to debug the failures. Most importantly, we’ll fix everything we break, so that we have a working Kubeflow in the end.

This tutorial is typically done as part of our overall Charmed Kubeflow Learning Path. If you’re doing it on its own, we’ll assume you have Kubeflow version 1.7 deployed with MicroK8s, and that you can access the Kubeflow UI, e.g. at http://10.64.140.43.nip.io.

Let’s get hacking!

Tinkering with the public-url

Alright, mischief-makers, let’s start by playing around with the public-url. We’re going to point it somewhere it shouldn’t be. Run this command in your terminal:

juju config dex-auth public-url=http://BAD_VALID_URL

Now, take a deep breath and visit the Kubeflow UI. Try to log in. Oops! Login is shattered. But don’t panic; we did this on purpose. Remember, we’re in the business of breaking things today!

Why did this happen? Well, we just told our system to look for authentication in a place that doesn’t exist. It’s like sending someone to a fake address and then wondering why they didn’t show up for dinner.

Inspect the public URL config option as follows:

juju config dex-auth public-url

What do you expect to see here? See if you can figure out what the response will be, then check the result below to confirm your suspicions.

Result

The config option will be the bad URL we inserted earlier: http://BAD_VALID_URL.

Think you can fix it yourself? Give it a try, and then check the fix below to see if you got it.

The fix...

Run:

juju config dex-auth public-url=http://10.64.140.43.nip.io

Then wait for the configuration change to propagate. Use juju status to keep an eye on things!
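If you’d rather let the shell confirm the fix, here is a minimal sketch. It assumes that juju config dex-auth public-url prints only the configured value, and check_public_url is a hypothetical helper name of our own, not something Juju ships.

```shell
# check_public_url is a hypothetical helper, not part of Juju.
# It compares the current dex-auth public-url with the value we expect.
check_public_url() {
    expected="$1"
    # Assumption: this command prints only the configured value.
    actual="$(juju config dex-auth public-url)"
    if [ "$actual" = "$expected" ]; then
        echo "public-url OK: $actual"
    else
        echo "public-url mismatch: got '$actual', expected '$expected'"
        return 1
    fi
}
```

For example, check_public_url http://10.64.140.43.nip.io should report OK once the fix is in. To follow the rollout without retyping the command, something like watch -n 5 juju status refreshes the view every five seconds, assuming the standard watch utility is installed.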

Cutting Relations

Now, let’s break one of our relations. First off, check out the existing relations by running this command:

juju status --relations

See all those connections? We’re going to cut one of them out of our system altogether. Run this command:

juju remove-relation kfp-db:database kfp-api:relational-db

Now, check the status: juju status.

You’ll see “executing” in the Agent column – that’s Juju’s way of saying, “Hold on, I’m processing that reckless thing you just did.”

Keep an eye on juju status until the charm stops “executing”. Eventually, our catastrophic change will send the Workload into a blocked state: eek!
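Tired of re-running juju status by hand? The polling can be sketched as a small loop. This is only a rough illustration: wait_for_state is a hypothetical helper name, and it does a crude word match on the status text instead of parsing the columns, so treat a hit as a hint and not as proof.

```shell
# wait_for_state is a hypothetical helper, not a Juju command.
# It polls `juju status <app>` until some line of the output contains <state>.
wait_for_state() {
    app="$1"
    state="$2"
    attempts="${3:-60}"   # how many polls before giving up
    interval="${4:-5}"    # seconds to sleep between polls
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # Crude check: match <state> as a whole word anywhere in the output.
        if juju status "$app" | grep -qw "$state"; then
            echo "$app reached $state"
            return 0
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    echo "timed out waiting for $app to reach $state" >&2
    return 1
}
```

Here, wait_for_state kfp-api blocked would return once the word blocked shows up in the status output for kfp-api. The same helper can wait for active after a fix has been applied.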

Go to the Kubeflow UI and dare to navigate to the pipelines tab. Yikes! Another error. But again, that’s what we’re here for.

Run juju status again. On the same row as the charm we unceremoniously blocked earlier, you’ll see a message: “Please add required database relation: eg. relational-db”. It’s like the system’s way of saying, “Hey, I noticed you unplugged something. Mind plugging it back in?”

Let’s Patch Things Up

Think you can figure out the fix yourself? Go on, give it a try! Check out the solution below when you’re ready.

Solution

Heeding Juju’s advice, run:

juju relate kfp-db:database kfp-api:relational-db

You’ll see kfp-db is “executing”. It’s working hard to mend the bridges we burned. Give it about 5 minutes, and voilà! The UI should be back to its shiny self.

Troubleshooting

You might encounter an error in the UI like this:

Error 1045: Access denied for user 'relation-18'@'kfp-api-0.kfp-api-endpoints.kubeflow.svc.cluster.local' (using password: YES)

Quick Fix: Let’s give it a gentle nudge:

microk8s kubectl delete pod kfp-api-0 -n kubeflow

Once Juju restarts the pod, all should be well.
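To nudge the pod and then wait for its replacement in one go, a sketch along these lines may help. restart_pod_and_wait is a hypothetical helper name, and the retry count and timeouts are arbitrary assumptions. The retry loop is there because the replacement pod might not exist for a moment right after the deletion, in which case kubectl wait fails straight away.

```shell
# restart_pod_and_wait is a hypothetical helper, not part of MicroK8s or Juju.
# It deletes a pod, then waits for a pod of the same name to become Ready.
restart_pod_and_wait() {
    pod="$1"
    ns="$2"
    microk8s kubectl delete pod "$pod" -n "$ns"
    i=0
    # Retry: the new pod may not exist yet when we first ask about it.
    while [ "$i" -lt 30 ]; do
        if microk8s kubectl wait --for=condition=Ready "pod/$pod" \
            -n "$ns" --timeout=60s; then
            return 0
        fi
        i=$((i + 1))
        sleep 2
    done
    echo "pod $pod in namespace $ns did not become Ready" >&2
    return 1
}
```

For the error above, that would be restart_pod_and_wait kfp-api-0 kubeflow.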

More info available on this GitHub Issue.

Get in Touch

Did you find this tutorial helpful? Painful? Both? We’d love to hear from you. Get in touch with us on Matrix.


Last updated 3 months ago.