krawl the cluster for errors and events

June 2, 2020

When you work on Kubernetes clusters, it isn’t the tool’s inherent complexity that worries you most. It is what comes along with it: the trouble of running multiple commands to track down an issue while the application is still in development.

Of course, you can use aliases to speed things up a bit and avoid typing each command in full. You still have to find the exact cause of the issue, and unless the error is obvious, that takes time.

Let’s assume you deployed an application and you see that no pods have been spun up yet. How would you approach this issue?

First, you list the pods for your application.

kubectl get pods -n myapp_namespace

Next, you list the deployment object for your application, and here you see that the deployment has not scaled up.

kubectl get deploy -n myapp_namespace

Finally, you list and describe the replica set, and only then do you see the actual issue: the replicas can’t be scaled up because no node matches the constraint.

kubectl get rs -n myapp_namespace

kubectl describe rs myapp_rs -n myapp_namespace
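The manual walk above can be collapsed into a single helper. This is only a sketch: the function name `inspect_namespace` is hypothetical, the namespace is passed as its first argument, and it describes every replica set in the namespace rather than one named replica set.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of krawl): run the manual steps in order
# for one namespace.
inspect_namespace() {
  local ns="$1"

  echo "== pods in ${ns} =="
  kubectl get pods -n "${ns}"

  echo "== deployments in ${ns} =="
  kubectl get deploy -n "${ns}"

  echo "== replica sets in ${ns} =="
  kubectl get rs -n "${ns}"

  # With no resource name, describe covers every replica set in the
  # namespace; the Events section of each is where the cause tends to show.
  echo "== replica set details in ${ns} =="
  kubectl describe rs -n "${ns}"
}
```

Calling `inspect_namespace myapp_namespace` would print the four views one after another, which saves typing but still leaves you reading through all of the output.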

Issues of this kind can be very easy to solve, but tracking down the simple cause is what eats up the time.

To mitigate this problem, I wrote a simple bash utility that scans every namespace and finds all pods reporting ‘Error’, along with erroneous events of any sort, to ease troubleshooting.
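To make the idea concrete, here is a minimal sketch of such a scan. It is not the actual krawl source. The function names are hypothetical, and it assumes that a "failing" pod is any pod whose STATUS column is neither Running nor Completed, and that "erroneous events" means events of type Warning.

```shell
#!/usr/bin/env bash
# Sketch only: illustrates the scanning idea, not krawl's real code.

# Print name and status of pods that are neither Running nor Completed.
# Column 3 of `kubectl get pods --no-headers` is STATUS.
failing_pods() {
  local ns="$1"
  kubectl get pods -n "${ns}" --no-headers 2>/dev/null |
    awk '$3 != "Running" && $3 != "Completed" { print $1 "\t" $3 }'
}

# Print Warning-type events recorded in a namespace.
warning_events() {
  local ns="$1"
  kubectl get events -n "${ns}" --field-selector type=Warning --no-headers 2>/dev/null
}

# Walk every namespace and report only those that have something to show.
scan_cluster() {
  local ns pods events
  for ns in $(kubectl get namespaces -o name | sed 's|^namespace/||'); do
    pods="$(failing_pods "${ns}")"
    events="$(warning_events "${ns}")"
    if [ -n "${pods}" ] || [ -n "${events}" ]; then
      echo "### ${ns}"
      if [ -n "${pods}" ]; then printf '%s\n' "${pods}"; fi
      if [ -n "${events}" ]; then printf '%s\n' "${events}"; fi
    fi
  done
}
```

Running `scan_cluster` prints one section per namespace that has problems and stays silent for healthy ones, so a scheduling failure like the one above would surface without having to describe each object by hand.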

Below is an excerpt of krawl’s output that shows how it behaves when it encounters issues.

[Screenshot: krawl output]

The script can also be used as a kubectl plugin, as described by the community documentation here.
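As a rough illustration of the plugin route: kubectl treats any executable on your PATH named `kubectl-<name>` as a plugin called `<name>`. The helper below is a sketch under assumptions: the function name `install_as_plugin`, the script path you pass in, and the default target directory `$HOME/.local/bin` are all hypothetical choices, not something krawl prescribes.

```shell
#!/usr/bin/env bash
# Hypothetical install helper: copies a script onto PATH under the
# kubectl-<name> naming convention so that `kubectl krawl` can find it.
install_as_plugin() {
  local script="$1"                        # path to the krawl script (assumed)
  local bin_dir="${2:-$HOME/.local/bin}"   # assumed target; must be on PATH

  mkdir -p "${bin_dir}"
  cp "${script}" "${bin_dir}/kubectl-krawl"
  chmod +x "${bin_dir}/kubectl-krawl"
}
```

After something like `install_as_plugin ./krawl`, the tool should be callable as `kubectl krawl`, and `kubectl plugin list` should show it, provided the target directory is on your PATH.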