krawl the cluster for errors and events

When you work on Kubernetes clusters, it is not the tool’s inherent complexity that worries you most. It is what comes along with it: the trouble of running multiple commands to track down an issue while the application is still in development.

Of course, you can use aliases to speed things up a bit and avoid typing the complete command yourself. You still have to find the exact cause of the issue, and that takes time unless the error is an obvious one.

Let’s assume you deployed an application and you see that no pods have been spun up yet. How would you go about this issue?

  1. You listed the pods for your application.

    kubectl get pods -n myapp_namespace

  2. You listed the deployment object for your application; here you see that the deployment has not scaled up.

    kubectl get deploy -n myapp_namespace

  3. You listed and described the replica set, and then you saw the actual issue: the replicas can’t be scaled because no nodes match the constraint.

    kubectl get rs -n myapp_namespace

    kubectl describe rs myapp_rs -n myapp_namespace
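
The manual steps above can be collected into one helper. This is only a minimal sketch: the function name inspect_namespace is hypothetical, and the namespace is taken as an argument rather than hard-coded. It saves typing, but you still have to read through the output yourself.

```shell
# Hypothetical helper: walk one namespace from pods down to replica-set events.
# Usage: inspect_namespace <namespace>
inspect_namespace() {
  local ns="$1"
  if [ -z "$ns" ]; then
    echo "usage: inspect_namespace <namespace>" >&2
    return 1
  fi
  kubectl get pods -n "$ns"
  kubectl get deploy -n "$ns"
  kubectl get rs -n "$ns"
  # With no name given, describe covers every replica set in the namespace;
  # scheduling failures show up in the Events section of each one.
  kubectl describe rs -n "$ns"
}
```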

Issues of this kind can be very easy to solve, but it takes time to track down that simple cause, time that could have been saved.

To mitigate this problem, I wrote a simple bash utility that scans through every namespace and finds all pods throwing ‘Error’, along with erroneous events of various sorts, to ease troubleshooting.
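
To give a rough idea of what such a scan involves, here is a minimal sketch. It is not the krawl script itself: the function names are hypothetical, and treating every pod whose status is neither Running nor Completed as a problem is an assumption made for illustration.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; this is not the krawl script itself.

# Print pods in any namespace whose STATUS is neither Running nor Completed.
# Columns of `kubectl get pods --all-namespaces --no-headers`:
# NAMESPACE NAME READY STATUS RESTARTS AGE
list_unhealthy_pods() {
  kubectl get pods --all-namespaces --no-headers 2>/dev/null |
    awk '$4 != "Running" && $4 != "Completed" { printf "%s/%s: %s\n", $1, $2, $4 }'
}

# Print Warning events from every namespace, sorted by last timestamp.
list_warning_events() {
  kubectl get events --all-namespaces --field-selector type=Warning \
    --sort-by=.lastTimestamp
}

# Hypothetical entry point: report both in one pass.
scan_cluster() {
  echo "== Pods not Running/Completed =="
  list_unhealthy_pods
  echo "== Warning events =="
  list_warning_events
}
```

Calling scan_cluster prints the suspicious pods first and the Warning events after them, so a scheduling failure like the one in the example above would appear in the second section.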

Below is an excerpt of krawl’s output that shows how it behaves when it encounters issues.

krawl

One extended use of this script is that it can be used as a kubectl plugin, as described in the community documentation here.
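
kubectl treats any executable on your PATH whose name starts with kubectl- as a plugin. The sketch below copies a script into place under such a name. The function name install_as_plugin, the plugin name kubectl-krawl, and the default destination directory are assumptions made for illustration, not something taken from the script’s own instructions.

```shell
# Hypothetical installer: copy a script into a directory as kubectl-krawl.
# Usage: install_as_plugin <path-to-script> [dest-dir]
install_as_plugin() {
  local src="$1"
  local dest_dir="${2:-$HOME/.local/bin}"   # assumed default; must be on PATH
  if [ ! -f "$src" ]; then
    echo "usage: install_as_plugin <path-to-script> [dest-dir]" >&2
    return 1
  fi
  mkdir -p "$dest_dir" &&
    cp "$src" "$dest_dir/kubectl-krawl" &&
    chmod +x "$dest_dir/kubectl-krawl"
}
```

Once the destination directory is on your PATH, the script can be invoked as kubectl krawl, and kubectl plugin list should show it.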

A more detailed article on the contents of this script is also published at opensource.com. The original script can be found at krawl.