profile for Gajendra D Ambi on Stack Exchange, a network of free, community-driven Q&A sites

Friday, October 16, 2020

Ceph on kubernetes by rook (rook-ceph) fails with its container doing crashloopbackoff

So all of a sudden one of the three Ceph nodes became NotReady in k8s, and the cluster started evicting pods, including of course the Ceph pods that need access to the local storage, which led to the Ceph storage malfunctioning. I tried everything from the GitHub issues of the rook-ceph Helm chart, but no go. Finally I installed the ceph-toolbox on k8s and ran "ceph status"; that apparently triggered the auto-repair tasks on Ceph, fixed the Ceph pods, and thus everything came back online. Strange. I hope rook-ceph also starts offering both filesystems from a single deployment on k8s.

Thursday, July 9, 2020

Google is wrong about managing multiple kubernetes configs

So here is the documentation about how you should manage multiple Kubernetes clusters from your workstation. I know that it is now mostly the CNCF that guides k8s, but Google 'baba' is still its father (or maybe mother, or both).
I was kind of confused when I first read this, not so much by the technicalities as by the suggestion that we should *merge* everything or keep one file for all clusters. In a world of compartmentalization, segregation of apps and services, microservices, and namespaces, where application or service distancing is as necessary as social distancing during the pandemic (read COVID-19) to minimize the spread, the Kubernetes suggestion to keep them all in one file is simply baffling. The disadvantages of the officially documented method are:

  1. It is one file for all. If that one file is corrupted, your access to all your clusters is in danger.
  2. A misconfigured master kubeconfig can lead to the right operations being performed on the wrong cluster, leading to disaster.
  3. Switching between clusters (contexts) is a loooong command.
  4. It is messy.
  5. It is hard to keep track of contexts once the number of clusters you manage grows.
Solution: a 20-year-old solution. Yes, Linux/Unix has a built-in one.
In my daily operations I use aliases.
Take a look at my ~/.bashrc file.

alias kl=kubectl
alias skl='sudo kubectl'
alias cici='export KUBECONFIG=/home/batman/.kube/configs/ramci'
alias cist='export KUBECONFIG=/home/batman/.kube/configs/ramstaging'
alias k8ra='export KUBECONFIG=/home/batman/.kube/configs/kube_config_rancher-cluster.yaml'
alias k8cs='export KUBECONFIG=/home/batman/.kube/configs/kube_config_k8s-devel-cluster.yaml'
When I type k8cs in my terminal, I switch to 'k8s cluster sanjose'; when I type k8st, I switch to 'k8s cluster staging'. You can even name them k8 plus any two-letter combination of the 26 letters of the alphabet.
With just 3 or 4 keystrokes I can switch between clusters. All kubeconfig files are separate from each other, so when a colleague wants access to a particular cluster, or its kubeconfig file, I can just pass it on without worrying about handing over a file that contains access to all clusters. I sometimes also append a command that prints the cluster-info after switching to that cluster.
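As a rough sketch of that last idea, the switch and the cluster-info call can be wrapped in one small function that the aliases then reuse. The function name kuse is made up for this example, and it assumes the kubeconfig files live under ~/.kube/configs as in the aliases above.

```shell
# Hypothetical helper: point KUBECONFIG at one file under ~/.kube/configs
# and show which cluster is now active.
kuse() {
    cfg="$HOME/.kube/configs/$1"
    if [ ! -f "$cfg" ]; then
        echo "no such kubeconfig: $cfg" >&2
        return 1
    fi
    export KUBECONFIG="$cfg"
    echo "KUBECONFIG=$KUBECONFIG"
    # cluster-info is only attempted when kubectl is installed
    if command -v kubectl >/dev/null 2>&1; then
        kubectl cluster-info || echo "cluster not reachable" >&2
    fi
}

# Same alias name as above, now printing cluster-info after the switch
alias k8cs='kuse kube_config_k8s-devel-cluster.yaml'
```

A missing file leaves the current KUBECONFIG untouched, so a typo in an alias cannot silently point you at nothing.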

Tuesday, June 23, 2020

Add kubernetes cluster to gitlab

  1. Create a serviceaccount in the kube-system namespace (let us say gitlab is the name of the serviceaccount).
  2. kubectl -n kube-system get sa gitlab -o yaml
  3. The output of step 2 shows the name of the token secret. Let us say it is gitlab-token-5g769; then run: kubectl -n kube-system get secret gitlab-token-5g769 -o yaml
  4. Base64-decode the ca.crt value and save it as ca.pem; you can use the openssl command line or some online converter.
  5. Base64-decode the token too.
  6. Now give the CA certificate (ca.pem) and the token values to GitLab when adding the k8s cluster.
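Steps 3 to 5 can also be done in one go on the command line. This is only a sketch: the function names are made up, the secret name gitlab-token-5g769 is the example name from step 3, and it assumes kubectl can reach the cluster. The decoded ca.crt is normally already PEM text, so saving it as ca.pem is usually all the conversion that is needed.

```shell
# Decode one base64-encoded value read from stdin.
decode_field() {
    base64 -d
}

# Hypothetical wrapper: fetch one field of the token secret and decode it.
# $1 is the key under .data, with dots escaped for jsonpath (e.g. 'ca\.crt').
gitlab_secret_field() {
    kubectl -n kube-system get secret gitlab-token-5g769 \
        -o "jsonpath={.data.$1}" | decode_field
}

# Usage against a real cluster (not run here):
#   gitlab_secret_field 'ca\.crt' > ca.pem
#   gitlab_secret_field token
```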

Wednesday, May 20, 2020

Preparing your kubernetes nodes

I often create, destroy, and rebuild k8s nodes, and I have noticed a few things that are a must. I personally use RKE to build, rebuild, and reconfigure these clusters, but irrespective of that the list should apply to you too.
Note: replace the username ambi with whatever you like. The following list is for Ubuntu 18+.
1. Create a new user called ambi and add it to the sudo group.
2. Log in as this user and install docker-ce by following Docker's official documentation. Add the above user to the docker group.
3. Make sure /etc/docker/certs.d/<>/<>.crt is present on all nodes.
4. sudo chown <newly created user from step 1> /etc/docker
5. If you have storageclasses, make sure one of them is the default.
kubectl patch storageclass <mystorageclassname> -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
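The user-related steps (1, 2 and 4) could be scripted roughly like this. It is a sketch, not what RKE requires: the run wrapper and the function name are made up, and by default it only prints the commands so you can review them before running anything on a node.

```shell
# Dry-run by default: print each command instead of executing it.
# Set DRY_RUN=0 to really run them (needs sudo rights on the node).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

prepare_node_user() {
    user="$1"
    run sudo adduser "$user"               # step 1: create the user
    run sudo usermod -aG sudo "$user"      # step 1: add it to the sudo group
    run sudo usermod -aG docker "$user"    # step 2: only after docker-ce is installed
    run sudo chown "$user" /etc/docker     # step 4
}

# Usage: prepare_node_user ambi
```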

Thursday, March 5, 2020

MetalLB load balancer for your Rancher 2 Kubernetes cluster

It is fairly simple to set up a Kubernetes bare-metal cluster with Rancher 2, which comes with all the bells and whistles. I however had a tough time figuring out why I couldn't get the MetalLB load balancer working.
First set up your Rancher 2 bare-metal cluster.
Then go to
After this, deploy MetalLB on your Kubernetes cluster as per
Then deploy a configmap with an IP pool.
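For reference, a minimal layer-2 pool in the ConfigMap format that MetalLB releases of this period read (newer releases, 0.13 and later, are configured through custom resources instead) might look like the sketch below. The pool name and the address range are placeholders, not values from my cluster; use free IPs from your own network.

```shell
# Write a MetalLB layer-2 address pool as a ConfigMap manifest.
# The address range is a placeholder; replace it with unused IPs on your LAN.
cat > metallb-config.yaml <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 192.168.1.240-192.168.1.250
EOF

# Apply it once MetalLB itself is deployed (not run here):
#   kubectl apply -f metallb-config.yaml
```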

kubectl create deployment nginx --image=nginx
kubectl expose deployment nginx --type=LoadBalancer --name=nginx-service

Now you will see an external IP from the MetalLB IP pool. If you delete the service, MetalLB will reclaim the IP.

Wednesday, January 29, 2020

Clustering RabbitMQ on ubuntu 18.x

I am exploring a highly available, multi-site, multi-network RabbitMQ setup for what it does best.
3 VMs on a KVM+kimchi virtualization setup.

sudo vim /etc/hosts
and update it with the details of all 3 nodes

192.x.y.z ub1
192.x.y.z ub2
192.x.y.z ub3

Install and Set Up RabbitMQ on All Nodes

Copy and paste the following on all 3 nodes.

sudo apt update
sudo apt upgrade
sudo apt install rabbitmq-server -y

sudo systemctl start rabbitmq-server
sudo systemctl enable rabbitmq-server


sudo rabbitmq-plugins enable rabbitmq_management
sudo systemctl restart rabbitmq-server


sudo ufw allow ssh
sudo ufw enable
sudo ufw allow 5672,15672,4369,25672/tcp
sudo ufw status


The Erlang cookie used by RabbitMQ must be the same on all hosts/nodes. We can copy it from node 1 to the others via scp, but first we need to change the permissions of the cookie on the other nodes so that we don't get a permission denied error. Run the command below on ub2 and ub3.

sudo chmod 777 /var/lib/rabbitmq/.erlang.cookie

Run the following on ub1

scp /var/lib/rabbitmq/.erlang.cookie root@ub2:/var/lib/rabbitmq/
scp /var/lib/rabbitmq/.erlang.cookie root@ub3:/var/lib/rabbitmq/

Change the permission on the cookie back so that it is accessible only by the owner. Run the following on ub2 and ub3.

sudo chmod 600 /var/lib/rabbitmq/.erlang.cookie
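Before joining the cluster it can be worth confirming that the copy worked, since the nodes will refuse to cluster when the cookies differ. A small sketch, with made-up function names, that compares two cookie files by checksum:

```shell
# Print the SHA-256 digest of a file (first field of the sha256sum output).
cookie_digest() {
    sha256sum "$1" | cut -d' ' -f1
}

# Succeed only if both files have the same digest.
same_cookie() {
    [ "$(cookie_digest "$1")" = "$(cookie_digest "$2")" ]
}

# On the real nodes (not run here), compare the digest printed by
#   sudo sha256sum /var/lib/rabbitmq/.erlang.cookie
# on ub1, ub2 and ub3; all three must be identical.
```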
Again on ub2 and ub3:

sudo systemctl restart rabbitmq-server
sudo rabbitmqctl stop_app
sudo rabbitmqctl join_cluster rabbit@ub1
sudo rabbitmqctl start_app

User Setup

Run the following on ub1. This creates an admin user and deletes the guest user.

sudo rabbitmqctl add_user admin admin
sudo rabbitmqctl set_user_tags admin administrator
sudo rabbitmqctl set_permissions -p / admin ".*" ".*" ".*"
sudo rabbitmqctl delete_user guest

Run the following on ub2 and ub3 to confirm that the clustering is working.

sudo rabbitmqctl cluster_status
sudo rabbitmqctl list_users

HA (Queue Mirroring)

sudo rabbitmqctl set_policy ha-all "." '{"ha-mode":"all"}'