profile for Gajendra D Ambi on Stack Exchange, a network of free, community-driven Q&A sites

Friday, October 16, 2020

Ceph on Kubernetes by Rook (rook-ceph) fails with its containers in CrashLoopBackOff

All of a sudden, one of the three Ceph nodes went NotReady in the k8s cluster, and Kubernetes started evicting its pods. That of course included the Ceph pods, which need access to the node's local storage, and this led to the Ceph storage malfunctioning. I tried everything from the GitHub issues for the rook-ceph Helm chart, but no go. Finally I installed the Ceph toolbox on k8s and ran "ceph status", and that seemed to trigger Ceph's auto-repair tasks, which fixed the Ceph pods, and everything came back online. Strange. I hope rook-ceph also starts offering both filesystems from a single deployment on k8s.
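For anyone who wants to repeat the "ceph status" step, here is a minimal sketch of running Ceph CLI commands through the toolbox from a script. It assumes `kubectl` is on the PATH and already pointed at the cluster, and that the toolbox was deployed with Rook's usual defaults: namespace `rook-ceph` and a deployment named `rook-ceph-tools`. Neither name comes from my cluster notes above, and the helper names `toolbox_command` and `run_in_toolbox` are made up for illustration, so adjust them to your setup.

```python
import subprocess

# Assumptions (not taken from the post): Rook's default namespace and the
# default toolbox deployment name. Change both to match your cluster.
NAMESPACE = "rook-ceph"
TOOLBOX = "deploy/rook-ceph-tools"


def toolbox_command(ceph_args, namespace=NAMESPACE, toolbox=TOOLBOX):
    """Build the kubectl argv that runs a ceph CLI command inside the toolbox pod."""
    return ["kubectl", "-n", namespace, "exec", toolbox, "--", "ceph", *ceph_args]


def run_in_toolbox(ceph_args, **kwargs):
    """Run the command and return its stdout; raises CalledProcessError on failure."""
    result = subprocess.run(
        toolbox_command(ceph_args, **kwargs),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

With that in place, `print(run_in_toolbox(["status"]))` would be the scripted equivalent of typing "ceph status" in the toolbox, and `run_in_toolbox(["health", "detail"])` or `run_in_toolbox(["osd", "tree"])` may help show which OSDs are down after a node goes NotReady.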