profile for Gajendra D Ambi on Stack Exchange, a network of free, community-driven Q&A sites

Friday, May 26, 2023

Postgresql on cloud, self or hosted, if self then monolithic or k8s? which k8s operator?

 So I have been planning to move my django sites to kubernetes to get that automated horizontal and vertical scaling, failover, high availability, fault tolerance and other good stuff. Database on kubernetes has been a very huge topic of , to do or not to do...yes a year back, I would not have recommended it. Operators were just kicking in. Now many seem to have matured everyone has their own querks and problems.

Why not hosted?

There are many DBaaS. All cloud providers provide one. Many provide only DBaaS such as elephantSQL. What I am afraid about is freedom, loss of freedom. We all know what happened to parler.com, they got just booted off their cloud account, data gone, applications gone. One day their account just got locked out. So all these DBaaS providers do not offer replication, backup to intracloud target (that is, having a standby node on another competing cloud provider, or your own on prem server or an object storage from some other provider or a NAS server at home etc.,). Sure they provide headache free, automated, regular backup, PITR - point in time recovery, base backup, incremental backup, differential backup, standy node etc., You dont have to have a team of DB engineers who maintain it and do all these day 2 operations. You remember what happened to gitlab dont you? Engineer accidentally deleted their entire primary postgresql database. Yes, they did and they found out that the backups that they had were no good.  So you are outsourcing all your headache to someone else. This is good, but not enough.

I want to be able to have my backup or replication or DR solution on some other site, some other cloud provider or a target of my choice. 

Self hosted: Monolithic or k8s

If you go monolithic then I highly recomment citus. It is an extension, which allows you to scale, shard at will and it is a clustering solution so you get the HA. There are of course the OG solutions like perconi, patroni, EDB, many paid, free and opensource solutions. You do however have to learn all these implentations, do a POC in house, verify all DR options, test HA, FT before going for any of them.

Self hosted: k8s operator

I considered and evaluated

crunchydata

zalando

kubedb

stackgres

percona 

but went with cnpg aka cloud native postgresql operator. Why? because of the first 2 words. Cloud native. Everyone else is trying to adopt what has been tested in the monolithic world to kubernetes world to get that autoscaling, inbuild HA, FT, auto recovery options from k8s but they are built for cloud or k8s, they are adopted to cloud/k8s. cnpg is built for k8s. Also their documentation was a lot better than any of the top above,udpated and their example deployment manifests per different options, just work. 

So wish me luck. I hope all my django apps stay afloat and sail smoothly because their DB is sailing smoothly.