←back to thread

264 points davidgomes | 1 comments | | HN request time: 0s | source
Show context
noirscape ◴[] No.41877487[source]
Because the actual process of upgrading Postgres is terrible.

I say this as someone who absolutely loves using it, but the actual process of upgrading Postgres is something that takes significant downtime, is error-prone and you're often better off just dumping all the databases and reimporting them in a new folder. (A good idea in general since it'll vacuum and compact indexes as well if you do it, combining a few maintenance routines in your upgrade path.)

It requires having the previous version of Postgres installed, something which can mess with a number of distro policies (not to mention docker, which is the most popular way to deploy software that will typically rely on Postgres), and unlike most software with that issue, Postgres is software you want to be managed by your distro.

Therefore, most people only upgrade by necessity - when their distro forces the upgrade or the version they're using reaches EOL.

replies(11): >>41877841 #>>41877921 #>>41877992 #>>41878101 #>>41878462 #>>41878670 #>>41879013 #>>41879161 #>>41879191 #>>41879259 #>>41879567 #
MichaelZuo ◴[] No.41877841[source]
So the real question is, why is the upgrade process so incompetently designed, and why has no one fixed this?
replies(5): >>41877898 #>>41877902 #>>41877926 #>>41878252 #>>41878442 #
phil21 ◴[] No.41877902[source]
My opinion is Postgres was designed by software developers for software developers. The split on “which relational database to use” in my career has almost always been perfectly split between SWE vehemently demanding pgsql for the feature set, and the sysadmins having to support maintenance and production availability preferring MySQL.

One of the few things I’ve enjoyed with the move into devops and companies forcing previously “pure” developers into operational roles was their discovery that Postgres was utterly horrible to administer at a systems level. Apparently us lowly sysadmins may have had a point after all.

This is a bit tongue in cheek but really not far from my lived reality. When the focus is on features and “correctness” at the near total expense of sane systems tooling folks can develop some myopia on the subject. So many arguments with devs on my teams over this subject that were utterly horrified to find we were running MySQL for a given service.

Open source projects tend to fix the pain points its contributors experience, and I assume there were not too many contributors wanting to deal with the boring work of making administration and easy task - it’s thankless “sideways” work that won’t result in many accolades or personal satisfaction for most SWEs.

The end users are almost always developers, most of whose experiences in production entail either the equivalent of a docker container level scale system, or are simply given a connection string and the rest is a black box to them. Under those contexts I’d personally prefer Postgres as well and it wouldn’t even be close. When you get into backups, clustering, upgrades, and high availability under extreme load? IMO the story falls apart real fast.

replies(5): >>41878214 #>>41878540 #>>41878590 #>>41878653 #>>41881690 #
sgarland ◴[] No.41878653[source]
> The split on “which relational database to use” in my career has almost always been perfectly split between SWE vehemently demanding pgsql for the feature set

I’ve seen this as well, but when pressed, none of them could articulate what part of its feature set they actually needed to use.

> One of the few things I’ve enjoyed with the move into devops and companies forcing previously “pure” developers into operational roles was their discovery that Postgres was utterly horrible to administer at a systems level.

Are you (or your devs, rather) actually running your own DBs? If so, respect. My experience has been that they spin up either a hideously under or over-provisioned RDS or Aurora instance, and then never touch it until it breaks, at which point they might ask for help, or they might just make it bigger.

replies(3): >>41878860 #>>41879564 #>>41879577 #
dalyons ◴[] No.41878860[source]
> My experience has been that they spin up either a hideously under or over-provisioned RDS or Aurora instance, and then never touch it until it breaks, at which point they might ask for help, or they might just make it bigger.

Yep that’s exactly what I’ve seen too :). I still overall prefer this distributed database model - yes you spend more and people make mistakes (and learn). But if you can afford it you get higher velocity and more incentive aligned ownership than the old central gate keeping DBA team model.

replies(2): >>41878987 #>>41883630 #
ttfkam ◴[] No.41883630{3}[source]
> the old central gate keeping DBA team model

I have mixed feelings about this. On the one hand I agree that ownership should be shared. On the other, app developers really don't consider their data structures as carefully in SQL as they do in-memory. It's odd. The right data structure matters more than a good algorithm since algorithms are easier to change. Once you settle on a list vs a set vs a queue, you're stuck once code is built around it.

The same is doubly true for the database schema. Lack of planning and knowledge of expected access patterns can turn an otherwise fast database to mud in no time flat. Once your data is in there, changing the schema is exponentially harder.

"I’m a huge proponent of designing your code around the data, rather than the other way around, and I think it’s one of the reasons git has been fairly successful… I will, in fact, claim that the difference between a bad programmer and a good one is whether he considers his code or his data structures more important. Bad programmers worry about the code. Good programmers worry about data structures and their relationships." – Linus Torvalds (2006)

What is your database but a bunch of data structures and relationships? I get why the gatekeeping occurred. I don't agree with it, but I understand it. Far too many folks consider expertise in data stores to be optional as developers.

replies(2): >>41887467 #>>41888291 #
1. dalyons ◴[] No.41888291{4}[source]
I agree but also I think I could be clearer about the key advantage of the distributed many dbs model… it’s that I don’t have to care if people are good at anything that you said :). If teams want to do dumb things with their schema, fine, that’s on them and they’ll have to deal with the consequences. If it matters, they’ll learn and get better. If they want to buy their way out of it with their latency and infra budget instead, that’s honestly fine too.

With many smaller databases owned separately, the blast radius of bad db decisions is small/local.