264 points davidgomes | 101 comments
1. noirscape ◴[] No.41877487[source]
Because the actual process of upgrading Postgres is terrible.

I say this as someone who absolutely loves using it, but the actual process of upgrading Postgres is something that takes significant downtime, is error-prone and you're often better off just dumping all the databases and reimporting them in a new folder. (A good idea in general since it'll vacuum and compact indexes as well if you do it, combining a few maintenance routines in your upgrade path.)

It requires having the previous version of Postgres installed, something which can mess with a number of distro policies (not to mention docker, which is the most popular way to deploy software that will typically rely on Postgres), and unlike most software with that issue, Postgres is software you want to be managed by your distro.

Therefore, most people only upgrade by necessity - when their distro forces the upgrade or the version they're using reaches EOL.

replies(11): >>41877841 #>>41877921 #>>41877992 #>>41878101 #>>41878462 #>>41878670 #>>41879013 #>>41879161 #>>41879191 #>>41879259 #>>41879567 #
2. MichaelZuo ◴[] No.41877841[source]
So the real question is, why is the upgrade process so incompetently designed, and why has no one fixed this?
replies(5): >>41877898 #>>41877902 #>>41877926 #>>41878252 #>>41878442 #
3. throwaway19972 ◴[] No.41877898[source]
There are many answers to this question and few are as satisfying as the question implies.
4. phil21 ◴[] No.41877902[source]
My opinion is Postgres was designed by software developers for software developers. The split on “which relational database to use” in my career has almost always been perfectly split between SWE vehemently demanding pgsql for the feature set, and the sysadmins having to support maintenance and production availability preferring MySQL.

One of the few things I’ve enjoyed with the move into devops and companies forcing previously “pure” developers into operational roles was their discovery that Postgres was utterly horrible to administer at a systems level. Apparently us lowly sysadmins may have had a point after all.

This is a bit tongue in cheek but really not far from my lived reality. When the focus is on features and “correctness” at the near total expense of sane systems tooling, folks can develop some myopia on the subject. So many arguments with devs on my teams over this subject who were utterly horrified to find we were running MySQL for a given service.

Open source projects tend to fix the pain points their contributors experience, and I assume there were not too many contributors wanting to deal with the boring work of making administration an easy task - it’s thankless “sideways” work that won’t result in many accolades or personal satisfaction for most SWEs.

The end users are almost always developers, most of whose experiences in production entail either the equivalent of a docker container level scale system, or are simply given a connection string and the rest is a black box to them. Under those contexts I’d personally prefer Postgres as well and it wouldn’t even be close. When you get into backups, clustering, upgrades, and high availability under extreme load? IMO the story falls apart real fast.

replies(5): >>41878214 #>>41878540 #>>41878590 #>>41878653 #>>41881690 #
5. tgv ◴[] No.41877921[source]
Add to that: if it works, don't fix it.
replies(1): >>41879585 #
6. WJW ◴[] No.41877926[source]
The answer to both is probably "because maintainer time is scarce and nobody is interested enough".
7. crabbone ◴[] No.41877992[source]
> Postgres is software you want to be managed by your distro.

Why?

For production systems I generally try to avoid using anything not built internally. When it comes to PostgreSQL, for example, if you want to submit benchmarks, or you want to report bugs, an important or sometimes mandatory component is to provide compilation flags. Also, seeing how every kind of storage can act in a different way, tuning of higher-level storage programs seems inevitable. And even though PostgreSQL has a lot of settings that can be configured at runtime, some (e.g. block size) are compile-time options that can only be changed by recompiling the program.

replies(3): >>41878275 #>>41878277 #>>41880583 #
8. CrimsonRain ◴[] No.41878101[source]
I fully agree. That's why I use this: https://github.com/pgautoupgrade/docker-pgautoupgrade

But obviously this is not suitable for all use cases. I don't know why pg doesn't invest in this sector.

replies(2): >>41881649 #>>41884461 #
9. mxey ◴[] No.41878214{3}[source]
As someone who operates both I much prefer to run pg_upgrade every few years rather than dealing with mysqldump, replication issues and babysitting Orchestrator.
replies(1): >>41878946 #
10. whizzter ◴[] No.41878252[source]
I think one really BIG factor is that built-in logical replication wasn't introduced until PostgreSQL 10 in 2017. Before that you only had physical replication for master-slave, and iirc that didn't work between versions, so doing a "hot" upgrade was more or less impossible without third-party tools.

So even if it's available these days, the number of people still subjected to upgrades from older versions leaves an impression that it's really bad.

replies(2): >>41878641 #>>41879287 #
11. digitallis42 ◴[] No.41878275[source]
Depends on your scale. If you're a startup or even just a small side service, performance isn't going to be a bottleneck and you/the org wants the thing to be fire and forget, including bug fixes and especially security patches. A distro takes care of all of those generally and makes sure the dependencies have the same care taken.
replies(1): >>41878437 #
12. throw0101c ◴[] No.41878437{3}[source]
> A distro takes care of all of those generally and makes sure the dependencies have the same care taken.

So does using the vendor repos:

* https://apt.postgresql.org/ / https://yum.postgresql.org

* https://dev.mysql.com/downloads/repo/

This way when you upgrade your OS you don't have to worry about suddenly getting a new version (which, in the case of MySQL, may take a long time to convert its database files to the new version format).

replies(4): >>41878593 #>>41879231 #>>41880330 #>>41890604 #
13. jeltz ◴[] No.41878442[source]
To avoid having to slow down development of new PostgreSQL features. Improving upgrades in a way where PostgreSQL does not need to either maintain multiple different versions of parts of the code and/or lock down internal interfaces (which can currently change freely every major version, so they could no longer be refactored and improved) is not a trivial task, maybe even impossible. Even just the existence of pg_upgrade has to some degree limited what can be done to improve PostgreSQL. Obviously pg_upgrade is worth it, but hindering development even further might not be popular.

The PostgreSQL team simply does not have the resources to do this. At least not without significantly slowing down development of everything else which there is no political will for. Maybe someone will come up with a genius idea which solves this but I am doubtful. Usually there is no free lunch.

Maybe some core dev will correct me, I am quite familiar with the PostgreSQL project but not one of the core devs. :)

replies(1): >>41878826 #
14. jeltz ◴[] No.41878462[source]
While I partially agree the solution is pretty simple to the multiple versions problem: use the official RPM and APT repos.

https://apt.postgresql.org/

https://yum.postgresql.org/

15. jeltz ◴[] No.41878540{3}[source]
That has not been my experience at all. The sysadmins I have worked with have strongly preferred PostgreSQL over running MySQL, while the developers have cared, but less so. The reason is that when something goes wrong PostgreSQL is much easier to diagnose. So while some tasks like upgrades take less manual effort with MySQL, PostgreSQL is more stable and gives better error messages when you get paged in the middle of the night.

PostgreSQL has prioritized correctness and stability, which, while it has made certain features clunkier to use or slower to arrive, pays you back the time you spent when you get paged in the middle of the night. PostgreSQL has also spent a lot of time on improving the DBA experience, so mixed DBA/sysadmins also usually prefer PG.

replies(1): >>41890264 #
16. eru ◴[] No.41878590{3}[source]
Of course, sqlite is even easier on the sys-admins, (but not necessarily the right tool for the job.)
replies(1): >>41878750 #
17. jamespo ◴[] No.41878593{4}[source]
How does that work for python/perl/ruby libs etc?
replies(1): >>41881646 #
18. creshal ◴[] No.41878641{3}[source]
Yeah, everywhere I worked there's usually two clusters of postgres databases

- whatever made it past the postgres 10 hump is on the most recent version and keeps getting migrated

- whatever is still on 9.x is scheduled to remain that way until the product using it will be EOL'd

19. sgarland ◴[] No.41878653{3}[source]
> The split on “which relational database to use” in my career has almost always been perfectly split between SWE vehemently demanding pgsql for the feature set

I’ve seen this as well, but when pressed, none of them could articulate what part of its feature set they actually needed to use.

> One of the few things I’ve enjoyed with the move into devops and companies forcing previously “pure” developers into operational roles was their discovery that Postgres was utterly horrible to administer at a systems level.

Are you (or your devs, rather) actually running your own DBs? If so, respect. My experience has been that they spin up either a hideously under or over-provisioned RDS or Aurora instance, and then never touch it until it breaks, at which point they might ask for help, or they might just make it bigger.

replies(3): >>41878860 #>>41879564 #>>41879577 #
20. forinti ◴[] No.41878670[source]
I have always compiled from source so that I can have more than one version in place. The installation takes less than 1GB and is quite easy and fast to compile and I've never had any issues with upgrades. If you forget to compile an extension, you can do it later, copy the .so to the right folder, and you don't even have to restart the cluster.

If you don't want to use pg_upgrade, you can dump from one cluster and pipe directly into another, with no need for a temporary file. It couldn't be easier.

Can't say the same for Oracle. Even applying a patch can ruin several weekends.

replies(2): >>41878915 #>>41880421 #
21. gramakri2 ◴[] No.41878750{4}[source]
How does one back up a sqlite database without stopping the app or disrupting service? I couldn't find a simple answer to this
replies(3): >>41878854 #>>41879410 #>>41880876 #
22. paulryanrogers ◴[] No.41878826{3}[source]
What is this holding back? A redo based alternative to MVCC?
replies(1): >>41879243 #
23. tommy_axle ◴[] No.41878854{5}[source]
With sqlite3's .backup command. There's also an online backup API.
24. dalyons ◴[] No.41878860{4}[source]
> My experience has been that they spin up either a hideously under or over-provisioned RDS or Aurora instance, and then never touch it until it breaks, at which point they might ask for help, or they might just make it bigger.

Yep that’s exactly what I’ve seen too :). I still overall prefer this distributed database model - yes you spend more and people make mistakes (and learn). But if you can afford it you get higher velocity and more incentive aligned ownership than the old central gate keeping DBA team model.

replies(2): >>41878987 #>>41883630 #
25. literalAardvark ◴[] No.41878915[source]
> It couldn't be easier

If you have a trivial database, yes.

If you have a large, complicated database that takes hours/days to restore and has an HA SLA, very hell no.

replies(2): >>41879032 #>>41879224 #
26. literalAardvark ◴[] No.41878946{4}[source]
just as long as you don't forget to reindex after
replies(2): >>41879466 #>>41880017 #
27. sgarland ◴[] No.41878987{5}[source]
I don’t mind the model IFF the team has interest in learning how to do it correctly. My biggest complaint as both an SRE and now DBRE has been that dev-managed infrastructure inevitably means during an incident that I had nothing to do with, I’ll be paged to fix it anyway. Actually, that’s not the problem; the problem is later when I explain precisely how and why it broke, and how to avoid it in the future, there’s rarely any interest in doing so.

“You have an unpartitioned table with a few billion rows and a UUIDv4 PK. I’m amazed it’s been working as long as it has. You need to change your schema.”

“Or we can just buy a bigger instance.”

“…”

Rinse and repeat six months later. I’m aware this is an organizational problem, but from what I’ve seen, it’s endemic.

Re: incentives, yes, also a problem. Dev teams are generally incentivized by Product, who doesn’t care at all whether or not something is optimal, only that new features are being shipped. I despise this mentality, but it’s not usually the devs’ fault.

replies(2): >>41879210 #>>41879447 #
28. kardianos ◴[] No.41879013[source]
There are two primary issues with PostgreSQL:

* Upgrades

* Database Protocol

MS SQL Server does better on both of these fronts. Upgrades: when running MS SQL Server in a container, upgrading is just restarting the database with the new database engine version. The DBMS sees the old files, upgrades them in place and starts the DB completely seamlessly 100% of the time.

The MS SQL Server protocol, TDS, supports native named parameters and executes the T-SQL PL inline with SQL. TDS also supports inline query cancellation because the protocol is framed, and both T-SQL and TDS support returning multiple result sets from a single query.

But yes, whenever I see PG docs saying, major versions require an export and re-load, I shudder.

replies(1): >>41880033 #
29. ◴[] No.41879032{3}[source]
30. ktosobcy ◴[] No.41879161[source]
This! I already mentioned this a while back and was basically berated (by someone that seemed like a dev/close to a dev) that the current setup is just fine and because of Postgres's complexity (extensions) it has to be done that way... and while I like Postgres a lot it's quite annoying that the upgrade is such a terrible experience... :|
replies(1): >>41885532 #
31. thomaslord ◴[] No.41879191[source]
I'll confess - I have a project that uses Heroku's managed Postgres and my preferred upgrade method is to set the maintenance window to the middle of the night, create a backup, and be awake at 1am to make sure that nothing is broken after they force the upgrade. Their auto-upgrade process hasn't failed me so far, but there's no way to manually trigger it.
replies(1): >>41885684 #
32. necovek ◴[] No.41879210{6}[source]
UUID (version does not matter for storage, only for generation and distribution) is basically a 128-bit unsigned int, so a double "word" on 64-bit platforms, and it's natively supported by Postgres since at least 8.3 (earliest version with docs up).

While most versions ensure it's random, there are plenty of indexing algorithms that make searching through that quick and close to O(1), so that should not be the schema problem.

Unless you used a string field, but there is a quick workaround for that with Postgres too (make an index on `UUID(field)`, and look it up by `UUID(value) = UUID(field)`).
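
A minimal sketch of that workaround, assuming a hypothetical orders table whose text column ext_id holds valid UUID strings (Postgres spells the conversion as a cast rather than a UUID(...) call):

        -- expression index over the cast so lookups by uuid can use it
        CREATE INDEX orders_ext_id_uuid_idx ON orders ((ext_id::uuid));

        -- cast both sides to uuid; the planner can use the index above
        SELECT * FROM orders
         WHERE ext_id::uuid = '018f1a2b-3c4d-7e5f-8a9b-0c1d2e3f4a5b'::uuid;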

That's why both "devops" (DBAs?) and "devs" prefer Postgres over many other databases: you can easily handle some small mistakes in schemas too while you do the full migration in parallel.

replies(1): >>41879419 #
33. forinti ◴[] No.41879224{3}[source]
That's not going to be easy with any database.
replies(3): >>41879571 #>>41881588 #>>41887428 #
34. necovek ◴[] No.41879231{4}[source]
> This way when you upgrade your OS you don't have to worry about suddenly getting a new version (which, in the case of MySQL, may take a long time to convert its database files to the new version format).

At least on Ubuntu (and likely Debian), your existing Postgres version is never dropped until you are ready to manually upgrade yourself.

35. jeltz ◴[] No.41879243{4}[source]
Nope, that is totally unrelated. To support upgrade in place without an old version of PostgreSQL:

1. The new version of PostgreSQL would need to able to read all old catalog table formats and migrate them.

2. The new version of PostgreSQL would need to support all old versions of the parse tree to migrate views.

3. Likely a bunch of more things that I do not know of. I for example doubt it is trivial to just read an old catalog without having a fully up and running cluster which supports almost everything in that old cluster. The catalog has TOAST tables and indexes for example.

Right now 1 and 2 are implemented in pg_dump plus by having pg_dump call functions in a running old version of PostgreSQL.

replies(2): >>41879618 #>>41879746 #
36. anymouse123456 ◴[] No.41879259[source]
Also a fan of Postgresql, but compiling and installing this tool is also a complete nightmare.

Every bit of the experience before getting it working feels like it was designed by an MBA who owns a Postgres hosting service.

Other than that, it's amazing.

37. RegnisGnaw ◴[] No.41879287{3}[source]
There is still major issues with logical replication, mainly limited feature support.
replies(1): >>41888427 #
38. zaphar ◴[] No.41879410{5}[source]
The typical solution is https://litestream.io/. Comparatively easy to setup for most usages of sqlite.
39. sgarland ◴[] No.41879419{7}[source]
Who said this was Postgres? MySQL (with the default InnoDB engine) and MSSQL both use clustered indexes; they store tuples around the PK. For a UUIDv4 PK – or anything else non-k-sortable, for that matter – this results in a massive amount of B+tree bloat from the random inserts.

But sure, let’s talk about Postgres. After all, it stores tuples in a heap, and so is immune to this behavior.

Except that its MVCC implementation means that it has to maintain a Visibility Map [0] to keep track of which pages contain only tuples which are visible to every active transaction. This is nominally used for vacuum decisions, but is also cleverly exploited for use with index-only scans. This poses a problem when referencing non-k-sortable entries, because while the tuples are in a heap, the indices are generally a B+tree. So now we’re back to the problem of massive IO amplification. Even if the VM and index are entirely cached, reads are reads, and they add up.

Then there’s the issue of WAL bloat due to full page writes. tl;dr Postgres pages are nominally 8 KiB, and storage device pages are nominally 4 KiB. To guarantee an atomic write, Postgres writes the entire page for a given tuple for the first write after checkpoint, regardless of how many bytes were altered. Again, non-k-sortable: if your tuples are spread randomly across pages, you now have far more pages to write. This can and does matter for either write-heavy workloads, instances with limited network bandwidth (oh hi small RDS / Aurora), or the worst, both combined.

Re: search complexity, I’m pretty sure B+trees (and B-trees) have O(log n) time complexity, not O(1). Whether or not that’s “close” depends on n, I suppose, but in this scenario I specifically said “billions of rows.”

> That's why both "devops" (DBAs?) and "devs" prefer Postgres over many other databases

I’m a DBRE, and like both MySQL and Postgres. They both have strengths and weaknesses, but you need to deeply understand those – and properly design your schema and query patterns around them – in order to make an informed decision.

[0]: https://www.postgresql.org/docs/current/storage-vm.html

replies(1): >>41883681 #
40. mschuster91 ◴[] No.41879447{6}[source]
> Rinse and repeat six months later. I’m aware this is an organizational problem, but from what I’ve seen, it’s endemic.

Easy enough: almost no one writes SQL queries by hand these days, not for querying the database nor for doing schema upgrades. It's all done by tools - Doctrine in the PHP world for example. And pretty much no one but actual CS graduates knows anything deeper about databases.

Result is, devs are happy enough if they found something that works, and don't want to risk being the one who broke prod because they applied some schema change suggested by their DBA who doesn't know some random thing about the application.

replies(2): >>41880131 #>>41883719 #
41. mxey ◴[] No.41879466{5}[source]
> All failure, rebuild, and reindex cases will be reported by pg_upgrade if they affect your installation; post-upgrade scripts to rebuild tables and indexes will be generated automatically.
replies(1): >>41882680 #
42. ttfkam ◴[] No.41879564{4}[source]
> none of them could articulate what part of its feature set they actually needed to use.

Transactional DDL: migration errors never leave the database in an intermediate/inconsistent state.

Range types + exclusion constraint: just no way to do this in MySQL without introducing a race condition.

Writeable CTEs: creating insert/update/delete pipelines over multiple tables deterministically. Seriously though, the RETURNING clause is something I use all the time both in and out of CTEs.

Filtered aggregates and grouping sets: cleanly get multiple data points for a dashboard in one shot.

Unnest: converting arrays into a set of rows. Inverse of array_agg(...).

Types: arrays, booleans, IP/subnets, UUIDs (without binary(16) hacks), etc.

Materialized views: seriously, how does MySQL not have this yet?

Statement-level triggers: another option from per-row.

Row-level security: setting data visibility based on configurable policies.

I can cite specific use cases I've deployed to production for each of these and more.
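
To pick one concrete example from that list, here is a hedged sketch of range types plus an exclusion constraint (table and column names are made up): overlapping rows are rejected atomically, with no read-then-write race.

        -- btree_gist lets the GiST index handle the plain equality on room_id
        CREATE EXTENSION IF NOT EXISTS btree_gist;

        CREATE TABLE booking (
            room_id int NOT NULL,
            during  tstzrange NOT NULL,
            -- no two bookings for the same room may overlap in time;
            -- a conflicting concurrent insert fails instead of racing
            EXCLUDE USING gist (room_id WITH =, during WITH &&)
        );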

replies(2): >>41879932 #>>41882079 #
43. beeboobaa3 ◴[] No.41879567[source]
This has been solved by logical replication a few versions ago.

1. Leave your old database running

2. Start your new database with new version

3. Run logical replication and let it catch up

4. Decide you want to switch over.

5. Using a script: Stop your apps writing to the database (or switch to read only mode), let logical replication catch up one final time, stop logical replication, run a procedure to fix your sequences, and switch your apps over to the new database.

You are now done. Sure, it requires a one-time investment to work out the exact steps needed for your situation, but it's easy to do with minimal downtime.
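
A minimal sketch of the logical replication commands for steps 3 and 5 (the connection string, publication and subscription names here are assumptions):

        -- step 3, on the old primary: publish every table
        CREATE PUBLICATION upgrade_pub FOR ALL TABLES;

        -- step 3, on the new-version server (schema loaded beforehand, e.g. via pg_dump --schema-only)
        CREATE SUBSCRIPTION upgrade_sub
            CONNECTION 'host=old-primary dbname=app user=replicator'
            PUBLICATION upgrade_pub;

        -- step 5, at cutover: stop applying changes on the new server
        ALTER SUBSCRIPTION upgrade_sub DISABLE;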

Of course if you don't care about downtime just shut down your database server, make a backup(!), and then do pg_upgrade.

replies(2): >>41880486 #>>41880522 #
44. Too ◴[] No.41879571{4}[source]
With mongodb in a HA setup you can perform a staggered upgrade of individual nodes in the cluster with no downtime. Very smooth in my experience.
replies(1): >>41887158 #
45. ttfkam ◴[] No.41879577{4}[source]
> My experience has been that they spin up either a hideously under or over-provisioned RDS or Aurora instance, and then never touch it until it breaks

That's a true shame considering how easy it is to make a read replica of any size and then fail over to it as the new primary. Definite skill issues.

replies(1): >>41882144 #
46. chefandy ◴[] No.41879585[source]
And considering how many people use some sort of abstraction layer like an ORM, have databases installed as part of some deployed application, or just do basic CRUD stuff, it probably would have worked on a postgres install from the 90s.

I keep an eye out for vulnerabilities, and that's about it.

47. ttfkam ◴[] No.41879618{5}[source]
It is a PITA, but I've written scripts that pg_dump just the schema, load the schema into the new db with the new version, set up logical replication between the two, wait for them to sync, reset all the sequences, and rebuild indexes before doing the handover.

It works with basically no downtime but I agree this kind of thing should definitely be easier, even turnkey.
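
For instance, the sequence-reset step can be a query along these lines per table (a sketch, assuming ordinary serial/identity columns; names are illustrative):

        -- bump the sequence past the rows that arrived via logical replication
        SELECT setval(pg_get_serial_sequence('my_table', 'id'),
                      (SELECT coalesce(max(id), 1) FROM my_table));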

replies(1): >>41882711 #
48. paulryanrogers ◴[] No.41879746{5}[source]
Does pg_upgrade not do all that?

Or do you mean the new Pg server should transparently do the upgrade automatically? And while online?

replies(1): >>41886933 #
49. evanelias ◴[] No.41879932{5}[source]
That's a good list [1]. A handful of these are already doable in modern MySQL and/or MariaDB though.

JSON can often be used in place of arrays, and JSON_TABLE in both MySQL and MariaDB converts JSON into tabular data. MySQL supports multi-valued indexes over JSON, where each row can have multiple index entries (or no entries, e.g. partial index).

MariaDB has built-in convenience types for ipv4, ipv6, and uuid. Or in MySQL you can just use virtual columns to add human-readable conversions of binary columns, although that is admittedly slightly annoying.

MariaDB supports RETURNING.

[1] Edit to add: I do mean that honestly, it's an accurate and insightful list of nice Postgres features, most of which aren't in MySQL or MariaDB. Honestly baffled as to why I'm being downvoted.

replies(2): >>41880852 #>>41885699 #
50. djbusby ◴[] No.41880017{5}[source]
It's a step of your script or runbook. Right?
replies(1): >>41882655 #
51. Tostino ◴[] No.41880033[source]
As someone who has a decade of professional postgres experience (and built my company on it), and 6 years of SqlServer experience, I couldn't agree more with your comment. I really, really like the TDS wire protocol and what it enables.

I also really miss their query optimizer. It was just so damn smart.

replies(2): >>41880091 #>>41880212 #
52. chasil ◴[] No.41880091{3}[source]
I've only done it once, but doing an "rpm -Uvh" on the SQL Server binaries in Linux, then starting up the database did an immediate upgrade of the database files.

Oracle's DBUA is so much more painful.

53. Tostino ◴[] No.41880131{7}[source]
You would be very surprised to see the workflow of DB heavy development teams. Some oracle devs have entire backends coded in pl/sql.

My last company had an absolute ton of pl/pgsql written to support hundreds of ELT pipelines, migrations were all hand written and managed with liquibase.

There are more of them than you'd think out there. Just generally supporting some boring b2b or backend software.

54. whartung ◴[] No.41880212{3}[source]
I'm sorry, not to quibble about the quality of the TDS protocol, but what does it have to do with the upgradability of SqlServer?
replies(1): >>41880347 #
55. RealStickman_ ◴[] No.41880330{4}[source]
Using third party repos makes upgrading the OS in general more difficult though and is more dangerous than simply using what your distro provides.
56. Tostino ◴[] No.41880347{4}[source]
Nothing, they're unrelated.

However, I agree with OP that SQL Server does have a solid upgrade story and it's generally pain free.

They are just two aspects I really wish Postgres could do better on, and I miss working with those parts of SQL Server.

57. icedchai ◴[] No.41880421[source]
I worked at a place that was running a 8 year old install of Oracle 8i because upgrading was just too painful. Oracle 10g was out by that point. It's been over 15 years since... I wonder if they ever upgraded?
58. RegnisGnaw ◴[] No.41880486[source]
This has not been solved by logical replication. There are features in PostgreSQL not supported by logical replication. One of the issues I'm dealing with is that our product switched from Oracle to PostgreSQL and the ORM stuff loves LOBs, and you know LOBs can't be logically replicated.
59. ellisv ◴[] No.41880522[source]
> Of course if you don't care about downtime just shut down your database server, make a backup(!), and then do pg_upgrade.

I think this is the right answer for 99% of companies. Yes, everyone wants zero downtime, but I see incredibly few cases where it is actually worth the cost and effort.

60. fossdd ◴[] No.41880583[source]
> if you want to submit benchmarks, or you want to report bugs, an important or sometimes a mandatory component is to provide compilation flags

On all open-source distros, you can look into your distro's source and check their compile flags.

replies(1): >>41890515 #
61. ttfkam ◴[] No.41880852{6}[source]
> JSON can often be used in place of arrays

This is like storing UUIDs as text. You lose type information and validation. It's like storing your array as a comma-delimited string. It can work in a pinch, but it takes up more storage space and is far more error prone.

> convenience types for ipv4, ipv6, and uuid.

That's nice to see. A shame you have to decide ahead of time whether you're storing v6 or v4, and I don't see support for network ranges, but a definite improvement.

> MariaDB supports RETURNING.

That's honestly wonderful to see. Can these be used inside of CTEs as well for correlated INSERTs?

replies(1): >>41881536 #
62. jlokier ◴[] No.41880876{5}[source]
There is a generic way to do it that works with SQLite and other databases, including Postgres. On Linux, take a filesystem or block device coherent snapshot, take your backup of all the SQLite files from the snapshot, then delete the snapshot.

The app or service continues to run without disruption, and the backup is a coherent database image. Perhaps you have other state in addition to the database, or multiple databses. For example cache files. This covers them too, if they are all in the same snapshot.

There are many ways to take a snapshot: ZFS and btrfs offer a filesystem command, and any Linux filesystem, such as ext4, can be snapshotted with LVM or LVM-thin. Well known cloud providers like AWS, GCP, Azure also provide ways to snapshot block devices, through their APIs. However, to ensure a coherent image, it may be necessary to use the Linux `fsfreeze` command around API calls.

The database backup files can have incomplete transactions, but if the files are restored it will be as if the OS was abruptly stopped at the moment of the snapshot. SQLite and other good databases are designed to recover well from this sort of abrupt stop, without corrupting the database. They clean up incomplete transactions on recovery.

63. evanelias ◴[] No.41881536{7}[source]
Regarding using JSON for arrays, MySQL and MariaDB both support validation using JSON Schema. For example, you can enforce that a JSON column only stores an array of numbers by calling JSON_SCHEMA_VALID in a CHECK constraint.

Granted, using validated JSON is more hoops than having an array type directly. But in a pinch it's totally doable.
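
A hedged sketch of that approach, assuming a server version where JSON_SCHEMA_VALID and enforced CHECK constraints are available (table and column names are made up):

        CREATE TABLE measurement (
            id      BIGINT PRIMARY KEY AUTO_INCREMENT,
            samples JSON NOT NULL,
            -- reject anything that is not a JSON array of numbers
            CHECK (JSON_SCHEMA_VALID(
                '{"type": "array", "items": {"type": "number"}}',
                samples))
        );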

MySQL also stores JSON values using a binary representation, it's not a comma-separated string.

Alternatively, in some cases it may also be fine to pack an array of multi-byte ints into a VARBINARY. Or for an array of floats, MySQL 9 now has a VECTOR type.

Regarding ipv6 addresses: MariaDB's inet6 type can also store ipv4 values as well, although it can be inefficient in terms of storage. (inet6 values take up a fixed 16 bytes, regardless of whether the value is an ipv4 or ipv6 address.)

As for using RETURNING inside a writable CTE in MariaDB: not sure, I'd assume probably not. I must admit I'm not familiar with the multi-table pipeline write pattern that you're describing.

replies(1): >>41883536 #
64. danudey ◴[] No.41881588{4}[source]
MySQL, for all its faults, actually handles this quite well. Generally speaking, assuming there aren't any major backwards-incompatibilities (which there haven't often been in my experience), you can start up the new server in place of the old server and you're good to go. Some table-rebuilding can be done in-place if you need to do things like recreate indices, but _generally_ you're good to go from there.

For more complex deployments, cases of backwards-incompatibilities in the data structure, or where I had reason to rebuild many large tables, you set up a replication secondary, do pre-upgrade migrations (e.g. in the case of deprecated functionality), upgrade the secondary, do post-upgrade migrations (e.g. in the case of new functionality), and wait for replication to catch up 100%. Then, after sending traffic to it for a bit to ensure it works correctly, you swap the primary and secondary and do it again.

65. danudey ◴[] No.41881646{5}[source]
It's very rare that the client protocol changes dramatically and in an incompatible way, so typically nothing changes and no one notices.

I've seen a few rare cases where an update to the client library is required for improvements to the protocol, e.g. when MySQL switched to the new password/authentication format, but that was also backwards-compatible: the old format was used until an account's password was changed, password changes could be told to use the legacy format instead, and you could set the legacy format by default. The only issue that occurred (and it did occur) was old client libraries + new server password hashes, but, again, in this case it didn't affect existing installations/users/passwords.

66. DavidPP ◴[] No.41881649[source]
Thanks for the link, wasn't aware of this solution.
67. danudey ◴[] No.41881690{3}[source]
MySQL feels more "modern" (in the context of the early 2000s when I started using it), in that it seemed to know it existed in an environment with other tools; for example, authentication was handled internally and was unrelated to the system user (other than the clients usually using your system username as the default database username if you didn't specify one).

Compare that with Postgres, which seemed very "old school", going so far as to assume it was the only thing a given server was doing. Connecting to postgres authenticated as your own user; creating a user was done with the `createuser` command (or similar, I don't remember what it was actually called), and not some namespaced `pg_createuser` command that would make it clear what it did.

I also remember setting up MySQL replication with almost no effort whatsoever, and then in the same year trying to set up Postgres replication - which it didn't have. I was told by other postgres admins to "just set up a script to rsync the database over and over to the other server; then if your first server dies just start the second server up and it'll recover". This seemed like a wildly cavalier attitude towards uptime and reliability, not to mention generating a ridiculous amount of I/O and network traffic for minimal benefit.

68. sgarland ◴[] No.41882079{5}[source]
That is a well-thought out list, and you’re clearly aware of and take advantage of the DB’s capabilities. Seriously, congrats. Especially RETURNING – it’s always baffling to me why more people don’t use it (or its sad cousin in MySQL that lets you get the last inserted rowid if using an auto-increment).

Most devs I’ve worked with don’t know about aggregations beyond COUNT and GROUP BY, and do everything in the app. I’ve pointed these out before, and am always told, “we don’t want to have additional logic in the DB.” So you want a dumb bit box then, got it – why are you using an RDBMS?

> Transactional DDL

I know this is a thing, and I’ve used it in Postgres, but I’ve also never found myself in MySQL being upset that I didn’t have it. Everything should be thoroughly tested in staging before prod.

> RLS

Yes, amazing feature if you use it correctly.

I will give MySQL a couple of nods in its favor: well, three.

1. Clustering index. If you design your schema around this fact, range queries can be WAY faster. Dropbox does this, IIRC.

2. Generated Virtual Columns. Not sure why Postgres still doesn’t support this.

3. ON UPDATE CURRENT_TIMESTAMP. It’s so nice to have the DB automatically track update times for you, IMO.

replies(1): >>41883378 #
69. sgarland ◴[] No.41882144{5}[source]
It’s more like a “not knowing how fast something should be” in the case of under-provisioning, and “not knowing or caring to look at metrics” for over-provisioning.

I once was examining some queries being generated via Prisma, and found it was using LIMIT/OFFSET for pagination. I pointed this out to the devs, who replied that the query times were acceptable for their SLOs. I guess if you don’t know that a simple SELECT can and should often be in the sub-msec range, you might not be concerned that it’s taking 100 msec.

The other is just the normalization of cloud expenditure. Every large org has some kind of pricing agreement with their provider, and so are rarely incentivized to cut costs, since they have a minimum spend.

replies(1): >>41883130 #
70. literalAardvark ◴[] No.41882655{6}[source]
It is, but it also takes a very long time during which production data integrity can be severely affected.
71. literalAardvark ◴[] No.41882680{6}[source]
Not true. Those scripts only cover problems caused by PostgreSQL itself, not, for example, a change in collations that will silently break your indexes, such as the one in Debian 12.
72. literalAardvark ◴[] No.41882711{6}[source]
I feel that really should be included in core, yes.

I get why it wasn't, but logical replication has been production ready for a while now, so it really should have at least a little sugar.

replies(1): >>41890711 #
73. ttfkam ◴[] No.41883130{6}[source]
I hear you.

When all you know is an ORM, you tend to treat SQL databases like dumb bit bucket add-ons to your app server. It's amazing how much potential performance and scalability are left on the floor because app developers can't shift their mindset when needed. Objects/structs cannot be assumed to map 1:1 with relations. What a world we'd live in if devs spent even 1/10 the effort examining their relational schema design that they spend arguing over whether a set, a list, or a queue is better for a given situation. It's like thoughts on Big-O stop cold at the database driver interface.

74. ttfkam ◴[] No.41883378{6}[source]
I use transactional DDL all the time, even during development. It's nice to not have to fully reset your schema every time you test a migration file locally. With transactional DDL, you run the whole list, and if any fails, it rolls back to where you started. You look at the error, edit your migration, and try again. It really is a time saver. There is a peace of mind always knowing your schema is in a consistent state not unlike the peace that comes from a good set of unit tests.
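
A tiny sketch of what that buys you (the statements and names are illustrative):

        BEGIN;
        ALTER TABLE account ADD COLUMN deleted_at timestamptz;
        CREATE INDEX account_deleted_at_idx ON account (deleted_at);
        -- if either statement above had failed, ROLLBACK would leave the schema
        -- exactly as it was before the migration started
        COMMIT;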

1. Yep, I definitely miss clustering indexes in Postgres sometimes. I can sometimes fake it with covering indexes when all I want are an extra column or two along with the primary key or similar without seeking to the main table, but you're right about that MySQL/MariaDB win here.

2. The dynamic computed column is an easy workaround with immutable functions that take the record as a param.

        CREATE TABLE foo ( a int, b int, c int );
        CREATE FUNCTION d(entry foo) RETURNS int LANGUAGE sql IMMUTABLE AS $$
          SELECT entry.a + entry.b + entry.c;
        $$;

        SELECT a, b, c, d(foo) FROM foo;
It's not part of the table schema when doing a SELECT *, but it is just as efficient as a computed column in MySQL/MariaDB and only slightly more verbose.

3. ON UPDATE CURRENT_TIMESTAMP works in Postgres with a trigger function, which you can reuse if all your tables use the same name for your "last_modified" column (probably a good idea anyway). Not as convenient as the declarative syntax, but it's a fairly trivial workaround.

        CREATE OR REPLACE FUNCTION update_last_modified() RETURNS TRIGGER AS $$
          BEGIN
            NEW.last_modified = now();
            RETURN NEW;   
          END;
        $$ language 'plpgsql';

        CREATE TRIGGER foo_last_modified BEFORE UPDATE ON foo
        FOR EACH ROW EXECUTE PROCEDURE update_last_modified();

        CREATE TRIGGER bar_last_modified BEFORE UPDATE ON bar
        FOR EACH ROW EXECUTE PROCEDURE update_last_modified();
One function, many triggers. You also get to choose between "when transaction started" (now() or CURRENT_TIMESTAMP), "when statement started" (statement_timestamp()), or "right now" (clock_timestamp()).

I don't mind workarounds so much as functionality that simply cannot be replicated. For example I miss real temporal table support in Postgres like what you can find in MariaDB or MS SQL Server. The painful kludges for Postgres's missing PIVOT support (which MS SQL Server has) are another one.

You never know how much you need deferred foreign key constraints until you don't have them anymore. Or a materialized view.

replies(1): >>41883889 #
75. ttfkam ◴[] No.41883536{8}[source]
> the multi-table pipeline write pattern

        WITH new_order AS (
            INSERT INTO "order" (po_number, bill_to, ship_to)
                 VALUES ('ABCD1234', 42, 64)
              RETURNING order_id
        )
        INSERT INTO order_item (order_id, product_id, quantity)
             SELECT new_order.order_id, vals.product_id, vals.quantity
               FROM (VALUES (10, 1), (11, 5), (12, 3)) AS vals(product_id, quantity)
              CROSS JOIN new_order
        ;
Not super pretty, but it illustrates the point. A single statement that creates an order, gets its autogenerated id (bigint, uuid, whatever), and applies that id to the order items that follow. No network round trip necessary to get the order id before you add the items, which translates into a shorter duration for the transaction to remain open.
replies(1): >>41883777 #
76. ttfkam ◴[] No.41883630{5}[source]
> the old central gate keeping DBA team model

I have mixed feelings about this. On the one hand I agree that ownership should be shared. On the other, app developers really don't consider their data structures as carefully in SQL as they do in-memory. It's odd. The right data structure matters more than a good algorithm since algorithms are easier to change. Once you settle on a list vs a set vs a queue, you're stuck once code is built around it.

The same is doubly true for the database schema. Lack of planning and knowledge of expected access patterns can turn an otherwise fast database to mud in no time flat. Once your data is in there, changing the schema is exponentially harder.

"I’m a huge proponent of designing your code around the data, rather than the other way around, and I think it’s one of the reasons git has been fairly successful… I will, in fact, claim that the difference between a bad programmer and a good one is whether he considers his code or his data structures more important. Bad programmers worry about the code. Good programmers worry about data structures and their relationships." – Linus Torvalds (2006)

What is your database but a bunch of data structures and relationships? I get why the gatekeeping occurred. I don't agree with it, but I understand it. Far too many folks consider expertise in data stores to be optional as developers.

replies(2): >>41887467 #>>41888291 #
77. ttfkam ◴[] No.41883681{8}[source]
> For a UUIDv4 PK

To be fair, this blows out any db that supports clustered indexes as well. Non-k-sortable primary keys are just a bad idea all around.

With UUIDv7, the WAL write amplification problem goes away just as the clustered index issues do.

replies(1): >>41887450 #
78. ttfkam ◴[] No.41883719{7}[source]
> Easy enough: almost no one writes SQL queries by hand these days, not for querying the database nor for doing schema upgrades. It's all done by tools

Your experience does not match mine. Tools like ORMs make horrible schemas in my opinion that cater to the lowest common denominator of SQL engine functionality. This means leaving a lot of performance and scalability on the floor. In order to make the ORMs generate decent schema definitions, you need to know the underlying engine and therefore SQL. At that point, you might as well use SQL.

Ever try changing a column's data type from a table with hundreds of millions of rows with an ORM definition file? Hope you like downtime.

79. evanelias ◴[] No.41883777{9}[source]
Thanks, that makes sense.

In this specific situation, the most common MySQL/MariaDB pattern would be to use LAST_INSERT_ID() in the second INSERT, assuming the order IDs are auto-increments. Or with UUIDs, simply generating the ID prior to the first INSERT, either on the application side or in a database-side session variable.

To avoid extra network calls, this could be wrapped in a stored proc, although a fair complaint is that MySQL doesn't support a ton of different programming languages for procs/funcs like Postgres.
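
A sketch of the session-variable variant (assuming an orders table with an AUTO_INCREMENT id; names are illustrative):

        INSERT INTO orders (po_number, bill_to, ship_to) VALUES ('ABCD1234', 42, 64);
        SET @order_id = LAST_INSERT_ID();
        INSERT INTO order_item (order_id, product_id, quantity)
        VALUES (@order_id, 10, 1), (@order_id, 11, 5), (@order_id, 12, 3);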

80. evanelias ◴[] No.41883889{7}[source]
> It's nice to not have to fully reset your schema every time you test a migration file locally

In terms of dev flow, this is only a problem with imperative migration systems. Declarative schema management tools solve it by being able to transition any live database state into the desired state, which is expressed by a repo of CREATE statements.

If something fails, you fix the bad CREATE and run the tool again, and it effectively picks up where it left off. And well-designed declarative tools catch many problems pre-flight anyway by running a suite of linters, running the emitted SQL statements in a sandbox first, etc.

If the tool's diff returns clean, you know your schema is in the right state on the DB.

Ironically, lack of transactional DDL actually makes declarative schema management more straightforward in MySQL/MariaDB: you can't mix DDL and DML there anyway, so it's more natural to handle schema changes vs data migrations using different tools/pipelines.

81. elcritch ◴[] No.41884461[source]
Financial incentives, likely, because PG is developed by companies that make money via providing support. Doesn't have to be an insidious plot - just, why work on something you know well and that makes you money? MSSQL wants people to pay to upgrade, so it behooves them to make it seamless.
82. lazide ◴[] No.41885532[source]
I suspect it’s also a bit of Stockholm syndrome on the part of admins haha
replies(1): >>41886385 #
83. daleswanson ◴[] No.41885684[source]
What do you mean? You can manually run Heroku database maintenance. We just did this recently at work.

https://devcenter.heroku.com/articles/heroku-postgres-mainte...

84. ComputerGuru ◴[] No.41885699{6}[source]
You still can’t use uuid as proper foreign keys with validation on mariaDB/MySQL though, right? It wasn’t possible with blobs at any rate.
replies(1): >>41890351 #
85. ktosobcy ◴[] No.41886385{3}[source]
Or better yet - "guarding the job", i.e. making their job very important ;)
replies(1): >>41886442 #
86. lazide ◴[] No.41886442{4}[source]
Very true. At a prior company, we rolled out a simpler and easier to use admin console for our enterprise product and all the customer admins hated it.

Because now it was easy to use hah.

We ended up rolling it back because the hate was so consistent.

(And no, this wasn’t us guessing or making this up - a number of them were quite explicit that we were threatening their roles, and to stop it.)

87. jeltz ◴[] No.41886933{6}[source]
Yes, and it does it by starting an instance of the old version and running pg_dump against it. And that was one thing the original poster complained about.
replies(1): >>41896911 #
88. water9 ◴[] No.41887158{5}[source]
Yeah, but what if your dataset is partitioned horizontally? It’s very smooth until it’s not. For example, if you are using the PostGIS extension and something changes with the new version, that could be a showstopper.
89. magicalhippo ◴[] No.41887428{4}[source]
We have customers with not quite TB levels, but at least several hundreds GB databases.

Upgrading the database server is as simple as shutting down the service, installing new executables and restarting the service. Downtime is measured in minutes.

When upgrading major versions, one might not get full functionality of the new features unless one upgrades the on-disk format (unload-reload), but you can run the new server version with older on-disk format just fine.

90. sgarland ◴[] No.41887450{9}[source]
Agreed. My preferred PK in descending order is: natural key if it makes sense and would speed up queries, integer of an appropriate size, UUIDv7. I only rank UUIDv7 below integers because it’s 16 bytes, whereas even a BIGINT is only 8 bytes.
replies(1): >>41905388 #
91. sgarland ◴[] No.41887467{6}[source]
100% on all points. I’m a fan of gatekeeping things that are hard to get right, and hard to undo. If you can prove that you know what you’re doing, by all means, own your stuff. But until then, yes, I’d like to watch over your shoulder, and hopefully teach you how to do it right.

The fact that DBs are data structures but are ignored has always irritated me. If I pushed a PR using lists for everything regardless of their practicality, I’d be rightly told to try again. But push a PR with a schema using similar suboptimal choices, and no one blinks an eye.

92. dalyons ◴[] No.41888291{6}[source]
I agree but also I think I could be clearer about the key advantage of the distributed many dbs model… it’s that I don’t have to care if people are good at anything that you said :). If teams want to do dumb things with their schema, fine, that’s on them and they’ll have to deal with the consequences. If it matters, they’ll learn and get better. If they want to buy their way out of it with their latency and infra budget instead, that’s honestly fine too.

With many smaller databases owned separately, the blast radius of bad db decisions is small/local.

93. whizzter ◴[] No.41888427{4}[source]
What's lacking? Specific types or schema changes?
replies(1): >>41899965 #
94. samlambert ◴[] No.41890264{4}[source]
this simply can’t be true. there are endless stories of people moving from postgres to mysql because of reliability and operational issues. postgres has only had a mature replication solution for 7 years. mysql had it in 1999.
95. evanelias ◴[] No.41890351{7}[source]
This has always been possible, for example using the BINARY(16) column type if you want to be efficient. Or in MariaDB 10.7+ you can now use the dedicated UUID column type, which is equivalent to BINARY(16) under the hood, but provides a human-readable hex value when queried.

UUIDs are fixed-length. Blobs are not the appropriate type for that.

96. crabbone ◴[] No.41890515{3}[source]
The idea here is more that you need to adjust these flags for the benchmark to make sense.

Take for example the aforementioned block size. Say you have storage with 4x the block size of the one used by PostgreSQL. If you run a benchmark over such storage, you'll have insane write amplification. Nobody will consider the results of such a "benchmark" because that's simply a misconfiguration of the program you are trying to measure.

More generally, Linux distros will tend to compile distributed binaries with "safe" defaults such that they run on most hardware users can have, and this means optimizing for the lowest common denominator. Looping back to PostgreSQL, the default block size was for a long time 4k, and iirc today it's 8k. This is fine if we are talking about a plain SSD / HDD, but with enterprise SDS these are "rookie numbers"; even the maximum supported by PostgreSQL (32k) is still a "rookie number", but it's still four times better than the default!

97. crabbone ◴[] No.41890604{4}[source]
> This way when you upgrade your OS

Don't forget to do this on Friday afternoon! :D

Upgrading an OS in a production environment... this reminds me of a joke about a police academy intake test:

The participants were offered a board with a round and a square hole and two pegs of similar shape. After the test the new recruits were sorted into two groups: very smart and very strong

So, yeah, you ought to be either very smart or very strong to upgrade an OS in a production environment, and then also discover changes to your database as you go along.

That's not to say that upgrades don't happen at all... but you'd be upgrading in a testing environment ten times before you try that "for real", and at that point you would have probably solved the question of whether you need to (re)install the database and how :)

98. nijave ◴[] No.41890711{7}[source]
>should have at least a little sugar.

I think that sums up PG pretty well. It seems there's a lot of things that lean into the Unix "just use another tool" philosophy, which ends up making management more difficult.

99. paulryanrogers ◴[] No.41896911{7}[source]
Are pg_upgrade's docs inaccurate?

It says it works "without the data dump/restore" and...

> Major PostgreSQL releases regularly add new features that often change the layout of the system tables, but the internal data storage format rarely changes. pg_upgrade uses this fact to perform rapid upgrades by creating new system tables and simply reusing the old user data files

Regardless, I suppose it is that reliance on the internal data format not changing which limits what refactors can do.

100. RegnisGnaw ◴[] No.41899965{5}[source]
I don't care much about schema changes as I can setup logical replication just for the major version upgrade. The main issue is the lack of support for large objects, which the code bases I'm dealing with uses heavily.
101. ttfkam ◴[] No.41905388{10}[source]
Contrary to intuition, bigint ends up only 25% smaller on disk than UUIDv7 and no improvement in speed. Honestly in 2024, 32-bit ints are generally a poor fit. Either the table is small enough to be considered a lookup table (16-bit int) or could grow to unknown bounds given enough time (64-bit int). Sequential ids are guessable (which could be a security issue), cannot be generated on the client, and can hinder sharding now that Postgres supports bidirectional replication (multiple writers).

https://ardentperf.com/2024/02/03/uuid-benchmark-war/

My PK preference in descending order: natural key if it makes sense and would speed up queries, 16-bit for unambiguous lookup tables with a hard finite limit of values, UUIDv7, 64-bit sequence, UUIDv4, and then finally 32-bit sequence.

UUIDv7 for performance, reasonable non-guessability, client-generation, and potential for sharding. 64-bit for cases that really need a sequence. UUIDv4 when randomness is required for security. 32-bit when there's a hard finite limit that can be guaranteed to stay below 2 billion even given the most optimistic of projections.

If the database is small, it doesn't matter. If it's large, it is 99.999% unlikely the primary key will be anything more than a rounding error in storage, access time, or cost. No one ever said, "We would have made payroll if only we had used 32-bit primary keys."

To really save space, extract the timestamp from the UUIDv7 and use an expression index over it as the creation date. Then you're not "wasting" any storage space. Personally I don't think the storage optimization is necessary. Premature optimization being evil and all that.
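
A hedged sketch of that trick, pulling the leading 48-bit millisecond timestamp out of a UUIDv7 primary key (table, column, and function names are made up):

        -- UUIDv7 stores unix epoch milliseconds in its first 48 bits
        CREATE FUNCTION uuid_v7_to_timestamptz(u uuid)
        RETURNS timestamptz LANGUAGE sql IMMUTABLE AS $$
            SELECT to_timestamp(
                ('x' || substr(replace(u::text, '-', ''), 1, 12))::bit(48)::bigint / 1000.0
            );
        $$;

        -- the expression index doubles as the created_at column you no longer store
        CREATE INDEX event_created_idx ON event (uuid_v7_to_timestamptz(id));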