264 points davidgomes | 43 comments
noirscape ◴[] No.41877487[source]
Because the actual process of upgrading Postgres is terrible.

I say this as someone who absolutely loves using it, but the actual process of upgrading Postgres is something that requires significant downtime, is error-prone, and you're often better off just dumping all the databases and reimporting them into a new data directory. (A good idea in general, since it'll vacuum and compact indexes as well if you do it, combining a few maintenance routines into your upgrade path.)

It requires having the previous version of Postgres installed, something which can mess with a number of distro policies (not to mention docker, which is the most popular way to deploy software that will typically rely on Postgres), and unlike most software with that issue, Postgres is software you want to be managed by your distro.

Therefore, most people only upgrade by necessity - when their distro forces the upgrade or the version they're using reaches EOL.

replies(11): >>41877841 #>>41877921 #>>41877992 #>>41878101 #>>41878462 #>>41878670 #>>41879013 #>>41879161 #>>41879191 #>>41879259 #>>41879567 #
MichaelZuo ◴[] No.41877841[source]
So the real question is, why is the upgrade process so incompetently designed, and why has no one fixed this?
replies(5): >>41877898 #>>41877902 #>>41877926 #>>41878252 #>>41878442 #
1. phil21 ◴[] No.41877902[source]
My opinion is Postgres was designed by software developers for software developers. The question of “which relational database to use” in my career has almost always split perfectly between SWEs vehemently demanding pgsql for the feature set, and the sysadmins who have to support maintenance and production availability preferring MySQL.

One of the few things I’ve enjoyed with the move into devops and companies forcing previously “pure” developers into operational roles was their discovery that Postgres was utterly horrible to administer at a systems level. Apparently us lowly sysadmins may have had a point after all.

This is a bit tongue in cheek, but really not far from my lived reality. When the focus is on features and “correctness” at the near total expense of sane systems tooling, folks can develop some myopia on the subject. I had so many arguments with devs on my teams over this subject who were utterly horrified to find we were running MySQL for a given service.

Open source projects tend to fix the pain points their contributors experience, and I assume there were not too many contributors wanting to deal with the boring work of making administration an easy task – it’s thankless “sideways” work that won’t result in many accolades or personal satisfaction for most SWEs.

The end users are almost always developers, most of whom either run something at the scale of a docker container in production, or are simply given a connection string and the rest is a black box to them. Under those contexts I’d personally prefer Postgres as well, and it wouldn’t even be close. When you get into backups, clustering, upgrades, and high availability under extreme load? IMO the story falls apart real fast.

replies(5): >>41878214 #>>41878540 #>>41878590 #>>41878653 #>>41881690 #
2. mxey ◴[] No.41878214[source]
As someone who operates both I much prefer to run pg_upgrade every few years rather than dealing with mysqldump, replication issues and babysitting Orchestrator.
replies(1): >>41878946 #
3. jeltz ◴[] No.41878540[source]
That has not been my experience at all. The sysadmins I have worked with have strongly preferred PostgreSQL over running MySQL, while the developers have cared less either way. The reason is that when something goes wrong, PostgreSQL is much easier to diagnose. So while some tasks like upgrades take less manual effort with MySQL, PostgreSQL is more stable and gives better error messages for when you get paged in the middle of the night.

PostgreSQL has prioritized correctness and stability, which has made certain features clunkier to use or slower to arrive, but when you get paged in the middle of the night you get that time back. PostgreSQL has also spent a lot of time on improving the DBA experience, so mixed DBA/sysadmins usually prefer PG as well.

replies(1): >>41890264 #
4. eru ◴[] No.41878590[source]
Of course, sqlite is even easier on the sysadmins (but not necessarily the right tool for the job).
replies(1): >>41878750 #
5. sgarland ◴[] No.41878653[source]
> The split on “which relational database to use” in my career has almost always been perfectly split between SWE vehemently demanding pgsql for the feature set

I’ve seen this as well, but when pressed, none of them could articulate what part of its feature set they actually needed to use.

> One of the few things I’ve enjoyed with the move into devops and companies forcing previously “pure” developers into operational roles was their discovery that Postgres was utterly horrible to administer at a systems level.

Are you (or your devs, rather) actually running your own DBs? If so, respect. My experience has been that they spin up either a hideously under or over-provisioned RDS or Aurora instance, and then never touch it until it breaks, at which point they might ask for help, or they might just make it bigger.

replies(3): >>41878860 #>>41879564 #>>41879577 #
6. gramakri2 ◴[] No.41878750[source]
How does one back up a sqlite database without stopping the app or disrupting service? I couldn't find a simple answer to this.
replies(3): >>41878854 #>>41879410 #>>41880876 #
7. tommy_axle ◴[] No.41878854{3}[source]
With the sqlite3 .backup command. There's also an online backup API.
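If plain SQL fits better into an existing job, VACUUM INTO (SQLite 3.27+) also writes a consistent copy while the database stays live. A minimal sketch, with a hypothetical target path:

        -- assumes SQLite 3.27+; the target path is hypothetical
        -- produces a consistent, compacted copy without taking the app offline
        VACUUM INTO '/backups/app-2024-10-18.db';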
8. dalyons ◴[] No.41878860[source]
> My experience has been that they spin up either a hideously under or over-provisioned RDS or Aurora instance, and then never touch it until it breaks, at which point they might ask for help, or they might just make it bigger.

Yep that’s exactly what I’ve seen too :). I still overall prefer this distributed database model - yes you spend more and people make mistakes (and learn). But if you can afford it you get higher velocity and more incentive aligned ownership than the old central gate keeping DBA team model.

replies(2): >>41878987 #>>41883630 #
9. literalAardvark ◴[] No.41878946[source]
just as long as you don't forget to reindex after
replies(2): >>41879466 #>>41880017 #
10. sgarland ◴[] No.41878987{3}[source]
I don’t mind the model IFF the team has interest in learning how to do it correctly. My biggest complaint as both an SRE and now DBRE has been that dev-managed infrastructure inevitably means during an incident that I had nothing to do with, I’ll be paged to fix it anyway. Actually, that’s not the problem; the problem is later when I explain precisely how and why it broke, and how to avoid it in the future, there’s rarely any interest in doing so.

“You have an unpartitioned table with a few billion rows and a UUIDv4 PK. I’m amazed it’s been working as long as it has. You need to change your schema.”

“Or we can just buy a bigger instance.”

“…”

Rinse and repeat six months later. I’m aware this is an organizational problem, but from what I’ve seen, it’s endemic.

Re: incentives, yes, also a problem. Dev teams are generally incentivized by Product, who doesn’t care at all whether or not something is optimal, only that new features are being shipped. I despise this mentality, but it’s not usually the devs’ fault.

replies(2): >>41879210 #>>41879447 #
11. necovek ◴[] No.41879210{4}[source]
UUID (version does not matter for storage, only for generation and distribution) is basically a 128-bit unsigned int, so a double "word" on 64-bit platforms, and it's natively supported by Postgres since at least 8.3 (earliest version with docs up).

While most versions ensure it's random, there are plenty of indexing algorithms that make searching through that quick and close to O(1), so that should not be the schema problem.

Unless you used a string field, but there is a quick workaround for that with Postgres too (make an index on `UUID(field)`, and look it up by `UUID(value) = UUID(field)`).

That's why both "devops" (DBAs?) and "devs" prefer Postgres over many other databases: you can easily handle some small mistakes in schemas too while you do the full migration in parallel.
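A rough sketch of that workaround (hypothetical table and column names, using Postgres's text-to-uuid cast rather than a UUID() function):

        -- "users" and "external_id" are hypothetical; external_id is a uuid stored as text
        CREATE INDEX users_external_id_uuid_idx ON users ((external_id::uuid));

        -- the query must use the same expression for the index to be considered
        SELECT *
          FROM users
         WHERE external_id::uuid = 'b4f0e6a2-6f3b-4a1e-9c6d-2f8a1c3d5e7f'::uuid;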

replies(1): >>41879419 #
12. zaphar ◴[] No.41879410{3}[source]
The typical solution is https://litestream.io/. Comparatively easy to setup for most usages of sqlite.
13. sgarland ◴[] No.41879419{5}[source]
Who said this was Postgres? MySQL (with the default InnoDB engine) and MSSQL both use clustered indexes; they store tuples around the PK. For a UUIDv4 PK – or anything else non-k-sortable, for that matter – this results in a massive amount of B+tree bloat from the random inserts.

But sure, let’s talk about Postgres. After all, it stores tuples in a heap, and so is immune to this behavior.

Except that its MVCC implementation means that it has to maintain a Visibility Map [0] to keep track of which pages contain only tuples which are visible to every active transaction. This is nominally used for vacuum decisions, but is also cleverly exploited for use with index-only scans. This poses a problem when referencing non-k-sortable entries, because while the tuples are in a heap, the indices are generally a B+tree. So now we’re back to the problem of massive IO amplification. Even if the VM and index are entirely cached, reads are reads, and they add up.

Then there’s the issue of WAL bloat due to full page writes. tl;dr Postgres pages are nominally 8 KiB, and storage device pages are nominally 4 KiB. To guarantee an atomic write, Postgres writes the entire page for a given tuple for the first write after checkpoint, regardless of how many bytes were altered. Again, non-k-sortable: if your tuples are spread randomly across pages, you now have far more pages to write. This can and does matter for either write-heavy workloads, instances with limited network bandwidth (oh hi small RDS / Aurora), or the worst, both combined.

Re: search complexity, I’m pretty sure B+trees (and B-trees) have O(log n) time complexity, not O(1). Whether or not that’s “close” depends on n, I suppose, but in this scenario I specifically said “billions of rows.”

> That's why both "devops" (DBAs?) and "devs" prefer Postgres over many other databases

I’m a DBRE, and like both MySQL and Postgres. They both have strengths and weaknesses, but you need to deeply understand those – and properly design your schema and query patterns around them – in order to make an informed decision.

[0]: https://www.postgresql.org/docs/current/storage-vm.html

replies(1): >>41883681 #
14. mschuster91 ◴[] No.41879447{4}[source]
> Rinse and repeat six months later. I’m aware this is an organizational problem, but from what I’ve seen, it’s endemic.

Easy enough: almost no one writes SQL queries by hand these days, not for querying the database nor for doing schema upgrades. It's all done by tools - Doctrine in the PHP world for example. And pretty much no one but actual CS graduates knows anything deeper about databases.

Result is, devs are happy enough if they found something that works, and don't want to risk being the one who broke prod because they applied some schema change suggested by their DBA who doesn't know some random thing about the application.

replies(2): >>41880131 #>>41883719 #
15. mxey ◴[] No.41879466{3}[source]
> All failure, rebuild, and reindex cases will be reported by pg_upgrade if they affect your installation; post-upgrade scripts to rebuild tables and indexes will be generated automatically.
replies(1): >>41882680 #
16. ttfkam ◴[] No.41879564[source]
> none of them could articulate what part of its feature set they actually needed to use.

Transactional DDL: migration errors never leave the database in an intermediate/inconsistent state.

Range types + exclusion constraint: just no way to do this in MySQL without introducing a race condition. (Sketched after this list.)

Writeable CTEs: creating insert/update/delete pipelines over multiple tables deterministically. Seriously though, the RETURNING clause is something I use all the time both in and out of CTEs.

Filtered aggregates and grouping sets: cleanly get multiple data points for a dashboard in one shot.

Unnest: converting arrays into a set of rows. Inverse of array_agg(...).

Types: arrays, booleans, IP/subnets, UUIDs (without binary(16) hacks), etc.

Materialized views: seriously, how does MySQL not have this yet?

Statement-level triggers: another option from per-row.

Row-level security: setting data visibility based on configurable policies.

I can cite specific use cases I've deployed to production for each of these and more.
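
To make the range + exclusion constraint item concrete, a minimal sketch (hypothetical booking table; the equality part on room needs the btree_gist extension):

        CREATE EXTENSION IF NOT EXISTS btree_gist;

        CREATE TABLE room_booking (
            room   int NOT NULL,
            during tstzrange NOT NULL,
            -- no two bookings for the same room may overlap in time;
            -- enforced atomically by the database, no race condition
            EXCLUDE USING gist (room WITH =, during WITH &&)
        );

        INSERT INTO room_booking VALUES (101, '[2024-06-01 10:00, 2024-06-01 11:00)');
        -- this overlapping booking fails with an exclusion violation:
        INSERT INTO room_booking VALUES (101, '[2024-06-01 10:30, 2024-06-01 11:30)');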

replies(2): >>41879932 #>>41882079 #
17. ttfkam ◴[] No.41879577[source]
> My experience has been that they spin up either a hideously under or over-provisioned RDS or Aurora instance, and then never touch it until it breaks

That's a true shame considering how easy it is to make a read replica of any size and then fail over to it as the new primary. Definite skill issues.

replies(1): >>41882144 #
18. evanelias ◴[] No.41879932{3}[source]
That's a good list [1]. A handful of these are already doable in modern MySQL and/or MariaDB though.

JSON can often be used in place of arrays, and JSON_TABLE in both MySQL and MariaDB converts JSON into tabular data. MySQL supports multi-valued indexes over JSON, where each row can have multiple index entries (or no entries, e.g. partial index).

MariaDB has built-in convenience types for ipv4, ipv6, and uuid. Or in MySQL you can just use virtual columns to add human-readable conversions of binary columns, although that is admittedly slightly annoying.

MariaDB supports RETURNING.

[1] Edit to add: I do mean that honestly, it's an accurate and insightful list of nice Postgres features, most of which aren't in MySQL or MariaDB. Honestly baffled as to why I'm being downvoted.

replies(2): >>41880852 #>>41885699 #
19. djbusby ◴[] No.41880017{3}[source]
It's a step of your script or runbook. Right?
replies(1): >>41882655 #
20. Tostino ◴[] No.41880131{5}[source]
You would be very surprised to see the workflow of DB heavy development teams. Some oracle devs have entire backends coded in pl/sql.

My last company had an absolute ton of pl/pgsql written to support hundreds of ELT pipelines, migrations were all hand written and managed with liquibase.

There are more of them than you'd think out there. Just generally supporting some boring b2b or backend software.

21. ttfkam ◴[] No.41880852{4}[source]
> JSON can often be used in place of arrays

This is like storing UUIDs as text. You lose type information and validation. It's like storing your array as a comma-delimited string. It can work in a pinch, but it takes up more storage space and is far more error prone.

> convenience types for ipv4, ipv6, and uuid.

That's nice to see. A shame you have to decide ahead of time whether you're storing v6 or v4, and I don't see support for network ranges, but a definite improvement.

> MariaDB supports RETURNING.

That's honestly wonderful to see. Can these be used inside of CTEs as well for correlated INSERTs?

replies(1): >>41881536 #
22. jlokier ◴[] No.41880876{3}[source]
There is a generic way to do it that works with SQLite and other databases, including Postgres. On Linux, take a filesystem or block device coherent snapshot, take your backup of all the SQLite files from the snapshot, then delete the snapshot.

The app or service continues to run without disruption, and the backup is a coherent database image. Perhaps you have other state in addition to the database, or multiple databases, for example cache files. This covers them too, if they are all in the same snapshot.

There are many ways to take a snapshot: ZFS and btrfs offer a filesystem command, and any Linux filesystem, such as ext4, can be snapshotted with LVM or LVM-thin. Well known cloud providers like AWS, GCP, Azure also provide ways to snapshot block devices, through their APIs. However, to ensure a coherent image, it may be necessary to use the Linux `fsfreeze` command around API calls.

The database backup files can have incomplete transactions, but if the files are restored it will be as if the OS was abruptly stopped at the moment of the snapshot. SQLite and other good databases are designed to recover well from this sort of abrupt stop, without corrupting the database. They clean up incomplete transactions on recovery.

23. evanelias ◴[] No.41881536{5}[source]
Regarding using JSON for arrays, MySQL and MariaDB both support validation using JSON Schema. For example, you can enforce that a JSON column only stores an array of numbers by calling JSON_SCHEMA_VALID in a CHECK constraint.

Granted, using validated JSON is more hoops than having an array type directly. But in a pinch it's totally doable.
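
A minimal sketch of that CHECK constraint approach, assuming MySQL 8.0.17+ and a hypothetical table:

        CREATE TABLE sensor_readings (
            id       BIGINT AUTO_INCREMENT PRIMARY KEY,
            readings JSON NOT NULL,
            -- reject anything that isn't a JSON array of numbers
            CONSTRAINT readings_is_number_array CHECK (
                JSON_SCHEMA_VALID('{"type": "array", "items": {"type": "number"}}', readings)
            )
        );

        INSERT INTO sensor_readings (readings) VALUES ('[1.5, 2, 3.25]');   -- ok
        INSERT INTO sensor_readings (readings) VALUES ('["a", "b"]');       -- fails the CHECK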

MySQL also stores JSON values using a binary representation, it's not a comma-separated string.

Alternatively, in some cases it may also be fine to pack an array of multi-byte ints into a VARBINARY. Or for an array of floats, MySQL 9 now has a VECTOR type.

Regarding ipv6 addresses: MariaDB's inet6 type can also store ipv4 values as well, although it can be inefficient in terms of storage. (inet6 values take up a fixed 16 bytes, regardless of whether the value is an ipv4 or ipv6 address.)

As for using RETURNING inside a writable CTE in MariaDB: not sure, I'd assume probably not. I must admit I'm not familiar with the multi-table pipeline write pattern that you're describing.

replies(1): >>41883536 #
24. danudey ◴[] No.41881690[source]
MySQL feels more "modern" (in the context of the early 2000s when I started using it), in that it seemed to know it existed in an environment with other tools; for example, authentication was handled internally and was unrelated to the system user (other than the clients usually using your system username as the default database username if you didn't specify one).

Compare that with Postgres, which seemed very "old school", going so far as to assume it was the only thing a given server was doing. Connecting to postgres authenticated as your own user; creating a user was done with the `createuser` command (or similar, I don't remember what it was actually called), and not some namespaced `pg_createuser` command that would make it clear what it did.

I also remember setting up MySQL replication with almost no effort whatsoever, and then in the same year trying to set up Postgres replication - which it didn't have. I was told by other postgres admins to "just set up a script to rsync the database over and over to the other server; then if your first server dies just start the second server up and it'll recover". This seemed like a wildly cavalier attitude towards uptime and reliability, not to mention generating a ridiculous amount of I/O and network traffic for minimal benefit.

25. sgarland ◴[] No.41882079{3}[source]
That is a well-thought out list, and you’re clearly aware of and take advantage of the DB’s capabilities. Seriously, congrats. Especially RETURNING – it’s always baffling to me why more people don’t use it (or its sad cousin in MySQL that lets you get the last inserted rowid if using an auto-increment).

Most devs I’ve worked with don’t know about aggregations beyond COUNT and GROUP BY, and do everything in the app. I’ve pointed these out before, and am always told, “we don’t want to have additional logic in the DB.” So you want a dumb bit box then, got it – why are you using an RDBMS?

> Transactional DDL

I know this is a thing, and I’ve used it in Postgres, but I’ve also never found myself in MySQL being upset that I didn’t have it. Everything should be thoroughly tested in staging before prod.

> RLS

Yes, amazing feature if you use it correctly.

I will give MySQL a couple of nods in its favor: well, three.

1. Clustering index. If you design your schema around this fact, range queries can be WAY faster. Dropbox does this, IIRC.

2. Generated Virtual Columns. Not sure why Postgres still doesn’t support this.

3. ON UPDATE CURRENT_TIMESTAMP. It’s so nice to have the DB automatically track update times for you, IMO.
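
For point 3, the declarative MySQL form looks roughly like this (a minimal sketch with a hypothetical table):

        CREATE TABLE widget (
            id         BIGINT AUTO_INCREMENT PRIMARY KEY,
            name       VARCHAR(255) NOT NULL,
            -- the database maintains both timestamps; no application code required
            created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
            updated_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
        );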

replies(1): >>41883378 #
26. sgarland ◴[] No.41882144{3}[source]
It’s more like a “not knowing how fast something should be” in the case of under-provisioning, and “not knowing or caring to look at metrics” for over-provisioning.

I once was examining some queries being generated via Prisma, and found it was using LIMIT/OFFSET for pagination. I pointed this out to the devs, who replied that the query times were acceptable for their SLOs. I guess if you don’t know that a simple SELECT can and should often be in the sub-msec range, you might not be concerned that it’s taking 100 msec.
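
For illustration, a sketch of the difference with a hypothetical orders table (keyset pagination being the usual alternative; the table and numbers are made up):

        -- hypothetical: orders(id bigint primary key, ...), paging 50 rows at a time

        -- OFFSET pagination: deep pages scan and discard every prior row first
        SELECT * FROM orders ORDER BY id LIMIT 50 OFFSET 100000;

        -- keyset pagination: seek directly past the last id seen on the previous page
        SELECT * FROM orders WHERE id > 100050 ORDER BY id LIMIT 50;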

The other is just the normalization of cloud expenditure. Every large org has some kind of pricing agreement with their provider, and so are rarely incentivized to cut costs, since they have a minimum spend.

replies(1): >>41883130 #
27. literalAardvark ◴[] No.41882655{4}[source]
It is, but it also takes a very long time during which production data integrity can be severely affected.
28. literalAardvark ◴[] No.41882680{4}[source]
Not true. Those scripts only cover problems caused by PostgreSQL itself, not, for example, a change in collations that will silently break your indexes, such as the one in Debian 12.
29. ttfkam ◴[] No.41883130{4}[source]
I hear you.

When all you know is an ORM, you tend to treat SQL databases like dumb bit bucket add-ons to your app server. It's amazing how much potential performance and scalability are left on the floor because app developers can't shift their mindset when needed. Objects/structs cannot be assumed to map 1:1 with relations. What a world we'd live in if devs spent even 1/10 the effort examining their relational schema design that they spend arguing over whether a set, a list, or a queue is better for a given situation. It's like thoughts on Big-O stop cold at the database driver interface.

30. ttfkam ◴[] No.41883378{4}[source]
I use transactional DDL all the time, even during development. It's nice to not have to fully reset your schema every time you test a migration file locally. With transactional DDL, you run the whole list, and if any fails, it rolls back to where you started. You look at the error, edit your migration, and try again. It really is a time saver. There is a peace of mind always knowing your schema is in a consistent state not unlike the peace that comes from a good set of unit tests.

1. Yep, I definitely miss clustering indexes in Postgres sometimes. I can sometimes fake it with covering indexes when all I want are an extra column or two along with the primary key or similar without seeking to the main table, but you're right about that MySQL/MariaDB win here.

2. The dynamic computed column is an easy workaround with immutable functions that take the record as a param.

        CREATE TABLE foo ( a int, b int, c int );
        CREATE FUNCTION d(entry foo) RETURNS int LANGUAGE sql IMMUTABLE AS $$
          SELECT entry.a + entry.b + entry.c;
        $$;

        SELECT a, b, c, d(foo) FROM foo;
It's not part of the table schema when doing a SELECT *, but it is just as efficient as a computed column in MySQL/MariaDB and only slightly more verbose.

3. ON UPDATE CURRENT_TIMESTAMP works in Postgres with a trigger function, which you can reuse if all your tables use the same name for your "last_modified" column (probably a good idea anyway). Not as convenient as the declarative syntax, but it's a fairly trivial workaround.

        CREATE OR REPLACE FUNCTION update_last_modified() RETURNS TRIGGER AS $$
          BEGIN
            NEW.last_modified = now();
            RETURN NEW;   
          END;
        $$ language 'plpgsql';

        CREATE TRIGGER foo_last_modified BEFORE UPDATE ON foo
        FOR EACH ROW EXECUTE PROCEDURE update_last_modified();

        CREATE TRIGGER bar_last_modified BEFORE UPDATE ON bar
        FOR EACH ROW EXECUTE PROCEDURE update_last_modified();
One function, many triggers. You also get to choose between "when transaction started" (now() or CURRENT_TIMESTAMP), "when statement started" (statement_timestamp()), or "right now" (clock_timestamp()).

I don't mind workarounds so much as functionality that simply cannot be replicated. For example, I miss real temporal table support in Postgres like what you can find in MariaDB or MS SQL Server. The painful kludges to work around missing PIVOT support (as found in MS SQL Server) are another one.

You never know how much you need deferred foreign key constraints until you don't have them anymore. Or a materialized view.
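
For the deferred FK point, a minimal sketch (hypothetical tables with a circular reference that only resolves because the checks are deferred to COMMIT):

        CREATE TABLE team   (id bigint PRIMARY KEY, captain_id bigint);
        CREATE TABLE player (id bigint PRIMARY KEY,
                             team_id bigint REFERENCES team (id) DEFERRABLE INITIALLY DEFERRED);
        ALTER TABLE team ADD CONSTRAINT team_captain_fk
            FOREIGN KEY (captain_id) REFERENCES player (id) DEFERRABLE INITIALLY DEFERRED;

        BEGIN;
        INSERT INTO team   VALUES (1, 100);   -- captain doesn't exist yet
        INSERT INTO player VALUES (100, 1);
        COMMIT;                               -- both FKs are checked here, and pass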

replies(1): >>41883889 #
31. ttfkam ◴[] No.41883536{6}[source]
> the multi-table pipeline write pattern

        WITH new_order AS (
            INSERT INTO orders (po_number, bill_to, ship_to)
                 VALUES ('ABCD1234', 42, 64)
              RETURNING order_id
        )
        INSERT INTO order_item (order_id, product_id, quantity)
             SELECT new_order.order_id, vals.product_id, vals.quantity
               FROM (VALUES (10, 1), (11, 5), (12, 3)) AS vals(product_id, quantity)
              CROSS JOIN new_order
        ;
Not super pretty, but it illustrates the point. A single statement that creates an order, gets its autogenerated id (bigint, uuid, whatever), and applies that id to the order items that follow. No network round trip necessary to get the order id before you add the items, which translates into a shorter duration for the transaction to remain open.
replies(1): >>41883777 #
32. ttfkam ◴[] No.41883630{3}[source]
> the old central gate keeping DBA team model

I have mixed feelings about this. On the one hand I agree that ownership should be shared. On the other, app developers really don't consider their data structures as carefully in SQL as they do in-memory. It's odd. The right data structure matters more than a good algorithm, since algorithms are easier to change. Once you settle on a list vs a set vs a queue, you're stuck as soon as code is built around it.

The same is doubly true for the database schema. Lack of planning and knowledge of expected access patterns can turn an otherwise fast database to mud in no time flat. Once your data is in there, changing the schema is exponentially harder.

"I’m a huge proponent of designing your code around the data, rather than the other way around, and I think it’s one of the reasons git has been fairly successful… I will, in fact, claim that the difference between a bad programmer and a good one is whether he considers his code or his data structures more important. Bad programmers worry about the code. Good programmers worry about data structures and their relationships." – Linus Torvalds (2006)

What is your database but a bunch of data structures and relationships? I get why the gatekeeping occurred. I don't agree with it, but I understand it. Far too many folks consider expertise in data stores to be optional as developers.

replies(2): >>41887467 #>>41888291 #
33. ttfkam ◴[] No.41883681{6}[source]
> For a UUIDv4 PK

To be fair, this blows out any db that supports clustered indexes as well. Non-k-sortable primary keys are just a bad idea all around.

With UUIDv7, the WAL write amplification problem goes away just as the clustered index issues do.

replies(1): >>41887450 #
34. ttfkam ◴[] No.41883719{5}[source]
> Easy enough: almost no one writes SQL queries by hand these days, not for querying the database nor for doing schema upgrades. It's all done by tools

Your experience does not match mine. In my opinion, tools like ORMs make horrible schemas that cater to the lowest common denominator of SQL engine functionality. This means leaving a lot of performance and scalability on the floor. In order to make the ORMs generate decent schema definitions, you need to know the underlying engine and therefore SQL. At that point, you might as well use SQL.

Ever try changing a column's data type from a table with hundreds of millions of rows with an ORM definition file? Hope you like downtime.

35. evanelias ◴[] No.41883777{7}[source]
Thanks, that makes sense.

In this specific situation, the most common MySQL/MariaDB pattern would be to use LAST_INSERT_ID() in the second INSERT, assuming the order IDs are auto-increments. Or with UUIDs, simply generating the ID prior to the first INSERT, either on the application side or in a database-side session variable.
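
A sketch of that pattern against hypothetical tables matching the example above (MySQL/MariaDB, auto-increment order id):

        INSERT INTO orders (po_number, bill_to, ship_to)
        VALUES ('ABCD1234', 42, 64);

        -- capture the auto-generated id before the next statement touches it
        SET @order_id = LAST_INSERT_ID();

        INSERT INTO order_item (order_id, product_id, quantity)
        VALUES (@order_id, 10, 1),
               (@order_id, 11, 5),
               (@order_id, 12, 3);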

To avoid extra network calls, this could be wrapped in a stored proc, although a fair complaint is that MySQL doesn't support a ton of different programming languages for procs/funcs like Postgres does.

36. evanelias ◴[] No.41883889{5}[source]
> It's nice to not have to fully reset your schema every time you test a migration file locally

In terms of dev flow, this is only a problem with imperative migration systems. Declarative schema management tools solve it by being able to transition any live database state into the desired state, which is expressed by a repo of CREATE statements.

If something fails, you fix the bad CREATE and run the tool again, and it effectively picks up where it left off. And well-designed declarative tools catch many problems pre-flight anyway by running a suite of linters, running the emitted SQL statements in a sandbox first, etc.

If the tool's diff returns clean, you know your schema is in the right state on the DB.

Ironically, lack of transactional DDL actually makes declarative schema management more straightforward in MySQL/MariaDB: you can't mix DDL and DML there anyway, so it's more natural to handle schema changes vs data migrations using different tools/pipelines.

37. ComputerGuru ◴[] No.41885699{4}[source]
You still can’t use uuid as proper foreign keys with validation on MariaDB/MySQL though, right? It wasn’t possible with blobs at any rate.
replies(1): >>41890351 #
38. sgarland ◴[] No.41887450{7}[source]
Agreed. My preferred PK in descending order is: natural key if it makes sense and would speed up queries, integer of an appropriate size, UUIDv7. I only rank UUIDv7 below integers because it’s 16 bytes, while even a BIGINT is only 8 bytes.
replies(1): >>41905388 #
39. sgarland ◴[] No.41887467{4}[source]
100% on all points. I’m a fan of gatekeeping things that are hard to get right, and hard to undo. If you can prove that you know what you’re doing, by all means, own your stuff. But until then, yes, I’d like to watch over your shoulder, and hopefully teach you how to do it right.

The fact that DBs are data structures but are ignored has always irritated me. If I pushed a PR using lists for everything regardless of their practicality, I’d be rightly told to try again. But push a PR with a schema using similar suboptimal choices, and no one blinks an eye.

40. dalyons ◴[] No.41888291{4}[source]
I agree, but also I think I could be clearer about the key advantage of the distributed many-dbs model… it’s that I don’t have to care whether people are good at any of the things you said :). If teams want to do dumb things with their schema, fine, that’s on them and they’ll have to deal with the consequences. If it matters, they’ll learn and get better. If they want to buy their way out of it with their latency and infra budget instead, that’s honestly fine too.

With many smaller databases owned separately, the blast radius of bad db decisions is small/local.

41. samlambert ◴[] No.41890264[source]
this simply can’t be true. there are endless stories of people moving from postgres to mysql because of reliability and operational issues. postgres has only had a mature replication solution for 7 years. mysql had it in 1999.
42. evanelias ◴[] No.41890351{5}[source]
This has always been possible, for example using the BINARY(16) column type if you want to be efficient. Or in MariaDB 10.7+ you can now use the dedicated UUID column type, which is equivalent to BINARY(16) under the hood, but provides a human-readable hex value when queried.

UUIDs are fixed-length. Blobs are not the appropriate type for that.

43. ttfkam ◴[] No.41905388{8}[source]
Contrary to intuition, bigint ends up only 25% smaller on disk than UUIDv7, with no improvement in speed. Honestly in 2024, 32-bit ints are generally a poor fit. Either the table is small enough to be considered a lookup table (16-bit int) or could grow to unknown bounds given enough time (64-bit int). Sequential ids are guessable (which could be a security issue), cannot be generated on the client, and can hinder sharding now that Postgres supports bidirectional replication (multiple writers).

https://ardentperf.com/2024/02/03/uuid-benchmark-war/

My PK preference in descending order: natural key if it makes sense and would speed up queries, 16-bit for unambiguous lookup tables with a hard finite limit of values, UUIDv7, 64-bit sequence, UUIDv4, and then finally 32-bit sequence.

UUIDv7 for performance, reasonable non-guessability, client-generation, and potential for sharding. 64-bit for cases that really need a sequence. UUIDv4 when randomness is required for security. 32-bit when there's a hard finite limit that can guaranteed to be below 2 billion even given the most optimistic of projections.

If the database is small, it doesn't matter. If it's large, it is 99.999% unlikely the primary key will be anything more than a rounding error in storage, access time, or cost. No one ever said, "We would have made payroll if only we had used 32-bit primary keys."

To really save space, extract the timestamp in the UUIDv7 and use an expression index to use as the creation date. Then you're not "wasting" any storage space. Personally I don't think the storage optimization is necessary. Premature optimization being evil and all that.
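
To sketch that last idea (assuming Postgres and a hypothetical events table keyed by UUIDv7, whose first 48 bits are milliseconds since the Unix epoch):

        -- hypothetical table; id is assumed to be generated as a UUIDv7 by the client
        CREATE TABLE events (id uuid PRIMARY KEY, payload jsonb);

        -- decode the embedded timestamp; declared IMMUTABLE so it can back an expression index
        CREATE FUNCTION uuid_v7_to_timestamptz(u uuid) RETURNS timestamptz
        LANGUAGE sql IMMUTABLE AS $$
            SELECT to_timestamp(
                (('x' || substring(replace(u::text, '-', '') from 1 for 12))::bit(48)::bigint)
                / 1000.0
            );
        $$;

        CREATE INDEX events_created_at_idx ON events (uuid_v7_to_timestamptz(id));

        -- queries must use the same expression to hit the index
        SELECT count(*)
          FROM events
         WHERE uuid_v7_to_timestamptz(id) >= now() - interval '1 day';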