Postgres LISTEN/NOTIFY does not scale

(www.recall.ai)

570 points davidgu | 2 comments | 07 Jul 25 14:05 UTC | HN request time: 0.001s | source

Show context

osigurdson ◴[11 Jul 25 02:13 UTC] No.44527817[source]▶

I like this article. Lots of comments are stating that they are "using it wrong" and I'm sure they are. However, it does help to contrast the much more common, "use Postgres for everything" type sentiment. It is pretty hard to use Postgres wrong for relational things in the sense that everyone knows about indexes and so on. But using something like L/N comes with a separate learning curve anyway - evidenced in this case by someone having to read comments in the Postgres source code itself. Then if it turns out that it cannot work for your situation it may be very hard to back away from as you may have tightly integrated it with your normal Postgres stuff.

I've landed on Postgres/ClickHouse/NATS since together they handle nearly any conceivable workload managing relational, columnar, messaging/streaming very well. It is also not painful at all to use as it is lightweight and fast/easy to spin up in a simple docker compose. Postgres is of course the core and you don't always need all three but compliment each other very well imo. This has been my "go to" for a while.

replies(12): >>44528211 #>>44528216 #>>44529511 #>>44529632 #>>44529640 #>>44529854 #>>44530773 #>>44531235 #>>44531722 #>>44532418 #>>44532993 #>>44534858 #

fathomdeez ◴[11 Jul 25 03:47 UTC] No.44528216[source]▶

>>44527817 #

This kind of issue always comes up when people put business logic inside the database. Databases are for data. The data goes in and the data goes out, but the data does not get to decide what happens next based on itself. That's what application code is for.

replies(12): >>44528249 #>>44528293 #>>44528307 #>>44528582 #>>44528918 #>>44529077 #>>44529583 #>>44530054 #>>44530782 #>>44530978 #>>44532428 #>>44533144 #

bevr1337 ◴[11 Jul 25 03:52 UTC] No.44528249[source]▶

>>44528216 #

> the data does not get to decide what happens next based on itself.

Then why bother with a relational database? Relations and schemas are business logic, and I'll take all the data integrity I can get.

replies(2): >>44528275 #>>44529035 #

Jailbird ◴[11 Jul 25 03:59 UTC] No.44528275[source]▶

>>44528249 #

I've seen both of these philosophies. I liken them to religions, the believers are devout. Code is King vs the DB is King.

I'm personally Code is King, and I have my reasons (like everyone else)

replies(6): >>44528834 #>>44530423 #>>44531011 #>>44531578 #>>44532486 #>>44539388 #

IgorPartola ◴[11 Jul 25 06:05 UTC] No.44528834[source]▶

>>44528275 #

I am mostly on the side of business logic should live in applications and relationships between data types are not business logic so much as just the layout of the data. But I typically access data via an ORM and they typically don’t have support for triggers and stored procedures. If they did, I would certainly use it because projects I work on might have multiple people writing application code but everyone uses a single set of database models. This would mean that critical constraints on the shape of the data could be defined and respected at all times vs some developer on my team forgetting to include some critical check in their data update routine.

replies(2): >>44532517 #>>44539523 #

sgarland ◴[11 Jul 25 14:25 UTC] No.44532517[source]▶

>>44528834 #

Every ORM I’m aware of allows you to drop down to raw SQL. Write your stored procedure, store it in VCS, add it as a migration, and then call it. If you want to make it friendlier, wrap the call in a function in your language so you can add helpers, better error handling, etc.

replies(1): >>44537448 #

1. IgorPartola ◴[11 Jul 25 22:33 UTC] No.44537448[source]▶

>>44532517 #

What I would prefer is integration at the model definition level. For example let’s say that I have a Customer model and an Order model. I don’t always want to pull in the customer fields when listing orders. Most ORMs would allow me to create a join and specify the field from Customer I want when fetching Orders but those joins add up quickly. I could denormalize the data and put things like the customer name and email onto each order but if the customer changes either value now the application code has to remember to update it. And yes I could put that in the model’s save() method but that is fragile too because what if someone else does run code that updates stuff at the raw SQL level and doesn’t include these updates.

Now if I could specify that I want Order.customer_name to come from a specific other model and be updated automatically the ORM could automatically create a trigger to update that field when the customer table is updated.

Obviously this is a very simplistic example but there are many more, including versioning and soft deletes that could be incredibly useful. But the key is that the ORM has to generate the code for the triggers and stored procedures. Doing that manually is possible now but (a) uses a different language even than regular SQL which not everyone is familiar with, and (b) there is no type checking for what you are doing. The ORM model definitions are the main source of truth about the shape of your database, so I want to use them as such.

replies(1): >>44539577 #

2. osigurdson ◴[12 Jul 25 05:32 UTC] No.44539577[source]▶

>>44537448 (TP) #

>> I don’t always want to pull in the customer fields when listing orders

Data locality is king. Everything comes down to physical things such as blocks on the SSD, network interconnect, RAM, L3, L2, L1 cache and registers. Are those customer fields in the same page as whatever else you need? If so, most of the work is already done. Yes, you can save some network bandwidth transferring things that aren't needed but does it matter? It might but it might not. The key is to know what matters and reason about things from the perspective of the machines actually doing the work.

↑