570 points by davidgu | 15 comments
osigurdson ◴[] No.44527817[source]
I like this article. Lots of comments are stating that they are "using it wrong" and I'm sure they are. However, it does help to contrast the much more common "use Postgres for everything" sentiment. It is pretty hard to use Postgres wrong for relational things, in the sense that everyone knows about indexes and so on. But using something like L/N (LISTEN/NOTIFY) comes with its own learning curve - evidenced in this case by someone having to read comments in the Postgres source code itself. Then, if it turns out that it cannot work for your situation, it may be very hard to back away from, as you may have tightly integrated it with your normal Postgres stuff.
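
For anyone who hasn't touched it, a minimal sketch of the listener side of LISTEN/NOTIFY might look like the following - Python with psycopg2; the channel name and DSN are made up:

    import select
    import psycopg2
    import psycopg2.extensions

    conn = psycopg2.connect("dbname=app")  # hypothetical DSN
    conn.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)

    with conn.cursor() as cur:
        cur.execute("LISTEN events;")  # 'events' is an arbitrary channel name

    while True:
        # Wait for the socket to become readable, then drain notifications.
        if select.select([conn], [], [], 5) == ([], [], []):
            continue
        conn.poll()
        while conn.notifies:
            note = conn.notifies.pop(0)
            print(note.channel, note.payload)

The sending side is just a NOTIFY (or pg_notify()) issued inside a transaction, which is where the pitfalls the article describes tend to show up.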

I've landed on Postgres/ClickHouse/NATS since together they handle nearly any conceivable workload, managing relational, columnar, and messaging/streaming very well. The stack is also not painful at all to use, as it is lightweight and fast/easy to spin up in a simple docker compose. Postgres is of course the core and you don't always need all three, but they complement each other very well imo. This has been my "go to" for a while.

replies(12): >>44528211 #>>44528216 #>>44529511 #>>44529632 #>>44529640 #>>44529854 #>>44530773 #>>44531235 #>>44531722 #>>44532418 #>>44532993 #>>44534858 #
fathomdeez ◴[] No.44528216[source]
This kind of issue always comes up when people put business logic inside the database. Databases are for data. The data goes in and the data goes out, but the data does not get to decide what happens next based on itself. That's what application code is for.
replies(12): >>44528249 #>>44528293 #>>44528307 #>>44528582 #>>44528918 #>>44529077 #>>44529583 #>>44530054 #>>44530782 #>>44530978 #>>44532428 #>>44533144 #
bevr1337 ◴[] No.44528249[source]
> the data does not get to decide what happens next based on itself.

Then why bother with a relational database? Relations and schemas are business logic, and I'll take all the data integrity I can get.

replies(2): >>44528275 #>>44529035 #
1. Jailbird ◴[] No.44528275[source]
I've seen both of these philosophies. I liken them to religions; the believers are devout. Code is King vs. the DB is King.

I'm personally Code is King, and I have my reasons (like everyone else).

replies(6): >>44528834 #>>44530423 #>>44531011 #>>44531578 #>>44532486 #>>44539388 #
2. IgorPartola ◴[] No.44528834[source]
I am mostly on the side that business logic should live in applications, and that relationships between data types are not business logic so much as just the layout of the data. But I typically access data via an ORM, and ORMs typically don’t have support for triggers and stored procedures. If they did, I would certainly use it, because projects I work on might have multiple people writing application code but everyone uses a single set of database models. This would mean that critical constraints on the shape of the data could be defined and respected at all times, vs. some developer on my team forgetting to include some critical check in their data update routine.
replies(2): >>44532517 #>>44539523 #
3. ako ◴[] No.44530423[source]
It’s really not about whether code is better or the database is better; it’s mostly about locality: if you want to update thousands of records, you can’t pull those records into a separate process, update them there, and then write back. So you put your code next to the data, in the database. Stored procedures are just code deployed to a database container…
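
To make the locality point concrete, a rough sketch of the two approaches in Python/psycopg2 (table, column, and DSN names are all made up):

    from decimal import Decimal

    import psycopg2

    def update_in_application(conn):
        # Pull the rows out, compute in the app, write each one back:
        # every row crosses the network twice.
        with conn, conn.cursor() as cur:
            cur.execute("SELECT id, price FROM products WHERE category = %s",
                        ("books",))
            for prod_id, price in cur.fetchall():
                cur.execute("UPDATE products SET price = %s WHERE id = %s",
                            (price * Decimal("1.10"), prod_id))

    def update_next_to_the_data(conn):
        # One set-based statement; no rows leave the server.
        with conn, conn.cursor() as cur:
            cur.execute("UPDATE products SET price = price * 1.10 "
                        "WHERE category = %s", ("books",))

    conn = psycopg2.connect("dbname=shop")  # hypothetical DSN

A stored procedure is the same idea taken one step further: the whole routine, not just a single statement, runs next to the data.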
replies(1): >>44531243 #
4. whstl ◴[] No.44531011[source]
And both of those philosophies will lead to bad engineering.

There are things that work better, are safer and simpler to do on the database, and things that work better, are safer and simpler in code. And those things might change depending on context, technology, requirements, size of project, experience of contributors, etc.

Forcing round pegs into square holes will always lead to brittle code and brittle products, often for more cost (mental and financial!) than actually using each tool correctly.

5. 0xFEE1DEAD ◴[] No.44531243[source]
Sure you can; I've done it plenty of times. I'm genuinely curious why you think it's not possible.

The only reasons I can think of:

- you're rewriting a legacy system and migrating parts incrementally

- data compliance

- you're running a dangerous database setup

I try my best to avoid putting any business logic inside databases and see stored procedures only as a temporary solution.

replies(2): >>44532398 #>>44539437 #
6. dotancohen ◴[] No.44531578[source]
I believe that both code and data are kings, under different realms. Code is king of the "what we're doing today" realm. Data is king of the "what's possible tomorrow" realm.

Both have their place in business.

7. bevr1337 ◴[] No.44532398{3}[source]
Although I'm partial to SPROCs, I do not deploy them because I understand my colleagues might throw me from a window. But even without going full tilt DB-as-the-application:

The DB can make much stronger guarantees about transactions and updates the closer that logic happens to the data. In the world of cloud computing, this can be a cost savings for ingress/egress too.

replies(1): >>44539483 #
8. sgarland ◴[] No.44532486[source]
Every company I’ve been at that relied on application code to handle referential integrity had orphaned rows, and incidents related to data errors or the absurd pipelines they had built to recreate what FK constraints and triggers already do.

RDBMS are extremely well-tested pieces of software that do their job incredibly well. To think that you could do better, or even equally as well, is hubris. If you want to trade those guarantees for “velocity” go right ahead, but you also need to take into account the inevitable incidents and recoveries that will occur.
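
As a concrete illustration (table names invented), a declaration like the one below is all it takes for the database to reject orphaned rows and negative totals on every write path, no matter which service or ad hoc script is doing the writing:

    import psycopg2

    DDL = """
    CREATE TABLE orders (
        id          bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
        customer_id bigint NOT NULL REFERENCES customers (id) ON DELETE RESTRICT,
        total       numeric(12, 2) NOT NULL CHECK (total >= 0)
    );
    """

    # hypothetical DSN; the context manager commits the transaction on exit
    with psycopg2.connect("dbname=shop") as conn, conn.cursor() as cur:
        cur.execute(DDL)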

9. sgarland ◴[] No.44532517[source]
Every ORM I’m aware of allows you to drop down to raw SQL. Write your stored procedure, store it in VCS, add it as a migration, and then call it. If you want to make it friendlier, wrap the call in a function in your language so you can add helpers, better error handling, etc.
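
A rough sketch of that workflow with Django (the stored function, app, and migration names here are all hypothetical):

    # migrations/0042_refresh_order_total.py
    from django.db import migrations

    CREATE_FUNCTION = """
    CREATE OR REPLACE FUNCTION refresh_order_total(p_order_id bigint)
    RETURNS void LANGUAGE sql AS $$
        UPDATE orders
        SET total = (SELECT COALESCE(SUM(line_total), 0)
                     FROM order_items oi
                     WHERE oi.order_id = orders.id)
        WHERE orders.id = p_order_id;
    $$;
    """

    class Migration(migrations.Migration):
        dependencies = [("shop", "0041_previous")]
        operations = [
            migrations.RunSQL(
                CREATE_FUNCTION,
                reverse_sql="DROP FUNCTION IF EXISTS refresh_order_total(bigint);",
            ),
        ]

    # elsewhere in application code: the friendlier wrapper
    from django.db import connection

    def refresh_order_total(order_id: int) -> None:
        with connection.cursor() as cur:
            cur.execute("SELECT refresh_order_total(%s)", [order_id])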
replies(1): >>44537448 #
10. IgorPartola ◴[] No.44537448{3}[source]
What I would prefer is integration at the model definition level. For example, let’s say that I have a Customer model and an Order model. I don’t always want to pull in the customer fields when listing orders. Most ORMs would allow me to create a join and specify the fields from Customer I want when fetching Orders, but those joins add up quickly. I could denormalize the data and put things like the customer name and email onto each order, but if the customer changes either value, the application code now has to remember to update it. And yes, I could put that in the model’s save() method, but that is fragile too, because what if someone else runs code that updates stuff at the raw SQL level and doesn’t include these updates?

Now if I could specify that I want Order.customer_name to come from a specific other model and be updated automatically, the ORM could automatically create a trigger to update that field when the customer table is updated.

Obviously this is a very simplistic example, but there are many more, including versioning and soft deletes, that could be incredibly useful. The key is that the ORM has to generate the code for the triggers and stored procedures. Doing that manually is possible now, but (a) it uses a different language than even regular SQL, which not everyone is familiar with, and (b) there is no type checking for what you are doing. The ORM model definitions are the main source of truth about the shape of your database, so I want to use them as such.
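
For what it's worth, the SQL such an ORM feature would have to emit is not that large. A hedged sketch (Postgres syntax, every table and column name invented) of the trigger it might generate from the model definition, which today you would have to ship yourself as a raw-SQL migration:

    # Hypothetical trigger an ORM could generate to keep the denormalized
    # customer fields on orders in sync with the customers table.
    SYNC_CUSTOMER_FIELDS = """
    CREATE OR REPLACE FUNCTION sync_order_customer_fields() RETURNS trigger
    LANGUAGE plpgsql AS $$
    BEGIN
        UPDATE orders
        SET customer_name  = NEW.name,
            customer_email = NEW.email
        WHERE customer_id = NEW.id;
        RETURN NEW;
    END;
    $$;

    CREATE TRIGGER order_customer_fields_sync
    AFTER UPDATE OF name, email ON customers
    FOR EACH ROW EXECUTE FUNCTION sync_order_customer_fields();
    """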

replies(1): >>44539577 #
11. osigurdson ◴[] No.44539388[source]
Maybe not the DB, but getting data from wherever it may be to the registers in the computer certainly is the King of Kings.

12. osigurdson ◴[] No.44539437{3}[source]
It's possible but of course slow, because of https://gist.github.com/jboner/2841832. Data locality matters a lot. Moving data around the network when it doesn't really need to be moved is heresy (unless performance doesn't matter, then who cares). Remember, the computer doesn't care about your religion which says only this can do this and only that can do that.

13. osigurdson ◴[] No.44539483{4}[source]
>> deploy them because I understand my colleagues might throw me from a window

Maybe throw your colleagues out the window instead if they don't know what they are talking about. I'm not anti/pro SPROC at all, but I am anti anti-reality. People who don't understand the vast differences in latency between in-process and out-of-process work should not exist in the industry.

14. osigurdson ◴[] No.44539523[source]
>> I am mostly on the side of...

Generally customers don't care about religious views. Make understanding the actual machine and associated latencies your religion instead. The reason to write a stored proc or do some processing in the database is entirely about data locality, not to keep the drooling masses from messing things up. A library is fine for that.

15. osigurdson ◴[] No.44539577{4}[source]
>> I don’t always want to pull in the customer fields when listing orders

Data locality is king. Everything comes down to physical things such as blocks on the SSD, the network interconnect, RAM, the L3/L2/L1 caches, and registers. Are those customer fields in the same page as whatever else you need? If so, most of the work is already done. Yes, you can save some network bandwidth by not transferring things that aren't needed, but does it matter? It might, but it might not. The key is to know what matters and to reason about things from the perspective of the machines actually doing the work.