Software development topics I've changed my mind on

1. mindwok ◴[06 Feb 25 00:32 UTC] No.42957437

Can someone explain the ORM thing to me? I’ve been a developer for 8 years but never really worked on an app that was really database dependent. ORMs for me have always been convenient, and the performance has been fine. I understand there are obvious tradeoffs I’m making, and in some cases full control is necessary, but I’ve never seen it happen. What level of complexity does an app need to get to before an ORM becomes a nightmare?

replies(2): >>42960117 #>>42960560 #

2. mrkeen ◴[06 Feb 25 07:54 UTC] No.42960117

>>42957437 (TP) #

Personal pet peeves:

* They hide the queries. When your DB or cloud service gives you a printout of your 10 slowest queries, you then have to figure out what object code that relates to. And then is there even a way to fix it, or are you stuck with the ORM?

* LINQ-specific: Love the tech, but it's unclear whether my .Where() calls are being sent upstream properly, or if I'm downloading the whole database and filtering it in memory.

* Another LINQ one: we wanted to do "INSERT IF NOT EXISTS" but could not.

* Back in Java land, magic like that tends to be incompatible with basic hygiene like consting all your class fields. Frameworks like being able to construct a Foo in an invalid state, and then perform a bunch of mutations until it's in a good state.

* They make it near impossible to reason about transaction states. If I call two methods under the same open db context, what side-effects can leak out? If I try to do an UPDATE ... SET x = x + 1, that will always increment correctly in SQL. But if I read x from an ORM object and write back x + 1, that looks like I'm just writing a constant, right?

* Extra magic: if you've read a class from the db, pass it around, and then modify a field in that class, will that perform a db update: now? later? never?
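To make the "INSERT IF NOT EXISTS" bullet concrete, here is a minimal sketch of what that looks like when you write the SQL yourself. It uses Python's built-in sqlite3 rather than LINQ, and the `tags` table and `insert_if_not_exists` helper are made up for illustration. `INSERT OR IGNORE` is SQLite syntax; PostgreSQL spells the same idea `INSERT ... ON CONFLICT DO NOTHING`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The primary key is what defines "already exists" here.
conn.execute("CREATE TABLE tags (name TEXT PRIMARY KEY)")


def insert_if_not_exists(conn, name):
    # A duplicate key is silently skipped instead of raising an error.
    conn.execute("INSERT OR IGNORE INTO tags (name) VALUES (?)", (name,))


insert_if_not_exists(conn, "sql")
insert_if_not_exists(conn, "sql")  # second call is a no-op
insert_if_not_exists(conn, "orm")
conn.commit()

tag_count = conn.execute("SELECT COUNT(*) FROM tags").fetchone()[0]
print(tag_count)
```

The whole operation is a single statement, so there is no window between a "does it exist?" check and the insert.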

But just in general, I want to look at the data, play with queries in a repl environment until they look right, and then use them directly in the code without needing to translate from high-level, declarative SQL down into imperative loops, sets and gets.

replies(2): >>42963101 #>>42965007 #

3. globular-toast ◴[06 Feb 25 09:15 UTC] No.42960560

>>42957437 (TP) #

First we have to make sure we're talking about the same thing. There are "active record" ORMs (like Django, Ruby on Rails, etc.), there are "data mapper" ORMs (Hibernate, SQLAlchemy etc.), then there are things like LINQ which are not ORMs at all but merely SQL generators (but you could build an ORM with it if you want).

The arguments against them, I think, are most strong for active record and much less strong for data mapper.

The problem really is complexity, aka coupling. An active record ORM naturally pervades an entire codebase that uses it. People pass around these "objects" that are really just thinly-veiled database rows. But that's all they are. They are at exactly the same abstraction level as the relational database itself, they just look like objects. But in fact they are filled with footguns because accessing those attributes could trigger database requests.

So you'll see business-level code written that has to "know" about the ORM and "know" about N+1 queries and therefore essentially "know" about SQL and the underlying relationships (or, conversely, data access layers that have to "know" about the business logic, e.g. "I know this logic needs to access this bit so I'll prefetch it"). So you're not really gaining anything. These ORMs are the complete opposite of a good software architecture that gives you flexibility and ability to reason about components in isolation.
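For anyone who hasn't run into N+1 queries, a rough sketch of the pattern follows. It uses plain sqlite3 and a hypothetical `CountingDb` wrapper (not part of any ORM) so the number of queries is visible. The per-author lookup in `titles_lazy` stands in for what a lazy attribute access on an active record object would trigger behind the scenes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE books (
        id INTEGER PRIMARY KEY,
        author_id INTEGER NOT NULL,
        title TEXT NOT NULL
    );
    INSERT INTO authors (id, name) VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO books (author_id, title)
        VALUES (1, 'A1'), (1, 'A2'), (2, 'B1'), (3, 'C1');
""")


class CountingDb:
    """Hypothetical thin wrapper that counts the queries it issues."""

    def __init__(self, conn):
        self.conn = conn
        self.queries = 0

    def query(self, sql, params=()):
        self.queries += 1
        return self.conn.execute(sql, params).fetchall()


def titles_lazy(db):
    # One query for the authors, then one more per author: 1 + N.
    result = {}
    for author_id, name in db.query("SELECT id, name FROM authors ORDER BY id"):
        books = db.query(
            "SELECT title FROM books WHERE author_id = ? ORDER BY id",
            (author_id,),
        )
        result[name] = [title for (title,) in books]
    return result


def titles_joined(db):
    # A single JOIN returns the same data in one round trip.
    result = {}
    rows = db.query(
        "SELECT a.name, b.title FROM authors a "
        "JOIN books b ON b.author_id = a.id ORDER BY a.id, b.id"
    )
    for name, title in rows:
        result.setdefault(name, []).append(title)
    return result


lazy_db, joined_db = CountingDb(conn), CountingDb(conn)
lazy = titles_lazy(lazy_db)
joined = titles_joined(joined_db)
print(lazy_db.queries, joined_db.queries)
```

Both functions return the same dictionary; the difference is only in how many round trips they make, which is exactly the thing the business code ends up having to "know" about.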

A good data mapper ORM at least lets you map data from relational tables to real objects. That way you are able to build a new abstraction layer upon which to write business logic etc. A programmer writing those business rules should be able to fully write and test logic with no knowledge of the ORM at all. But in active record projects you'll find each and every developer has to have the full stack in their heads at all times.
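As a rough illustration of that separation, here is a hand-rolled sketch rather than a real data mapper ORM such as SQLAlchemy or Hibernate. The `Invoice` class, `InvoiceMapper` and the `invoices` table are all invented for the example. The point is only that the domain object holds no database handle, so its rules can be tested without one.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Invoice:
    """Plain, immutable domain object: no database handle, no lazy loading."""

    id: int
    subtotal_cents: int
    tax_rate_percent: int

    def total_cents(self) -> int:
        # Business rule that can be tested without any database.
        return self.subtotal_cents + self.subtotal_cents * self.tax_rate_percent // 100


class InvoiceMapper:
    """The only place that knows about the table layout."""

    def __init__(self, conn):
        self.conn = conn

    def get(self, invoice_id):
        row = self.conn.execute(
            "SELECT id, subtotal_cents, tax_rate_percent "
            "FROM invoices WHERE id = ?",
            (invoice_id,),
        ).fetchone()
        return Invoice(*row) if row else None


# Business logic exercised with no database at all.
in_memory_total = Invoice(1, 1000, 20).total_cents()

# The mapper is only needed at the edge, where rows become objects.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoices ("
    "id INTEGER PRIMARY KEY, subtotal_cents INTEGER, tax_rate_percent INTEGER)"
)
conn.execute("INSERT INTO invoices VALUES (7, 2500, 10)")
loaded = InvoiceMapper(conn).get(7)
print(in_memory_total, loaded.total_cents())
```

Because `Invoice` is frozen, it also cannot be constructed in a half-valid state and mutated into shape later.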

I would be interested to know if there are strong reasons to avoid data mapper ORMs too.

4. wimdetroyer ◴[06 Feb 25 15:11 UTC] No.42963101

>>42960117 #

>When your DB or cloud service gives you a printout of your 10 slowest queries, you then have to figure out what object code that relates to

Not if you have proper telemetry set up... Tooling like Instana was extremely useful for me to diagnose exactly where SQL statements caused issues.

>Extra magic: if you've read a class from the db, pass it around, and then modify a field in that class, will that perform a db update: now? later? never?

For Hibernate, if you understand the concepts of:

- application level repeatable reads

- its dirty checking mechanism

- when the session is flushed / entity lifecycle

That 'magic' isn't magic anymore. But every abstraction is leaky (even SQL)
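A toy sketch of those three ideas, to show roughly how they fit together. This is not Hibernate's implementation or API: `ToySession`, its dict-based "entities" and the `users` table are all invented here, and a real session tracks far more state.

```python
import sqlite3


class ToySession:
    """Toy unit of work; only the general idea, not Hibernate itself."""

    def __init__(self, conn):
        self.conn = conn
        self.identity_map = {}  # id -> live object (repeatable reads)
        self.snapshots = {}     # id -> field values as loaded

    def get(self, user_id):
        if user_id in self.identity_map:
            # Same object on every read within the session, no new query.
            return self.identity_map[user_id]
        row = self.conn.execute(
            "SELECT id, name FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        user = {"id": row[0], "name": row[1]}
        self.identity_map[user_id] = user
        self.snapshots[user_id] = dict(user)
        return user

    def flush(self):
        # Dirty checking: compare each live object with its snapshot
        # and write back only the ones that changed.
        updated = 0
        for user_id, user in self.identity_map.items():
            if user != self.snapshots[user_id]:
                self.conn.execute(
                    "UPDATE users SET name = ? WHERE id = ?",
                    (user["name"], user_id),
                )
                self.snapshots[user_id] = dict(user)
                updated += 1
        return updated


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'ann')")


def name_in_db():
    return conn.execute("SELECT name FROM users WHERE id = 1").fetchone()[0]


session = ToySession(conn)
user = session.get(1)
same_object = session.get(1) is user  # identity map at work

user["name"] = "Ann"           # modifies memory only
before_flush = name_in_db()    # database still has the old value
flushed = session.flush()      # the UPDATE is issued here
after_flush = name_in_db()
print(before_flush, flushed, after_flush)
```

In this sketch the answer to "now? later? never?" is "at flush time, and only if the object differs from its snapshot".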

5. nodamage ◴[06 Feb 25 18:21 UTC] No.42965007

>>42960117 #

> If I try to do an UPDATE ... SET x = x + 1, that will always increment correctly in SQL. But if I read x from an ORM object and write back x + 1, that looks like I'm just writing a constant, right?

This is not specific to ORMs... you can run into the same problem without one.
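A small sketch of the same lost update with no ORM involved, using only Python's built-in sqlite3. The table `t` and the helper names are made up, and the two "workers" are simulated by interleaving reads and writes on one connection rather than by real concurrency.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, x INTEGER NOT NULL)")
conn.execute("INSERT INTO t (id, x) VALUES (1, 0)")


def read_x(conn):
    return conn.execute("SELECT x FROM t WHERE id = 1").fetchone()[0]


def write_x(conn, value):
    # Writes whatever constant the caller computed in application code.
    conn.execute("UPDATE t SET x = ? WHERE id = 1", (value,))


# Read-modify-write: both workers read before either one writes.
seen_by_a = read_x(conn)
seen_by_b = read_x(conn)
write_x(conn, seen_by_a + 1)
write_x(conn, seen_by_b + 1)  # overwrites A's increment
lost_update_result = read_x(conn)

# Reset, then let the database do the arithmetic instead.
write_x(conn, 0)
for _ in range(2):
    conn.execute("UPDATE t SET x = x + 1 WHERE id = 1")
atomic_result = read_x(conn)

print(lost_update_result, atomic_result)
```

Two increments leave x at 1 in the first half and at 2 in the second, so the hazard comes from computing the new value outside the database, whether or not an ORM is doing it.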

> Extra magic: if you've read a class from the db, pass it around, and then modify a field in that class, will that perform a db update: now? later? never?

In every ORM I've used you have specific control over when this happens.