
628 points kiyanwang | 10 comments
bob1029 ◴[] No.43630646[source]
Not guessing is perhaps the most important thing to the business.

I developed a lot of my problem solving skills in semiconductor manufacturing where the cost of a bad assumption tends to be astronomical. You need to be able to determine exactly what the root cause is 100% of the time or everything goes to hell really fast. If there isn't a way to figure out the root cause, you now have 2 tickets to resolve.

I'll throw an entire contraption away the moment I determine it has accumulated some opacity that antagonizes root cause analysis. This is why I aggressively avoid use of non-vanilla technology stacks. You can certainly chase the rabbit over the fence into the 3rd party's GitHub repo, but I find the experience gets quite psychedelic as you transition between wildly varying project styles, motivations and scopes.

Being deeply correct nearly all of the time is probably the fastest way to build a reputation. The curve can be exponential over time with the range being the value of the problem you are entrusted with.

replies(5): >>43631055 #>>43631842 #>>43632734 #>>43637040 #>>43638701 #
Taek ◴[] No.43631055[source]
I always get a lot of pushback for avoiding frameworks and libraries, and rolling most things by hand.

But, most frameworks and libraries aren't built to be audit-grade robust, don't have enterprise level compatibility promises, can't guarantee that there won't be surprise performance impacts for arbitrary use cases, etc.

Sometimes, a third party library (like SQLite) makes the cut. But frameworks and libraries that reach the bar of "this will give me fewer complications than avoiding the dependency" are few and far between.

replies(8): >>43631189 #>>43631275 #>>43631326 #>>43632119 #>>43632384 #>>43635012 #>>43635674 #>>43644940 #
pc86 ◴[] No.43632119[source]
This is smart if you work for a company that actually needs this level of robustness. The problem is that most don't, and a lot of people who work for these companies wish they were working somewhere "better"/"more important," so they pretend they actually do need this level of performance.

The guy like you on a mission critical team at a cutting edge company is a godsend and will be a big part of why the project/company succeeds. The guy who wants to build his own ORM for his no-name company's CRUD app is wasting everyone's time.

replies(4): >>43632378 #>>43632716 #>>43632815 #>>43658889 #
9rx ◴[] No.43632815[source]
> The guy who wants to build his own ORM for his no-name company's CRUD app is wasting everyone's time.

I once unfortunately joined a project where an off-the-shelf ORM had been selected, but once development was well underway, the deep edge cases started to reveal serious design flaws in the ORM library. A guy wanting (perhaps not in a joyful sense, but more not seeing any other choice) to build his own ORM that was mostly API-compatible was what saved the project.

This was a long time ago. The state of ORM libraries is probably a lot better today. But the advice of ensuring that a library is SQLite-grade before committing to it does ring true even for simple CRUD ORMs. Perhaps especially so.

replies(7): >>43633292 #>>43634520 #>>43635035 #>>43637001 #>>43638665 #>>43638768 #>>43658894 #
1. awkward ◴[] No.43635035[source]
Not to get too off topic, but ORMs are bags of design flaws. They are definitional technical debt, the kind that gets you to a proof of concept but needs to be reworked once you get to your target scale.

There are a large number of fundamental impedance mismatches between relational data and object based data. Any ORM can fix some of them at the cost of ignoring others, but the fundamental character of ORMs is such that taking an opinionated line on tough tradeoffs is as good as you can hope for.

This is why ORM guy is wasting everyone's time - he is almost definitely not going to have a unique or even valuable perspective on all of those tradeoffs.

replies(2): >>43635196 #>>43637739 #
2. 9rx ◴[] No.43635196[source]
In realistic practice there is no escaping it, though. Even if you maintain relations throughout the majority of your application, you are almost certainly still going to need to call some kind of third-party API or networked service that requires mapping between relations and objects. Especially if networked services are involved as they nearly always eschew relations in favour of objects to avoid the typical n+1 problems.

Whether your application should map objects and relations at all isn't usually a question you get to ask unless it doesn't do much or lives on its own private island. Whether you should do it yourself or lean on a toolkit to help is the question that you have to contend with.

replies(2): >>43635529 #>>43635555 #
3. awkward ◴[] No.43635529[source]
Oh for sure, they fill a huge and consistent gap between most databases and most programming languages. The issue is that they are fundamentally a compromise. That means it's not damning to hear that issues crept in with scale and complexity. It's also rarely practical to take on the project of making a slightly different set of compromises from first principles.
replies(1): >>43635731 #
4. seadan83 ◴[] No.43635555[source]
A DB query without ORM is effectively a service. This hides relations in the DB layer, rendering moot the need to model these relations in object oriented code. Thus, eschewing the ORM completely moots the question of whether to map objects and relations. I'd suggest if you are ever asking that question, you are already screwed.
replies(1): >>43635852 #
5. 9rx ◴[] No.43635731{3}[source]
> The issue is that they are fundamentally a compromise.

Which is no doubt why most newer applications I see these days have trended towards carrying relations as far as they can go, only mapping with objects at the points where it is absolutely necessary.

> It's also rarely practical to take on the project of making a slightly different set of compromises

I suppose that is the other benefit of delaying mapping until necessary. What needs to be mapped will be more limited in scope and can be identified as such. You don't have to build a huge framework that can handle all conceivable cases. You can reduce it to only what you need, which is usually not going to be much, and can determine what tradeoffs best suit in that. In this type of situation it is likely that using an ORM library is going to be a bigger waste of time, honestly.
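
To make that concrete, here is a minimal Java sketch (Java 16+ for records) of mapping only at the boundary. Everything in it is made up for illustration: `MovieActorRow` stands in for a tuple from some join, `MoviePayload` for the nested shape a networked service might want, and `toPayloads` is the entire hand-written "ORM".

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class BoundaryMapping {
    // Hypothetical tuple from a movie/actor join; the application works with these directly.
    record MovieActorRow(long movieId, String title, String actorName) {}

    // Hypothetical nested shape wanted by an external API; built only at the boundary.
    record MoviePayload(long movieId, String title, List<String> actors) {}

    // Folds flat join rows into one payload per movie, preserving first-seen order.
    static List<MoviePayload> toPayloads(List<MovieActorRow> rows) {
        Map<Long, MoviePayload> byId = new LinkedHashMap<>();
        for (MovieActorRow row : rows) {
            MoviePayload payload = byId.computeIfAbsent(
                row.movieId(),
                id -> new MoviePayload(id, row.title(), new ArrayList<>()));
            payload.actors().add(row.actorName());
        }
        return new ArrayList<>(byId.values());
    }
}
```

Calling `BoundaryMapping.toPayloads(rows)` right before the outbound request is the only place objects appear; the tradeoffs (ordering, duplicate handling, mutability of the actor list) are decided for this one case rather than for all conceivable ones.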

6. 9rx ◴[] No.43635852{3}[source]
Querying and ORM are very different concepts. Object-relation mapping is concerned with, as it literally asserts, mapping between relations (or, more likely in practice, tables – but they are similar enough for the sake of this discussion) and objects. Maybe you are confusing ORM with the active record pattern (popularized by ActiveRecord, the library) which combines query building and ORM into some kind of unified concept? ActiveRecord, the library, confusingly called itself ORM when it was released which may be the source of that.
replies(2): >>43638417 #>>43640138 #
7. bob1029 ◴[] No.43637739[source]
> There are a large number of fundamental impedance mismatches between relational data and object based data

My experience tells me that the largest among these impedance mismatches is the inability for OOP languages to express circular dependencies without resorting to messy hackarounds. Developers often fail to realize how far they are into the dragon's den until they need to start serializing their object graphs.

https://github.com/dotnet/runtime/issues/29900
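
A small Java sketch of the serialization trap, under loud assumptions: the `Director` and `Movie` classes are invented, the `$id`/`$ref` keys are just a convention picked for this example, and strings are not escaped. A naive recursive writer over this graph would bounce between director and movie until the stack overflows, so the sketch tracks visited objects by identity and emits a reference instead of descending again.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

final class CycleSketch {
    static final class Director {
        final String name;
        final List<Movie> movies = new ArrayList<>();
        Director(String name) { this.name = name; }
    }

    static final class Movie {
        final String title;
        final Director director; // back-reference: this closes the cycle
        Movie(String title, Director director) {
            this.title = title;
            this.director = director;
            director.movies.add(this);
        }
    }

    // 'seen' gives every visited object an id, keyed by identity rather than equals().
    static String write(Director d, Map<Object, Integer> seen) {
        Integer id = seen.get(d);
        if (id != null) return "{\"$ref\":" + id + "}"; // already written: emit a reference
        seen.put(d, seen.size() + 1);
        StringBuilder sb = new StringBuilder(
            "{\"$id\":" + seen.get(d) + ",\"name\":\"" + d.name + "\",\"movies\":[");
        for (int i = 0; i < d.movies.size(); i++) {
            if (i > 0) sb.append(',');
            sb.append(write(d.movies.get(i), seen));
        }
        return sb.append("]}").toString();
    }

    static String write(Movie m, Map<Object, Integer> seen) {
        Integer id = seen.get(m);
        if (id != null) return "{\"$ref\":" + id + "}";
        seen.put(m, seen.size() + 1);
        return "{\"$id\":" + seen.get(m) + ",\"title\":\"" + m.title
            + "\",\"director\":" + write(m.director, seen) + "}";
    }

    static String serialize(Director d) {
        return write(d, new IdentityHashMap<>());
    }
}
```

The price of the workaround is visible in the output format: whoever reads it back has to understand the reference convention, which is exactly the kind of hackaround a relational model never needs, because a foreign key can point in either direction.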

8. skydhash ◴[] No.43638417{4}[source]
Was confused by that too (ORM and Active Records), but I spent some time learning about DDD, which led me into enterprise architecture, and that's when I learned all the design patterns for interacting with data. Most web frameworks only have Query Builder and Active Records.
9. seadan83 ◴[] No.43640138{4}[source]
First, for definitions, I'd suggest we use wikipedia for ORM [1] and also Active Record Pattern [2].

I believe Active Record is a more specific implementation of something that is ORM-like. We can stop speaking of Active Record since my point holds for the more generic ORM, and therefore holds for Active Record as well.

To clarify my point, there is a fundamental impedance mismatch between object mapping of data vs relational database mapping of data. One implication of this is you cannot use database as a service. Interactions with database must instead be gated behind the ORM and the ORM controls the database interaction.

I'll note that database as a service is very powerful. For example, when there is an API contract exposing a value that is powered by some raw-dog SQL, when the database changes, anything using the API does not need to change. Only the SQL changes. In contrast, when an ORM exposes an object, an attribute might sometimes be loaded, sometimes not. A change to load or not load that attribute ripples through everything that uses that object. That type of change in ORM-land is the stuff of either N+1 problems, or Null-Pointers.

To back up a bit, let me re-iterate a bit about the impedance mismatch. Wikipedia speaks of this [1]: "By contrast, relational databases, such as SQL, group scalars into tuples, which are then enumerated in tables. Tuples and objects have some general similarity... They have many differences, though"

To drive the point home - in other words, you can't do everything in object world that you can do in a database 1:1. A consequence of this is that the ORM requires the application to view the database as a persistence store (AKA: data-store, AKA: object store, AKA: persistence layer). The ORM controls the interaction with database, you can't just use database as a data service.

I believe this point is illustrated most easily from queries.

To illustrate, let's pull some query code [3] from Java's Hibernate, a prototypical ORM.

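```
public Movie getMovie(Long movieId) {
    EntityManager em = getEntityManager();
    // Look up the entity by primary key; movieId is already a Long
    Movie movie = em.find(Movie.class, movieId);
    // Detach so later changes to the object are no longer tracked
    em.detach(movie);
    return movie;
}
```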

So, getting a release year might look like this:

```
Long movieId = 123L;
Movie m = orm.getMovie(movieId);
return m.getReleaseYear();
```

In contrast, if we put some raw-dogged SQL behind a method, we get this code:

```
int movieId = 123;
return movieDao.getMovieReleaseYearByMovieId(movieId);
```

Now, let us look at the example of finding the release year of the highest grossing movie. As a service, that looks like this:

```
return dao.findReleaseYearOfHighestGrossingMovie();
```

In contrast, as an ORM, you might have to load all Movies and then iterate. Maybe the ORM might have some magic sugar to get a 'min/max' value though. We can go on though, let's say we want to get the directors of the top 10 grossing movies. An ORM will almost certainly require you to load all movies and then iterate, or start creating some objects specifically to represent that data. In all cases, an ORM presents the contract as an object rather than as an API call (AKA, a service).
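
A rough Java sketch of what the service side could look like. Assumptions, stated loudly: the `MovieDao` interface, the `movies` table and its `director`, `release_year` and `gross` columns are all hypothetical, and the `LIMIT` syntax varies by database. The JDBC calls themselves are plain `java.sql`. The in-memory class is only a stand-in so that callers can be exercised without a database.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.OptionalInt;

// Hypothetical service contract: callers never see a Movie object or the schema.
interface MovieDao {
    OptionalInt findReleaseYearOfHighestGrossingMovie() throws SQLException;
    List<String> findDirectorsOfTopGrossingMovies(int limit) throws SQLException;
}

// Plain-SQL implementation; table and column names are assumptions for illustration.
final class JdbcMovieDao implements MovieDao {
    private final Connection connection;

    JdbcMovieDao(Connection connection) { this.connection = connection; }

    @Override
    public OptionalInt findReleaseYearOfHighestGrossingMovie() throws SQLException {
        String sql = "SELECT release_year FROM movies ORDER BY gross DESC LIMIT 1";
        try (PreparedStatement ps = connection.prepareStatement(sql);
             ResultSet rs = ps.executeQuery()) {
            return rs.next() ? OptionalInt.of(rs.getInt(1)) : OptionalInt.empty();
        }
    }

    @Override
    public List<String> findDirectorsOfTopGrossingMovies(int limit) throws SQLException {
        String sql = "SELECT director FROM movies ORDER BY gross DESC LIMIT ?";
        try (PreparedStatement ps = connection.prepareStatement(sql)) {
            ps.setInt(1, limit);
            try (ResultSet rs = ps.executeQuery()) {
                List<String> directors = new ArrayList<>();
                while (rs.next()) directors.add(rs.getString(1));
                return directors;
            }
        }
    }
}

// In-memory stand-in with the same contract, so callers can run without a database.
final class InMemoryMovieDao implements MovieDao {
    record Row(String director, int releaseYear, long gross) {}

    private final List<Row> rows;

    InMemoryMovieDao(List<Row> rows) { this.rows = rows; }

    @Override
    public OptionalInt findReleaseYearOfHighestGrossingMovie() {
        return rows.stream()
            .max(Comparator.comparingLong(Row::gross))
            .map(r -> OptionalInt.of(r.releaseYear()))
            .orElse(OptionalInt.empty());
    }

    @Override
    public List<String> findDirectorsOfTopGrossingMovies(int limit) {
        return rows.stream()
            .sorted(Comparator.comparingLong(Row::gross).reversed())
            .limit(limit)
            .map(Row::director)
            .toList();
    }
}
```

If the schema changes (say, gross moves to another table), only the SQL strings inside `JdbcMovieDao` change; anything calling `findDirectorsOfTopGrossingMovies(10)` is untouched.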

For the update case, ORMs often do pretty well. ORMs can get into trouble with the impedance mismatch when doing things like trying to update joined entities. For example, "update all actors in movie X". Further, ORM (and objects) creates issues of stale/warm caches, nullity, mutability, performance, and more... What is worse, all of this is intrinsic: relational data and objects are inherently different.

[1] https://en.wikipedia.org/wiki/Object%E2%80%93relational_mapp...

[2] https://en.wikipedia.org/wiki/Active_record_pattern

[3] https://www.baeldung.com/hibernate-entitymanager

replies(1): >>43641076 #
10. 9rx ◴[] No.43641076{5}[source]
> To illustrate, let's pull some query code [3] from Java's Hibernate, a prototypical ORM.

ORM and entity manager – which, in turn, is a query builder combined with a few other features. Your code is really focused on the latter. While the entity manager approach is not the same as active record, that is true, the bounds between query building and ORM, I think, are even clearer. In fact, your code makes that separation quite explicit. I can at least understand how ORM and query building get confused under active record.

> We can stop speaking of Active Record

While I agree in theory, since we are talking about ORM only, if we go by Wikipedia we cannot, as it ends up confusing active record and ORM as being one and the same. That is a mistake. But as my teachers, and presumably yours too, told me in school: Don't trust everything you read on Wikipedia.

But we don't need to go to Wikipedia here anyway. Refreshingly, ORM literally tells you what it is right in its name. All you need to do is spell it out: Object-Relation Mapping.