←back to thread

60 points Bogdanp | 1 comments | | HN request time: 0.209s | source
Show context
aidos ◴[] No.44609465[source]
I’ve done a lot of interviewing and I’ve discovered that many devs (even experienced ones) don’t understand the difference between indexes and foreign keys.

My assumption is that people have used orms that automatically add the index for you when you create a relationship so they just conflate them all. Often they’ll say that a foreign key is needed to improve the performance and when you dig into it, their mental model is all wrong. The sense they have is that the other table gets some sort of relationship array structure to make lookups fast.

It’s an interesting phenomenon of the abstraction.

Don’t get me wrong, I love sqlalchemy and alembic but probably because I understand what’s happening underneath so I know the right way to hold it so things are efficient and migrations are safe.

replies(5): >>44609923 #>>44610030 #>>44610036 #>>44611356 #>>44611914 #
bevr1337 ◴[] No.44610030[source]
> their mental model is all wrong.

Is it? In Postgres, all FK references must be to a column with a PK or unique constraint or part of another index. Additionally, Postgres and Maria (maybe all SQL?) automatically create indexes for PKs and unique constraints. There's a high likelihood that a foreign key is already indexed _in the other table_.

Generally, I agree with your statement. Adding a FK won't magically improve performance or create useful indices. But, the presence of a FK or refactoring to support a FK does (tangentially) point back to that index.

replies(3): >>44610282 #>>44610714 #>>44610850 #
aidos ◴[] No.44610850[source]
I wasn’t totally clear on my original statement. As you point out, the referenced columns in the referenced table need to have a unique constraint and that’s done with a unique index. My understanding is that this ensures there’s no ambiguity as to which row is referenced and allows for efficient enforcement of the FK constraint.

Django automatically creates an index on the referencing table to ensure that joins are fast. The fact that you have the relationship in the ORM means that’s how you’re likely to access the data so it makes perfect sense.

The mental model mismatch I’ve seen is that people appear to think of the relationship as being on the parent object “pointing” at the child table.

replies(1): >>44612391 #
1. bevr1337 ◴[] No.44612391[source]
I'll admit my experience in Django is only migrating customers off Django. Thanks for adding some interesting details about that ecosystem