←back to thread

SQL Design Patterns (2010)

(vadimtropashko.wordpress.com)
149 points mci | 3 comments | | HN request time: 0.606s | source
Show context
potatoproduct ◴[] No.45077193[source]
Not ashamed to admit that I never really thought about the distinct operator 'being redundant' as its essentially just a group by.
replies(3): >>45077523 #>>45078843 #>>45079717 #
1. morkalork ◴[] No.45077523[source]
distinct has always felt like a query smell to me. Too many junior analysts abusing it because they don't know the schema well and are over-joining entities
replies(2): >>45078136 #>>45078258 #
2. dspillett ◴[] No.45078136[source]
DISTINCT is often a smell at the head (or middle) of a complex query as you are throwing away processed information, sometimes a lot of it, late in the game. Much better to filter it out earlier and not process it further, where possible. Filtering earlier, as well as reducing waste processing time (and probably memory use), increases the chance of the query planner being able to use an index for the filter which could greatly decrease the IO cost of your query.
3. ryanjshaw ◴[] No.45078258[source]
Sometimes the number of joins is fine but they don’t understand the data properly and should be spending more time understanding why multiple rows are being returned when they expect one (eg they need to filter on an additional field).

I wish SQL had a strict mode syntax that forces you to use something like `select one` (like LINQ’s Single()) or `select many` to catch these kinds of bugs.