
153 points xlinux | 4 comments
lovasoa ◴[] No.42197498[source]
The topic of huge queries on tiny databases makes me think of this recent discussion on the SQLite forum: https://sqlite.org/forum/forumpost/0d18320369

Someone had an issue because SQLite failed to optimize the following query

    select * from t where x = 'x' or '' = 'x'
Someone said that SQLite could not optimize out the `or '' = 'x'` because the optimization itself would be too expensive to compute, which is obviously true only for huge queries on tiny datasets.
replies(3): >>42197594 #>>42197614 #>>42198312 #
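(The optimization in question is just constant folding: any subexpression that references no columns can be evaluated once at planning time instead of once per row. A minimal Python sketch over a toy expression tree of my own invention, not SQLite's internals:)

```python
# Toy constant folder (my own AST format, not SQLite's): leaves are
# ('col', name) or ('lit', value); inner nodes are ('eq', l, r) and
# ('or', l, r).

def fold(expr):
    """Recursively fold subexpressions that contain no column references."""
    op = expr[0]
    if op in ('col', 'lit'):
        return expr
    if op == 'eq':
        l, r = fold(expr[1]), fold(expr[2])
        if l[0] == 'lit' and r[0] == 'lit':   # no column refs: decide now
            return ('lit', l[1] == r[1])
        return ('eq', l, r)
    if op == 'or':
        l, r = fold(expr[1]), fold(expr[2])
        if l == ('lit', True) or r == ('lit', True):  # OR with constant true
            return ('lit', True)
        if l == ('lit', False):                       # drop constant-false arm
            return r
        if r == ('lit', False):
            return l
        return ('or', l, r)
    raise ValueError(op)

# x = 'x' or '' = 'x'  folds to  x = 'x'
query = ('or', ('eq', ('col', 'x'), ('lit', 'x')),
               ('eq', ('lit', ''), ('lit', 'x')))
print(fold(query))  # ('eq', ('col', 'x'), ('lit', 'x'))
```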
recursive ◴[] No.42197614[source]
It's not obviously true at all. Optimizing out `'' = 'x'` can be done for a fixed cost regardless of record count.
replies(1): >>42198538 #
1. lovasoa ◴[] No.42198538[source]
Optimizing out static expressions takes time at least linear in the size of the query. So if the number of clauses in the WHERE is huge and the underlying table is tiny (as in the examples shown in the article we are commenting on), it is better not to run the optimization.

But of course, in normal life, outside the world of people having fun with homomorphisms, queries are much smaller than databases.

replies(1): >>42198550 #
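(A back-of-the-envelope cost model makes the trade-off concrete. The model and its unit costs are my own illustration, not SQLite's actual planner: folding costs one pass over the query's clauses, and whatever survives is then evaluated once per row.)

```python
# Toy cost model (an assumption for illustration, not SQLite's planner).

def cost_with_fold(clauses, rows, folded_away):
    # pay for the folding pass, then scan with the surviving clauses
    return clauses + rows * (clauses - folded_away)

def cost_without_fold(clauses, rows):
    # every row evaluates every clause, dead ones included
    return rows * clauses

# Normal life: a 2-clause query over a million rows, one dead clause.
# Folding more than pays for itself.
assert cost_with_fold(2, 1_000_000, 1) < cost_without_fold(2, 1_000_000)

# The article's regime: a million-clause query over 2 rows where nothing
# actually folds. The folding pass is pure overhead.
assert cost_with_fold(1_000_000, 2, 0) > cost_without_fold(1_000_000, 2)
```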
2. recursive ◴[] No.42198550[source]
Parsing the expression in the first place is already linear time.
replies(1): >>42203768 #
3. thaumasiotes ◴[] No.42203768[source]
True, but that doesn't mean doing additional work during the parse is free. Optimizing out static expressions will take additional time, and in general that additional time will be linear in the query size.
replies(1): >>42206213 #
4. recursive ◴[] No.42206213{3}[source]
My argument is that, on average, it will more than pay for itself.

The only losing case, if there are any measurable ones, is where you have long queries and short data. I'd call that a case of "doing it wrong". Wrong tool for the job.