←back to thread

128 points xlinux | 8 comments | | HN request time: 0.568s | source | bottom
1. lovasoa ◴[] No.42197498[source]
The topic of huge queries on tiny databases makes me think of this recent discussion on the SQLite forum: https://sqlite.org/forum/forumpost/0d18320369

Someone had an issue because SQLite failed to optimize the following query

    select * from t where x = 'x' or '' = 'x'
Someone said that SQLite could not optimize out the "or '' = 'x'" because it would be too expensive to compute. Which is obviously true only for huge queries on tiny datasets.
replies(3): >>42197594 #>>42197614 #>>42198312 #
2. hinkley ◴[] No.42197594[source]
Why would it be too expensive to optimize out static subexpressions?
replies(1): >>42198445 #
3. recursive ◴[] No.42197614[source]
It's not obviously true at all. Optimizing out `'' = 'x'` can be done for a fixed cost regardless of record count.
replies(1): >>42198538 #
4. jiggawatts ◴[] No.42198312[source]
> SQLite

Well... there's your problem. SQLite is not a general-purpose RDBMS, it is marketed as a replacement for "fopen()", a purpose for which it excels.

A similar product is the Microsoft Jet database engine, used in products such as Microsoft Exchange and Active Directory. Queries have to be more-or-less manually optimised by the developer, but they run faster and more consistently than they would with a general-purpose query engine designed for ad-hoc queries.

replies(1): >>42199975 #
5. jjice ◴[] No.42198445[source]
My guess is that the expense can be tricky to calculate since the additional optimization prior to executing the query may take longer than if the query was just able to run (depending on the dataset, of course). I wonder if it's too expensive to calculate a heuristic as well, so it just allows it to execute.

Just a guess.

6. lovasoa ◴[] No.42198538[source]
Optimizing out static expressions can be done in linear time at best. So if the number of clauses in WHERE is huge and the size of the underlying table is tiny (such as in the examples shown in the article we are commenting on), it will be better not to run the optimization.

But of course, in normal life, outside of the world of people having fun with Homomorphisms, queries are much smaller than databases.

replies(1): >>42198550 #
7. recursive ◴[] No.42198550{3}[source]
Parsing the expression in the first place is already linear time.
8. cerved ◴[] No.42199975[source]
I hate Jet with a vengeance