353 points HunOL | 12 comments
    mickeyp ◴[] No.45781619[source]
    SQLite is a cracking database -- I love it -- that is let down by its awful defaults in service of 'backwards compatibility.'

    You need a brace of PRAGMAs to get it to behave reasonably sanely if you do anything serious with it.
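As a sketch of what that "brace of PRAGMAs" tends to look like in practice (the specific set here is a common community recommendation, not an official list):

```python
import sqlite3

# Commonly recommended PRAGMAs for non-trivial SQLite use; all of these
# default the other way for backwards compatibility.
conn = sqlite3.connect("app.db")
conn.execute("PRAGMA journal_mode=WAL")    # concurrent readers alongside one writer
conn.execute("PRAGMA synchronous=NORMAL")  # safe with WAL, faster than FULL
conn.execute("PRAGMA foreign_keys=ON")     # FK enforcement is OFF by default
conn.execute("PRAGMA busy_timeout=5000")   # wait up to 5s on lock contention
```

Note that these are per-connection settings (except `journal_mode`, which is persisted in the file), so they have to be reapplied on every new connection.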

    replies(2): >>45781956 #>>45781998 #
    1. mkoubaa ◴[] No.45781998[source]
    Seems like it's asking to be forked
    replies(3): >>45782143 #>>45782341 #>>45783433 #
    2. justin66 ◴[] No.45782143[source]
    It has been forked at least once:

    https://docs.turso.tech/libsql

    replies(1): >>45790067 #
    3. kbolino ◴[] No.45782341[source]
    SQLite is fairly fork-resistant due to much of its test suite being proprietary: https://www.sqlite.org/testing.html
    4. pstuart ◴[] No.45783433[source]
The real fork is DuckDB, in a way: it has SQLite compatibility and so much more.

The SQLite team also maintains two branches that address concurrency and may someday merge to trunk, but the team is quite conservative by nature, so it may never happen unless they feel the work passes muster.

    https://www.sqlite.org/src/doc/begin-concurrent/doc/begin_co... https://sqlite.org/hctree/doc/hctree/doc/hctree/index.html

As to the problem that prompted the article, there's another way of addressing it that is kind of a kludge but is guaranteed to work in scenarios like theirs: have each thread in the parallel scan write to its own temporary database and then bulk import them once the scan is done.

    It's easy to get hung up on having "a database" but sharding to different files by use is trivial to do.
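The merge step of that per-thread-database approach can be done with `ATTACH DATABASE` and an `INSERT ... SELECT`. A minimal sketch (the table name `files` and the shard file layout are hypothetical, stand-ins for whatever the scan produces):

```python
import sqlite3

def merge_shards(main_path, shard_paths):
    """Bulk-import rows from per-thread shard databases into one main database."""
    conn = sqlite3.connect(main_path)
    conn.execute("CREATE TABLE IF NOT EXISTS files (path TEXT, size INTEGER)")
    for i, shard in enumerate(shard_paths):
        # ATTACH makes the shard visible as a named schema on this connection.
        conn.execute(f"ATTACH DATABASE ? AS shard{i}", (shard,))
        conn.execute(f"INSERT INTO files SELECT * FROM shard{i}.files")
        conn.commit()  # DETACH is not allowed inside an open transaction
        conn.execute(f"DETACH DATABASE shard{i}")
    return conn
```

Since each shard file has only a single writer during the scan, there is no lock contention until this final, serial merge.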

Another thing to bear in mind with a lot of SQLite use cases is that the data is effectively read-only save for occasional updates. Read-only databases are a lot easier to deal with regarding locking.
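That read-mostly pattern can be made explicit by opening the file with the `mode=ro` URI flag, so reader connections never take the write lock at all (the filename `catalog.db` here is hypothetical):

```python
import sqlite3

# A writer builds the file once...
w = sqlite3.connect("catalog.db")
w.execute("CREATE TABLE IF NOT EXISTS t (x INTEGER)")
w.commit()
w.close()

# ...then readers open it read-only; writes are rejected outright.
ro = sqlite3.connect("file:catalog.db?mode=ro", uri=True)
try:
    ro.execute("INSERT INTO t VALUES (1)")
except sqlite3.OperationalError:
    pass  # "attempt to write a readonly database"
```

Any number of such read-only connections can run concurrently without contending with each other.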

    replies(3): >>45783708 #>>45783826 #>>45790073 #
    5. jitl ◴[] No.45783708[source]
DuckDB is similar as an in-process SQL database, but lacking btree-style ordered indexes makes it a poor performer at key lookups and order-by / range scans once your tables grow beyond trivial size.

It’s the classic OLAP (DuckDB) vs OLTP (SQLite) trade-off between the two. DuckDB is very good at many things, but most applications that need a traditional SQL DB will probably not perform well if you swap them over to DuckDB.

    replies(2): >>45783833 #>>45784046 #
    6. Kinrany ◴[] No.45783826[source]
    > Read only databases are a lot easier to deal with regarding locking.

    "A lot easier" sounds like an understatement. What's there to lock when the data is read only?

    7. Kinrany ◴[] No.45783833{3}[source]
    That's surprising, surely OLAP use cases also need key lookups?
    8. geysersam ◴[] No.45784046{3}[source]
    Duckdb has optional adaptive radix tree indexing (https://duckdb.org/docs/stable/sql/indexes.html)
    replies(1): >>45786034 #
    9. jitl ◴[] No.45786034{4}[source]
    Oops, I stand corrected!

What I remember of our evaluation of DuckDB in 2024 is that we concluded (1) the major limitations were lack of range-scan and index-lookup performance (maybe w/ joins? or update where?), and (2) the DuckDB Node.js module segfaulted too much. Perhaps the engineers somehow missed the ART index; it could also have been the restriction that data fit in memory to create an index on it (our test dataset was about 50gb).

    10. fulafel ◴[] No.45790067[source]
    How are the defaults there?
    replies(1): >>45792348 #
    11. ◴[] No.45790073[source]
    12. justin66 ◴[] No.45792348{3}[source]
    The default is, don't use it.