
197 points | slaily | 2 comments
d1l No.44565326
This is strange on so many levels.

SQLite does not even do network I/O.

How does sharing a connection (and transaction scope) in an asyncio environment even work? Won’t you still need a connection per asyncio context?

Does sqlite_open really take long compared to the inevitable contention for the write lock you’ll see when you have many concurrent contexts?

Does sqlite_open even register in comparison with the overhead of the python interpreter?

What is an asyncio SQLite connection anyways? Isn’t it just a regular one that gets hucked into a separate thread?

replies(4): >>44565652 #>>44568010 #>>44570930 #>>44572931 #
simonw No.44565652
If you're talking to a 100KB SQLite database file, this kind of thing is likely unnecessary; just opening and closing a connection for each query is probably fine.

If you're querying a multi-GB SQLite database there are things like per-connection caches that may benefit from a connection pool.
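For example, SQLite's page cache is sized per connection, so a long-lived pooled connection stays warm across queries while a freshly opened one starts cold. A minimal sketch (the 256 MB figure is just an arbitrary example, not a recommendation):

    import sqlite3

    conn = sqlite3.connect("big.db")
    # The page cache is per connection; a freshly opened connection starts cold.
    # A negative value is interpreted as KiB, so this asks for roughly 256 MB.
    conn.execute("PRAGMA cache_size = -262144")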

> What is an asyncio SQLite connection anyways? Isn’t it just a regular one that gets hucked into a separate thread?

Basically yes - aiosqlite works by opening each connection in a dedicated thread, sending queries to that thread, and delivering each result back to the event loop via a Future.

https://github.com/omnilib/aiosqlite/blob/895fd9183b43cecce8...
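A rough sketch of that pattern using only the standard library (the ThreadedSQLite class and all its details are illustrative, not aiosqlite's actual code):

    import asyncio
    import queue
    import sqlite3
    import threading

    class ThreadedSQLite:
        """One worker thread owns the sqlite3 connection; coroutines submit
        SQL through a queue and await a Future for the result."""

        def __init__(self, path):
            self._requests = queue.Queue()
            threading.Thread(target=self._worker, args=(path,), daemon=True).start()

        def _worker(self, path):
            conn = sqlite3.connect(path)  # the connection never leaves this thread
            while True:
                sql, params, fut, loop = self._requests.get()
                try:
                    rows = conn.execute(sql, params).fetchall()
                    conn.commit()
                    loop.call_soon_threadsafe(fut.set_result, rows)
                except Exception as exc:
                    loop.call_soon_threadsafe(fut.set_exception, exc)

        async def execute(self, sql, params=()):
            loop = asyncio.get_running_loop()
            fut = loop.create_future()
            self._requests.put((sql, params, fut, loop))
            return await fut

    async def main():
        db = ThreadedSQLite(":memory:")
        await db.execute("CREATE TABLE t (x INTEGER)")
        await db.execute("INSERT INTO t VALUES (?)", (42,))
        print(await db.execute("SELECT x FROM t"))  # [(42,)]

    asyncio.run(main())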

replies(2): >>44565664 #>>44565791 #
crazygringo No.44565664
> If you're querying a multi-GB SQLite database

In which case SQLite is probably the wrong tool for the job, and you should be using Postgres or MySQL, which are actually designed from the ground up for lots of concurrent connections.

SQLite is amazing. I love SQLite. But I love it for single-user single-machine scenarios. Not multi-user. Not over a network.

replies(4): >>44565686 #>>44566048 #>>44567163 #>>44570593 #
simonw No.44565686
Multi-GB is tiny these days.

I didn't say anything about concurrent access. SQLite in WAL mode handles dozens of concurrent readers/writers fine these days (OK, only one writer gets to write at a time, but if your writes queue for 1-2 ms, who cares?). If you're dealing with hundreds or thousands of clients over a network, then yeah, use a server-based database engine.
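For reference, WAL mode and a busy timeout are both single pragmas (the values here are illustrative):

    import sqlite3

    conn = sqlite3.connect("app.db")
    # WAL lets readers keep reading while a single writer holds the write lock.
    conn.execute("PRAGMA journal_mode=WAL")
    # If another connection is writing, wait up to 2000 ms for the lock instead
    # of failing immediately with "database is locked".
    conn.execute("PRAGMA busy_timeout=2000")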

replies(3): >>44565990 #>>44567652 #>>44570472 #
da_chicken No.44567652
Multi-GB is tiny, but that doesn't make SQLite magically better at large queries over multi-GB databases. That's why DuckDB has been getting more popular.
replies(1): >>44567893 #
benjiro No.44567893
SQLite != DuckDB... two totally different DB types. One is a row-based database, the other is column-based. They target different workloads, and both can handle extremely heavy ones.
replies(1): >>44575597 #
da_chicken No.44575597
Yes, that's the point I'm making. If SQLite never struggled with databases in the GB range, there wouldn't be much call to replace it with DuckDB. The fact that there's significant value in an OLAP RDBMS suggests that SQLite is falling short.
replies(1): >>44611057 #
benjiro No.44611057
The problem is not SQLite struggling with databases in the GB range; it handles those with ease. OLAP requires a different database structure, namely column storage (preferably with compaction, compression, and other algorithms on top).

That is DuckDB's selling point. You want data analysis, you go DuckDB. You want OLTP, you go SQLite. Or combine both if you need both.
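As one sketch of "combine both": recent DuckDB builds can attach a SQLite file directly, assuming the optional sqlite extension is available (the file and table names here are made up):

    import duckdb

    con = duckdb.connect()
    con.execute("INSTALL sqlite; LOAD sqlite;")          # optional sqlite extension
    con.execute("ATTACH 'app.db' AS app (TYPE sqlite)")  # the OLTP SQLite file
    # Run the analytical query in DuckDB's columnar engine over SQLite's data.
    rows = con.sql("SELECT user_id, count(*) FROM app.orders GROUP BY user_id").fetchall()
    print(rows)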

Even Postgres struggles with OLAP workloads, and that is why we have solutions like the TimescaleDB Postgres extension. That, ironically, still uses Postgres rows but then packs column values together inside those rows' fields.

That does not mean Postgres is flawed for working with big data. Same with SQLite... Different data has different needs, and that has little to do with database size.