
175 points frectonz | 9 comments

pglite-fusion is a PostgreSQL extension that allows you to embed SQLite databases into your PostgreSQL tables by enabling the creation of columns with the `SQLITE` type. This means every row in the table can have an embedded SQLite database.

In addition to the PostgreSQL `SQLITE` type, pglite-fusion provides the `query_sqlite` function for querying SQLite databases and the `execute_sqlite` function for updating them. Additional functions are listed in the project’s README.

The pglite-fusion extension is written in Rust using the pgrx framework [1].

----

Implementation Details

The PostgreSQL `SQLITE` type is stored as a CBOR-encoded `Vec<u8>`. When a query is made, this `Vec<u8>` is written to a randomly named file in the `/tmp` directory. SQLite then loads the file, performs the query, and returns the result as a table containing a single row with an array of JSON-encoded values.

The `execute_sqlite` function follows a similar process. However, instead of returning query results, it returns the contents of the SQLite file (stored in `/tmp`) as a new `SQLITE` instance.
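
A rough Python sketch of that round trip, for illustration only. It is not the extension's Rust code: it skips the CBOR layer, treats the stored value as the raw bytes of a SQLite file, and returns one JSON-encoded array per result row, which is just one reading of the description above. The function names mirror the extension's, but the signatures and the `_load` helper are assumptions.

```python
import json
import os
import sqlite3
import tempfile


def _load(blob: bytes) -> str:
    """Write the database bytes to a fresh temp file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".sqlite3")
    with os.fdopen(fd, "wb") as f:
        f.write(blob)
    return path


def query_sqlite(blob: bytes, sql: str) -> list[str]:
    """Run a read query; return each row as a JSON-encoded array."""
    path = _load(blob)
    try:
        conn = sqlite3.connect(path)
        try:
            return [json.dumps(list(row)) for row in conn.execute(sql)]
        finally:
            conn.close()
    finally:
        os.remove(path)


def execute_sqlite(blob: bytes, sql: str) -> bytes:
    """Run a write statement; return the updated database file as bytes."""
    path = _load(blob)
    try:
        conn = sqlite3.connect(path)
        try:
            conn.execute(sql)
            conn.commit()
        finally:
            conn.close()
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


# An empty byte string is a valid, empty SQLite database.
db = execute_sqlite(b"", "CREATE TABLE todos (task TEXT)")
db = execute_sqlite(db, "INSERT INTO todos VALUES ('solve p vs np')")
rows = query_sqlite(db, "SELECT task FROM todos")
```

The point of the sketch is that every call pays for a full write of the database to disk, and every update also pays for a full read back, so the approach suits small embedded databases rather than large ones.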

[1] https://github.com/pgcentralfoundation/pgrx

TekMol ◴[] No.42183907[source]
Are there still reasons to use PostgreSQL?

I like the simplicity of SQLite's "a file is all you need" approach so much that I have started converging all my projects on SQLite. So far, I have not come across any roadblocks.

Can anyone think of a use case where PostgreSQL is better suited than SQLite?

replies(8): >>42183960 #>>42183962 #>>42183971 #>>42183990 #>>42184022 #>>42184026 #>>42184221 #>>42184299 #
1. prisenco ◴[] No.42184022[source]
The biggest one is redundancy. Architecting with read replicas is much easier with Postgres than SQLite because of its server model.

SQLite on the server is a fantastic starter database. Dead simple to set up, highly performant and scales way higher (vertically) than anyone gives it credit for.

But there certainly is a point where you'll have to scale out instead of up, and while there are some great solutions for that (rqlite, litefs, dqlite, marmot), it's not inherent to SQLite's design.

replies(2): >>42184171 #>>42184192 #
2. TekMol ◴[] No.42184171[source]
Should replication really be a concern of the DB layer?

Replication means writing queries which alter the data to multiple machines, right?

Shouldn't that be done by software one level up, which takes in the queries via some network protocol and then sends them to all machines?

That would sound more logical to me.

replies(1): >>42184215 #
3. otoolep ◴[] No.42184192[source]
rqlite[1] creator here, happy to answer any questions about it.

[1] https://rqlite.io

4. prisenco ◴[] No.42184215[source]
Historically, yes. Databases have long been software concerned with both storage and networking.

It's fine to want to separate those out, but it's not easy to do so and there are reasons they've been coupled for decades.

replies(2): >>42184263 #>>42184297 #
5. ◴[] No.42184263{3}[source]
6. TekMol ◴[] No.42184297{3}[source]
What makes it hard?

Having a single DB that takes write queries via a proxy which spreads them out to multiple read-only DBs sounds easy at first.

replies(1): >>42184729 #
7. abtinf ◴[] No.42184729{4}[source]
When do you consider the write/transaction to be completed?

What do you do about out-of-sync read replicas?

ACID gets real hard real fast when introducing replication.

replies(1): >>42184945 #
8. TekMol ◴[] No.42184945{5}[source]
> When do you consider the write/transaction to be completed?

Isn't sending an UPDATE/INSERT/DELETE statement to SQLite blocking? I would think it is, because in my code I can read the number of affected rows right after I send the query.

> What do you do about out-of-sync read replicas?

Delete them and replace them by uploading a checkpoint and replaying a log of the queries since then.
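
A minimal sketch of that idea with Python's built-in `sqlite3` module, to make it concrete. The `take_checkpoint`, `write`, and `rebuild_replica` helpers and the `kv` table are hypothetical names for illustration; everything runs in one process with in-memory databases, so it says nothing about the networked case.

```python
import sqlite3


def take_checkpoint(primary: sqlite3.Connection) -> sqlite3.Connection:
    """Copy the primary's current state into a separate in-memory database."""
    snap = sqlite3.connect(":memory:")
    primary.backup(snap)  # sqlite3.Connection.backup, Python 3.7+
    return snap


def rebuild_replica(checkpoint: sqlite3.Connection, log: list) -> sqlite3.Connection:
    """Start from the checkpoint and replay the logged statements in order."""
    replica = sqlite3.connect(":memory:")
    checkpoint.backup(replica)
    for sql, params in log:
        replica.execute(sql, params)
    replica.commit()
    return replica


primary = sqlite3.connect(":memory:")
primary.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v INTEGER)")
primary.execute("INSERT INTO kv VALUES ('a', 1)")
primary.commit()

checkpoint = take_checkpoint(primary)
log = []  # statements applied to the primary since the checkpoint


def write(sql: str, params: tuple = ()) -> int:
    """Apply a write to the primary, log it, and return the affected row count."""
    cur = primary.execute(sql, params)  # blocks until the statement has run
    primary.commit()
    log.append((sql, params))
    return cur.rowcount


affected = write("UPDATE kv SET v = v + 1 WHERE k = ?", ("a",))
write("INSERT INTO kv VALUES (?, ?)", ("b", 10))

replica = rebuild_replica(checkpoint, log)
replica_rows = replica.execute("SELECT k, v FROM kv ORDER BY k").fetchall()
primary_rows = primary.execute("SELECT k, v FROM kv ORDER BY k").fetchall()
```

Here the replica ends up identical to the primary, but only because a single writer appended to the log in exactly the order the statements were committed.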

replies(1): >>42186954 #
9. Tostino ◴[] No.42186954{6}[source]
If you are doing statement-level replication, you'd better make sure every query is run in the exact same order (and finishes in the same order).

Without that you will have drift from your master database.

With that, you have a whole new host of synchronization issues you need to deal with.
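
A small illustration of the ordering problem, using Python's built-in `sqlite3` module. The `account` table and the two statements are made up for the example; both databases start identical and receive the same two statements, just in a different order.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with one account holding a balance of 100."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
    db.execute("INSERT INTO account VALUES (1, 100)")
    db.commit()
    return db


def apply(db: sqlite3.Connection, statements: list) -> int:
    """Run the statements in the given order and return the final balance."""
    for sql in statements:
        db.execute(sql)
    db.commit()
    return db.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]


add_ten = "UPDATE account SET balance = balance + 10 WHERE id = 1"
double = "UPDATE account SET balance = balance * 2 WHERE id = 1"

primary_balance = apply(make_db(), [add_ten, double])  # (100 + 10) * 2
replica_balance = apply(make_db(), [double, add_ten])  # 100 * 2 + 10

print(primary_balance, replica_balance)  # prints "220 210"
```

Neither database reports an error, so the drift stays silent unless something compares the two.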