Show HN: Embed an SQLite database in your PostgreSQL table

(github.com)

pglite-fusion is a PostgreSQL extension that allows you to embed SQLite databases into your PostgreSQL tables by enabling the creation of columns with the `SQLITE` type. This means every row in the table can have an embedded SQLite database.

In addition to the PostgreSQL `SQLITE` type, pglite-fusion provides the `query_sqlite`` function for querying SQLite databases and the `execute_sqlite` function for updating them. Additional functions are listed in the project’s README.

The pglite-fusion extension is written in Rust using the pgrx framework [1].

----

Implementation Details

The PostgreSQL `SQLITE` type is stored as a CBOR-encoded `Vec<u8>`. When a query is made, this `Vec<u8>` is written to a random file in the `/tmp` directory. SQLite then loads the file, performs the query, and returns the result as a table containing a single row with an array of JSON-encoded values.

The `execute_sqlite` function follows a similar process. However, instead of returning query results, it returns the contents of the SQLite file (stored in `/tmp`) as a new `SQLITE` instance.

[1] https://github.com/pgcentralfoundation/pgrx

Show context

robertclaus ◴[19 Nov 24 13:36 UTC] No.42183246[source]▶

>>42182146 (OP) #

What are the use cases for this? I can't imagine designing a database schemas to use this in a typical product. Is it intended for hybrid applications to back up local user data directly with their account info?

replies(4): >>42183347 #>>42183361 #>>42183493 #>>42185818 #

1. bravura ◴[19 Nov 24 13:45 UTC] No.42183347[source]▶

>>42183246 #

The top line of the README says: "Embed an SQLite database in your PostgreSQL table. AKA multitenancy has been solved."

But I'm still having trouble trying to grok the intricacies of it. In a sense, I guess it has well isolated individual SQLite DBs and you'd have to go out of your way to join over them. With that said, does PostgreSQL manage and pool all the writes correctly? So you don't need to worry about SQLite concurrency issues?

replies(3): >>42183542 #>>42183581 #>>42183609 #

2. rswail ◴[19 Nov 24 14:04 UTC] No.42183542[source]▶

>>42183347 (TP) #

Each of the columns is an instance of a SQLite database, so I assume (without looking at the source) that they properly multi-thread as needed.

So there's not cross-SQLite-database connections or multiple writers going on.

3. rswail ◴[19 Nov 24 14:07 UTC] No.42183581[source]▶

>>42183347 (TP) #

You could join over them, but not really in the way you're thinking.

Each of the columns that are databases would be updated when the functions execute.

You could do weird crap like INSERT/DELETE as part of a postgres level SELECT.

replies(1): >>42186004 #

4. pmontra ◴[19 Nov 24 14:11 UTC] No.42183609[source]▶

>>42183347 (TP) #

If by solving multinenancy they mean

  CREATE TABLE tenants (
    id     BIGINT NOT NULL,
    database SQLITE DEFAULT execute_sqlite(
        empty_sqlite(),
        'CREATE TABLE users (etc.)'
        and all the other tables
        for each tenant
    )
  );

then they don't need to make joins between sqlite dbs.

Your other concerns are very real. Those sqlite dbs could become very large. I prefer the use case depicted in another reply: preparing sqlite dbs before shipping them to their own devices. Or maybe receiving them and performing analysis, maybe after having imported it in overall psql tables. Or similar scenarios in which all the db is read or written at once. Anyway, once we have a tool we start using it.

replies(1): >>42185147 #

5. tucnak ◴[19 Nov 24 16:20 UTC] No.42185147[source]▶

>>42183609 #

> then they don't need to make joins between sqlite dbs.

The extension could also provide custom index access methods (considering that SQLite only has a handful of column types in the first place.) That would allow you to incorporate the keys in the index heaps, as opposed to table heaps, boom, you get bitmap index scans for Joins, i.e. GIN but with a bit more redundancy.

6. Tostino ◴[19 Nov 24 17:31 UTC] No.42186004[source]▶

>>42183581 #

You can do that with any function already. This isn't new because of nested databases.

↑