In addition to the PostgreSQL `SQLITE` type, pglite-fusion provides the `query_sqlite` function for querying SQLite databases and the `execute_sqlite` function for updating them. Additional functions are listed in the project’s README.
The pglite-fusion extension is written in Rust using the pgrx framework [1].
----
Implementation Details
The PostgreSQL `SQLITE` type is stored as a CBOR-encoded `Vec<u8>`. When a query is made, this `Vec<u8>` is written to a random file in the `/tmp` directory. SQLite then loads the file, performs the query, and returns the result as a table containing a single row with an array of JSON-encoded values.
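The query path described above can be sketched in Python with the standard `sqlite3` module. This is only an illustration of the mechanism (bytes written to a temporary file, file opened by SQLite, values JSON-encoded), not the extension's Rust code. The names `query_sqlite_sketch` and `build_sample_db` are hypothetical, and the CBOR encoding step is left out.

```python
import json
import os
import sqlite3
import tempfile


def build_sample_db() -> bytes:
    """Hypothetical helper: build a small SQLite database and return its raw bytes."""
    fd, path = tempfile.mkstemp(suffix=".sqlite3")
    os.close(fd)
    try:
        conn = sqlite3.connect(path)
        try:
            conn.execute("CREATE TABLE todos (task TEXT NOT NULL, done INTEGER NOT NULL)")
            conn.executemany(
                "INSERT INTO todos VALUES (?, ?)",
                [("buy milk", 0), ("write docs", 1)],
            )
            conn.commit()
        finally:
            conn.close()
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


def query_sqlite_sketch(db_bytes: bytes, sql: str) -> list[list[str]]:
    """Assumed analogue of query_sqlite: run a read query against database bytes."""
    # Write the bytes to a randomly named temporary file.
    fd, path = tempfile.mkstemp(suffix=".sqlite3")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(db_bytes)
        # Let SQLite load the file and perform the query.
        conn = sqlite3.connect(path)
        try:
            rows = conn.execute(sql).fetchall()
        finally:
            conn.close()
    finally:
        os.remove(path)
    # Return every value JSON-encoded (works for text, numbers and NULL; not BLOBs).
    return [[json.dumps(value) for value in row] for row in rows]


rows = query_sqlite_sketch(build_sample_db(), "SELECT task, done FROM todos ORDER BY task")
```

The temporary file is removed once the query finishes, so nothing outlives the call except the returned values.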
The `execute_sqlite` function follows a similar process. However, instead of returning query results, it returns the contents of the SQLite file (stored in `/tmp`) as a new `SQLITE` instance.
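The update path can be sketched the same way. This is a minimal Python illustration, assuming only what the paragraph above states: the input bytes go to a temporary file, the statement runs, and the file contents come back as a new value. `execute_sqlite_sketch` is a hypothetical name, not the extension's API.

```python
import os
import sqlite3
import tempfile


def execute_sqlite_sketch(db_bytes: bytes, sql: str) -> bytes:
    """Assumed analogue of execute_sqlite: apply a statement, return the new bytes."""
    fd, path = tempfile.mkstemp(suffix=".sqlite3")
    try:
        # Materialize the current value as a file SQLite can open.
        with os.fdopen(fd, "wb") as f:
            f.write(db_bytes)
        conn = sqlite3.connect(path)
        try:
            conn.execute(sql)
            conn.commit()
        finally:
            conn.close()
        # The result is the whole file, read back as a fresh value.
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


empty = b""  # a zero-length file is a valid empty SQLite database
v1 = execute_sqlite_sketch(empty, "CREATE TABLE todos (task TEXT NOT NULL)")
v2 = execute_sqlite_sketch(v1, "INSERT INTO todos VALUES ('solve math')")
```

In this sketch the input is never modified: `v1` still holds the empty `todos` table after `v2` is produced. A change only persists if the caller stores the returned value somewhere, for example with an `UPDATE` on the row that holds it.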
But I'm still having trouble trying to grok the intricacies of it. In a sense, I guess it gives you well-isolated individual SQLite DBs, and you'd have to go out of your way to join across them. With that said, does PostgreSQL manage and pool all the writes correctly? So you don't need to worry about SQLite concurrency issues?
The most interesting use case for me is if you're running a SaaS product like Notion where your users create custom applications that manage their own small schema-based data tables.
Letting users create full custom PostgreSQL tables can get complex - do you want to manage tens of thousands of weird custom tables in a PostgreSQL schema somewhere?
I'd much rather manage tens of thousands of rows in a table where one of the columns is a BLOB with a little SQLite database in it.
So at the backend you have a postgres database that contains the device details etc as well as the operating parameters for that device.
You can update the operating parameters as part of a postgres transaction so either all the BLOBs are updated, or none.
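A rough Python sketch of that all-or-nothing behaviour follows. An in-memory SQLite database stands in for the Postgres table here, purely so the example runs without a server. The table, the column names and the helpers (`run_on_blob`, `update_all`, `intervals`) are made up for illustration.

```python
import os
import sqlite3
import tempfile


def run_on_blob(db_bytes: bytes, sql: str) -> tuple[bytes, list]:
    """Apply `sql` to a SQLite database held as bytes; return (new bytes, rows)."""
    fd, path = tempfile.mkstemp(suffix=".sqlite3")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(db_bytes)
        conn = sqlite3.connect(path)
        try:
            rows = conn.execute(sql).fetchall()
            conn.commit()
        finally:
            conn.close()
        with open(path, "rb") as f:
            return f.read(), rows
    finally:
        os.remove(path)


# Stand-in for the Postgres side: one row per device, one BLOB per row.
host = sqlite3.connect(":memory:")
host.execute("CREATE TABLE devices (id INTEGER PRIMARY KEY, params BLOB NOT NULL)")
base, _ = run_on_blob(b"", "CREATE TABLE settings (interval_s INTEGER)")
base, _ = run_on_blob(base, "INSERT INTO settings VALUES (60)")
with_log, _ = run_on_blob(base, "CREATE TABLE log (msg TEXT)")  # only device 1 has a log table
with host:
    host.executemany("INSERT INTO devices VALUES (?, ?)", [(1, with_log), (2, base)])


def update_all(sql: str) -> None:
    """Run `sql` inside every device's database within one host transaction."""
    with host:  # commits on success, rolls back if any device fails
        rows = host.execute("SELECT id, params FROM devices ORDER BY id").fetchall()
        for device_id, blob in rows:
            new_blob, _ = run_on_blob(blob, sql)
            host.execute("UPDATE devices SET params = ? WHERE id = ?", (new_blob, device_id))


def intervals() -> list:
    """Read interval_s out of every device's embedded database."""
    blobs = host.execute("SELECT params FROM devices ORDER BY id").fetchall()
    return [run_on_blob(blob, "SELECT interval_s FROM settings")[1][0][0] for (blob,) in blobs]


update_all("UPDATE settings SET interval_s = 30")  # succeeds for both devices
after_success = intervals()

before = host.execute("SELECT params FROM devices ORDER BY id").fetchall()
try:
    update_all("INSERT INTO log VALUES ('rebooted')")  # works on device 1, fails on device 2
    failed = False
except sqlite3.OperationalError:
    failed = True
after_failure = host.execute("SELECT params FROM devices ORDER BY id").fetchall()
```

The second call fails on device 2, so the change already made to device 1's BLOB is rolled back with it and both rows keep their previous contents.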
Using /tmp on the postgres cluster (server) is a bit of a hack; it would be nicer to have memory-based SQLite blobs.
In terms of security, you get Postgres row level security, so each SQLite value is protected in the same way as the rest of the row.
Each of the columns that are databases would be updated when the functions execute.
You could do weird crap like INSERT/DELETE as part of a postgres level SELECT.
CREATE TABLE tenants (
    id       BIGINT NOT NULL,
    database SQLITE DEFAULT execute_sqlite(
        empty_sqlite(),
        'CREATE TABLE users (etc.)'
        -- and all the other tables
        -- for each tenant
    )
);
then they don't need to make joins between sqlite dbs.

Your other concerns are very real. Those sqlite dbs could become very large. I prefer the use case depicted in another reply: preparing sqlite dbs before shipping them to their own devices. Or maybe receiving them and performing analysis, maybe after having imported them into regular psql tables. Or similar scenarios in which the whole db is read or written at once. Anyway, once we have a tool we start using it.
Yea, I'd be fine with that - postgres has the concept of databases and schemas within those databases. If you really want to build a product like that I'd suggest starting with per-tenant schemas that leverage table inheritance as appropriate. The permissions would be pretty easy to manage.
Though, in a lot of the cases where I've actually seen this done, every client ends up with a dedicated server (or container - whatever tech you use to do it, something completely isolated from other instances) because user version management ends up being a huge issue. When you're building something that custom, it's highly likely that version migrations need to be done with client oversight to ensure everything actually works.
I have yet to find an actual real-world case where the inner-platform effect is the right solution. Usually when tools like that are selected, the software ends up being so generic and flexible that it's useless. Custom application/BI environment development relies on really judiciously telling users they can't have most features - with the hard part being figuring out which features are necessary and which ones you can cut to reduce bloat.
The per-tenant schema could be the tenant's responsibility. Most non-technical users can handle the idea of tables & columns, assuming you leverage UI/UX patterns they are already familiar with.
The extension could also provide custom index access methods (considering that SQLite only has a handful of column types in the first place). That would allow you to incorporate the keys in the index heaps as opposed to the table heaps, and boom, you get bitmap index scans for joins, i.e. GIN but with a bit more redundancy.
As long as we never add new features, never need to change how we map UI <-> Postgres DDL, and our users never make any mistakes when they change their tables, it could work without being a complexity nightmare.
So, instead of saving the org's client sqlite db to cloud storage, you save it to the centralized db column. LiteFS probably doesn’t support it yet, but it wouldn’t be too hard to add.
I normally don't like using JSONB when I could have a rigorous schema, but this sort of application seems reasonable.