Show HN: Embed an SQLite database in your PostgreSQL table

(github.com)

180 points frectonz | 2 comments | 19 Nov 24 11:02 UTC | HN request time: 0.555s | source

pglite-fusion is a PostgreSQL extension that allows you to embed SQLite databases into your PostgreSQL tables by enabling the creation of columns with the `SQLITE` type. This means every row in the table can have an embedded SQLite database.

In addition to the PostgreSQL `SQLITE` type, pglite-fusion provides the `query_sqlite`` function for querying SQLite databases and the `execute_sqlite` function for updating them. Additional functions are listed in the project’s README.

The pglite-fusion extension is written in Rust using the pgrx framework [1].

----

Implementation Details

The PostgreSQL `SQLITE` type is stored as a CBOR-encoded `Vec<u8>`. When a query is made, this `Vec<u8>` is written to a random file in the `/tmp` directory. SQLite then loads the file, performs the query, and returns the result as a table containing a single row with an array of JSON-encoded values.

The `execute_sqlite` function follows a similar process. However, instead of returning query results, it returns the contents of the SQLite file (stored in `/tmp`) as a new `SQLITE` instance.

[1] https://github.com/pgcentralfoundation/pgrx

Show context

robertclaus ◴[19 Nov 24 13:36 UTC] No.42183246[source]▶

>>42182146 (OP) #

What are the use cases for this? I can't imagine designing a database schemas to use this in a typical product. Is it intended for hybrid applications to back up local user data directly with their account info?

replies(4): >>42183347 #>>42183361 #>>42183493 #>>42185818 #

simonw ◴[19 Nov 24 13:47 UTC] No.42183361[source]▶

>>42183246 #

I can think of plenty.

The most interesting one for me is if you're running a SaaS product like Notion where your users create custom applications that manage their own small schema-based data tables.

Letting users create full custom PostgreSQL tables can get complex - do you want to manage tens of thousands of weird custom tables in a PostgreSQL schema somewhere?

I'd much rather manage tens of thousands of rows in a table where one of the columns is a BLOB with a little SQLite database in it.

replies(2): >>42183779 #>>42186863 #

munk-a ◴[19 Nov 24 14:27 UTC] No.42183779[source]▶

>>42183361 #

> Letting users create full custom PostgreSQL tables can get complex - do you want to manage tens of thousands of weird custom tables in a PostgreSQL schema somewhere?

Yea, I'd be fine with that - postgres has the concept of databases and schemas within those databases. If you really want to build a product like that I'd suggest starting with per-tenant schemas that leverage table inheritance as appropriate. The permissions would be pretty easy to manage.

Though, in a lot of cases I've actually seen this done every client ends up with a dedicated server (or container - whatever tech you use to do it, something completely isolated from other instances) because user version management ends up being a huge issue. When you're building something that custom it's highly likely that version migrations need to be done with client oversight to ensure everything actually works.

I have yet to find an actual real world case where the inner-platform effect is the right solution. Usually when tools like that are selected the software ends up being so generic and flexible that's it's useless. Custom application/BI environment development relies on really judiciously telling users they can't have most features - with the hard part being figuring out which features are necessary and which ones you can cut to reduce bloat.

replies(1): >>42184217 #

jitl ◴[19 Nov 24 15:09 UTC] No.42184217[source]▶

>>42183779 #

Notion has 100 million users, managing schema-per-tenant at our scale sounds like a complexity nightmare. We have 480+ identical schemas across 100+ Postgres hosts, and that already takes a lot of brainpower & engineering time to manage T_T

replies(1): >>42185066 #

bob1029 ◴[19 Nov 24 16:14 UTC] No.42185066[source]▶

>>42184217 #

> managing schema-per-tenant at our scale sounds like a complexity nightmare.

The per-tenant schema could be the tenant's responsibility. Most non-technical users can handle the idea of tables & columns, assuming you leverage UI/UX patterns they are already familiar with.

replies(1): >>42185386 #

jitl ◴[19 Nov 24 16:40 UTC] No.42185386[source]▶

>>42185066 #

Our UI looks like a table: https://www.notion.so/help/intro-to-databases

As long as we never add new features, never need to change how we map UI <-> Postgres DDL, and our users never make any mistakes when they change their tables, it could work without being a complexity nightmare

replies(1): >>42186593 #

mbesto ◴[19 Nov 24 18:24 UTC] No.42186593[source]▶

>>42185386 #

Curious - so how do you manage client-specific schemas then? Do you just have mappings in postgres (column1, column2, column3, etc.) or maybe store a client specific schema in bson per client?

replies(1): >>42189493 #

1. jitl ◴[20 Nov 24 00:14 UTC] No.42189493[source]▶

>>42186593 #

It's all JSON in two Postgres tables: `collection` which represents a Notion Database and has a `collection.schema` JSONB column, and `block` which has a `block.properties` JSONB column that stores the property values - the stuff in the Notion Database cells - for each row. We apply "schema on read" when querying or rendering a Notion Database, and we have a service on the backend that builds indexes/caches for hot collections on the fly. The service handles all the queries for collections larger than X rows. For smaller collections, we just give the client the whole thing modulo permissions and it does the query locally.

replies(1): >>42217123 #

2. mbesto ◴[22 Nov 24 20:25 UTC] No.42217123[source]▶

>>42189493 (TP) #

Thanks! Super helpful.

↑