
93 points | mbrt | 2 comments
Onavo
Congrats on reinventing the data lake? This is actually how most of the newer generations of "cloud native" databases work, where they separate compute and storage. The key is that they have a more sophisticated caching layer so that the latency cost of a query can be amortized across requests.
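
A minimal sketch of that read-through pattern, assuming boto3; the class and the in-memory dict are illustrative (real systems cache on local SSD with an eviction policy):

    import boto3

    class CachedObjectStore:
        """Read-through cache in front of object storage."""

        def __init__(self, bucket: str):
            self.s3 = boto3.client("s3")
            self.bucket = bucket
            self.cache: dict[str, bytes] = {}  # stand-in for a local SSD/LRU cache

        def get(self, key: str) -> bytes:
            if key not in self.cache:
                # Cache miss: pay the object-store round trip once...
                obj = self.s3.get_object(Bucket=self.bucket, Key=key)
                self.cache[key] = obj["Body"].read()
            # ...then amortize it across all later reads of the same key.
            return self.cache[key]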
mbrt
It's my understanding that the newer generation of data lakes still make use of a tiny, strongly consistent metadata database to keep track of what is where. This is orders of magnitudes smaller than what you'd have by putting everything in the same database, but it's still there. This is also the case in newer data streaming platforms (e.g. https://www.warpstream.com/blog/kafka-is-dead-long-live-kafk...).
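
To make the split concrete, here is roughly the shape of that metadata layer (the names are illustrative, not any particular system's schema): the strongly consistent store holds only small pointers, while the bulk bytes sit in object storage.

    from dataclasses import dataclass

    @dataclass
    class SegmentPointer:
        object_key: str    # where the actual data lives in object storage
        start_offset: int  # first record offset covered by this segment
        end_offset: int    # last record offset covered by this segment

    # The consistent metadata store is orders of magnitude smaller than
    # the data itself: one tiny row per multi-megabyte object.
    metadata: dict[str, list[SegmentPointer]] = {
        "topic-a": [
            SegmentPointer("topic-a/segment-000.bin", 0, 9_999),
            SegmentPointer("topic-a/segment-001.bin", 10_000, 19_999),
        ],
    }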

I'm curious to hear if you have examples of any database using only object storage as a backend, because back when I started, I couldn't find any.

eatonphil
> I'm curious to hear if you have examples of any database using only object storage as a backend, because back when I started, I couldn't find any.

Take a look at Delta Lake

https://notes.eatonphil.com/2024-09-29-build-a-serverless-ac...

mbrt
Wow, not sure how I missed this, but I see many similarities. They were also bitten by the lack of conditional writes in S3:

> In Databricks service deployments, we use a separate lightweight coordination service to ensure that only one client can add a record with each log ID.

The key difference is that Delta Lake implements MVCC and relies on a total ordering of transaction IDs, which I wanted to avoid because it creates forced synchronization points (multiple clients have to fight for IDs). This is certainly a trade-off: in my case you are forced to read the latest version or retry (but then you get strict serializability), while in Delta Lake you can rely on snapshot isolation, which may give you slightly stale but consistent data, and which minimizes retries on reads.
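
S3 has since gained conditional writes (If-None-Match on PutObject), so the lightweight coordination service can in principle be replaced by put-if-absent on the log ID itself. A minimal sketch of that commit step, assuming a recent boto3; the bucket, prefix, and key format are illustrative, loosely mirroring Delta's _delta_log layout:

    import boto3
    from botocore.exceptions import ClientError

    s3 = boto3.client("s3")
    BUCKET = "my-table-bucket"  # hypothetical
    LOG_PREFIX = "_log/"        # hypothetical

    def try_commit(version: int, payload: bytes) -> bool:
        """Try to claim log ID `version`; exactly one writer can win."""
        try:
            s3.put_object(
                Bucket=BUCKET,
                Key=f"{LOG_PREFIX}{version:020d}.json",
                Body=payload,
                IfNoneMatch="*",  # reject the write if the key already exists
            )
            return True
        except ClientError as e:
            if e.response["Error"]["Code"] == "PreconditionFailed":
                return False  # lost the race: re-read state, retry at version + 1
            raise

Every writer races for the same next ID, which is exactly the forced synchronization point described above.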

It also seems that you can't get transactions across different tables? Another interesting trade-off.