
61 points captaintobs | 1 comments | | HN request time: 0.2s | source
bradleybuda ◴[] No.41853756[source]
I really wish data engineers didn't have to hand-roll incremental materialization in 2024. This is really hard stuff to get right (as the post outlines) but it is absolutely critical to keeping latency and costs down if you're going to go all in on deep, layered, fine-grained transformations (which still seems to me to be the best way to scale a large / complex analytics stack).
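(A toy sketch of what "hand-rolling" looks like, in plain Python over SQLite with a hypothetical schema: the transform keeps a high-watermark timestamp and on each refresh re-reads only source rows newer than it, merging the deltas into the materialized table. Getting late data, updates, and watermark bookkeeping right across dozens of such models is exactly the hard part.)

```python
import sqlite3

# Hypothetical schema: raw events feed a materialized per-user total.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INT, amount REAL, updated_at INT);
CREATE TABLE user_totals (user_id INT PRIMARY KEY, total REAL, watermark INT);
""")

def incremental_refresh(conn):
    # High-watermark: newest source timestamp already folded into the target.
    (wm,) = conn.execute(
        "SELECT COALESCE(MAX(watermark), -1) FROM user_totals").fetchone()
    # Re-read ONLY rows newer than the watermark, not the whole table.
    rows = conn.execute(
        "SELECT user_id, SUM(amount), MAX(updated_at) FROM events "
        "WHERE updated_at > ? GROUP BY user_id", (wm,)).fetchall()
    for user_id, delta, max_ts in rows:
        conn.execute(
            "INSERT INTO user_totals VALUES (?, ?, ?) "
            "ON CONFLICT(user_id) DO UPDATE SET "
            "total = total + excluded.total, watermark = excluded.watermark",
            (user_id, delta, max_ts))

conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?)",
                 [(1, 7, 10.0, 100), (2, 7, 5.0, 101)])
incremental_refresh(conn)
conn.execute("INSERT INTO events VALUES (3, 7, 2.0, 102)")
incremental_refresh(conn)  # only the new row is read and merged
print(conn.execute(
    "SELECT total FROM user_totals WHERE user_id = 7").fetchone()[0])  # 17.0
```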

My prediction a few years back was that Materialize (or similar tech) would magically solve this - data teams could operate in terms of pure views and let the database engine differentiate their SQL and determine how to apply incremental (ideally streaming) updates through the view stack. While I'm in an adjacent space, I don't do this day-to-day so I'm not quite sure what's holding back adoption here - maybe in a few more years we'll get there.
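(The "differentiate their SQL" idea, sketched in toy Python rather than any real engine: instead of recomputing an aggregate view from scratch, the maintained state absorbs each insert/delete delta and forwards only the changed keys, so a view stacked on top can update incrementally too. The class and field names here are illustrative, not any vendor's API.)

```python
from collections import defaultdict

# Toy incremental view maintenance for SUM(amount) GROUP BY user_id.
class IncrementalSumByUser:
    def __init__(self):
        self.totals = defaultdict(float)  # maintained view state

    def apply_delta(self, deltas):
        # deltas: (user_id, amount, sign) where sign is +1 insert / -1 delete.
        for user_id, amount, sign in deltas:
            self.totals[user_id] += sign * amount
        # Emit only the changed keys so downstream views in the stack
        # can also update incrementally instead of rescanning.
        return {u for u, _, _ in deltas}

view = IncrementalSumByUser()
view.apply_delta([(1, 10.0, +1), (2, 3.0, +1)])
# An UPDATE is modeled as a delete of the old row plus an insert of the new.
changed = view.apply_delta([(1, 10.0, -1), (1, 4.0, +1)])
print(view.totals[1], changed)  # 4.0 {1}
```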

replies(2): >>41855171 #>>41856876 #
1. tessierashpool9 ◴[] No.41856876[source]
The Databricks Auto Loader for Delta Live Tables, with checkpointing and watermarking, comes to your rescue.
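(For readers unfamiliar with the two mechanisms named here, a concept sketch in plain Python - deliberately not the Databricks APIs: a checkpoint persists the read offset so a restart resumes rather than reprocessing, and a watermark drops events too far behind the newest seen timestamp so late data can't reopen finalized results. All names and the JSON state format are illustrative.)

```python
import json, os, tempfile

def process(batch, state_path, allowed_lateness=10):
    # Load checkpoint state if a previous run left one behind.
    state = {"offset": 0, "max_ts": 0}
    if os.path.exists(state_path):
        with open(state_path) as f:
            state = json.load(f)
    accepted = []
    for event in batch[state["offset"]:]:  # skip already-processed rows
        state["max_ts"] = max(state["max_ts"], event["ts"])
        watermark = state["max_ts"] - allowed_lateness
        if event["ts"] >= watermark:       # within the lateness bound
            accepted.append(event)
        # else: event is later than the watermark allows -> dropped
    state["offset"] = len(batch)
    with open(state_path, "w") as f:       # checkpoint after the batch
        json.dump(state, f)
    return accepted

path = os.path.join(tempfile.mkdtemp(), "ckpt.json")
events = [{"ts": 100}, {"ts": 95}, {"ts": 80}]  # ts=80 is too late
print([e["ts"] for e in process(events, path)])  # [100, 95]
print(process(events, path))  # [] - checkpoint skips already-read rows
```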