Our approach has been to take pieces of DF (including the SQL frontend and expression engine) and embed them in our own dataflow and operators. This lets us support low latency, distributed execution, watermark processing, and consistent checkpointing.
But the great thing about DF is that it’s designed as a toolkit for SQL-oriented data processing, so it’s relatively easy to pick out and use just the pieces you need.
Crossing my fingers for solutions like `https://github.com/feldera/feldera` to be wrapped in a nice database, for `https://materialize.com/` to solve their memory issues, or for `https://clickhouse.com/docs/en/materialized-view` to solve reliable streaming consumption.
Stream processing frameworks often have domain-specific languages with a lot of limitations on how you can express aggregations and transformations.
(As an aside, Feldera doesn't want to be embedded into your app, and neither does Materialize; ClickHouse might just pull a great streaming library out of nowhere, since they seem to be good at doing stuff like that.)
Disclaimer: I work at Materialize
Recently there have been major improvements in Materialize's memory usage, as well as in its use of disk to swap out some data.
I find it pretty easy to hook up to Postgres/MySQL/Kafka instances: https://materialize.com/blog/materialize-emulator/