
75 points dm03514 | 2 comments

Hello Everyone! We built SQLFlow as a lightweight stream processing engine.

We leverage DuckDB as the processing engine, which lets SQLFlow process tens of thousands of messages per second using ~250MiB of memory!

DuckDB also supports a rich ecosystem of sinks and connectors!

https://sql-flow.com/docs/category/tutorials/

https://github.com/turbolytics/sql-flow

We were tired of running JVMs for simple stream processing, and of writing bespoke one-off stream processors.

I would love your feedback, criticisms and/or experiences!

Thank you

mihevc No.46195958
How does this compare to https://github.com/Query-farm/tributary ?
replies(2): >>46196154 #>>46196322 #
1. dm03514 No.46196154
Oh yes!! I've seen this a couple of times. I am far from an expert in tributary, so please take this with a grain of salt.

Based on the tributary documentation, I understand that tributary embeds Kafka consumers into DuckDB. This makes DuckDB the main process that you run to perform consumption. I think this makes creating stream processing POCs very accessible. It looks like it is quite easy to start streaming data into DuckDB. What I don't see is a full story around DevOps, operations, testing, configuration as code, etc.

SQLFlow is a service that embeds DuckDB as the storage and processing brains. Because of this, we're able to offer metrics, testing utilities, pipelines as code, and all the other DevOps utilities that are necessary to run a huge number of streaming instances 24x7. SQLFlow was created as the tool that I wish I had for simple stream processing in production in high availability contexts :)
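
To make the "service that embeds a SQL engine" idea concrete, here is a rough sketch of a micro-batch loop: collect messages, load each batch into an in-memory table, run SQL over it, and emit the rows. This is not SQLFlow's actual code. It uses Python's built-in `sqlite3` as a stand-in for DuckDB so it runs without extra dependencies, and the names `BATCH_SQL`, `process_batch`, `run`, and the `city` field are all made up for illustration.

```python
import json
import sqlite3

# Hypothetical per-batch query. In a real pipeline this SQL would
# presumably come from a config file rather than being hard-coded.
BATCH_SQL = """
    SELECT city, COUNT(*) AS events
    FROM batch
    GROUP BY city
    ORDER BY city
"""


def process_batch(conn, messages):
    """Load one micro-batch of JSON messages into a table and run SQL on it."""
    conn.execute("DROP TABLE IF EXISTS batch")
    conn.execute("CREATE TABLE batch (city TEXT)")
    conn.executemany(
        "INSERT INTO batch (city) VALUES (?)",
        [(json.loads(m)["city"],) for m in messages],
    )
    return conn.execute(BATCH_SQL).fetchall()


def run(source, batch_size=3):
    """Group an iterable of messages into batches and yield each batch's rows."""
    conn = sqlite3.connect(":memory:")  # stand-in for an embedded DuckDB connection
    batch = []
    for message in source:
        batch.append(message)
        if len(batch) == batch_size:
            yield process_batch(conn, batch)
            batch = []
    if batch:  # flush the final partial batch
        yield process_batch(conn, batch)
    conn.close()
```

In this sketch the surrounding Python process owns the consumer loop and the SQL engine is only a library call, which is the shape that leaves room for metrics, tests, and config around it. A real service would read from Kafka or similar instead of a list and write results to a sink.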

replies(1): >>46196283 #
2. mihevc ◴[] No.46196283[source]
Nice! Thanks for the context, it's great to know!