
75 points | dm03514 | 1 comment

Hello everyone! We built SQLFlow as a lightweight stream processing engine.

We use DuckDB as the underlying processing engine, which lets SQLFlow process tens of thousands of messages per second using ~250 MiB of memory!

DuckDB also supports a rich ecosystem of sinks and connectors!

https://sql-flow.com/docs/category/tutorials/

https://github.com/turbolytics/sql-flow

We were tired of running JVMs for simple stream processing, and also of bespoke one-off stream processors.

I would love your feedback, criticisms and/or experiences!

Thank you

pulkitsh1234 No.46196866
(Not an expert in stream processing.) From the docs here https://sql-flow.com/docs/introduction/basics#output-sink it seems like this works on "batches" of data. How is this different from batch processing? Where is the "stream" here?
replies(1): >>46197175 #
1. dm03514 No.46197175
Ha, yes! A pipeline assumes a "batch" of data, which is backed by an ephemeral DuckDB in-memory table. The goal is to provide SQL table semantics and to implement pipelines so that the batch size can be changed without changing the pipeline logic.
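
To make the "batch table" idea concrete, here is a rough sketch, not SQLFlow's actual code. It uses Python's built-in `sqlite3` in-memory database as a stand-in for DuckDB so that it runs with no extra installs. The table name `batch`, the `city` column, and the helpers `run_batch` and `chunks` are assumptions made up for illustration.

```python
import sqlite3

# Hypothetical pipeline SQL: it only ever refers to a table named "batch".
PIPELINE_SQL = "SELECT city, COUNT(*) AS n FROM batch GROUP BY city ORDER BY city"


def run_batch(messages, sql=PIPELINE_SQL):
    """Load one batch into a throwaway in-memory table and run the SQL on it."""
    con = sqlite3.connect(":memory:")  # stand-in for an in-memory DuckDB database
    try:
        con.execute("CREATE TABLE batch (city TEXT)")
        con.executemany(
            "INSERT INTO batch VALUES (?)", [(m["city"],) for m in messages]
        )
        return con.execute(sql).fetchall()
    finally:
        con.close()  # the table disappears together with the connection


def chunks(items, size):
    """Split a list of messages into consecutive batches of at most `size`."""
    for i in range(0, len(items), size):
        yield items[i:i + size]


messages = [{"city": c} for c in ["NYC", "SF", "NYC", "LA"]]

# Same SQL, different batch sizes: only the chunking changes.
for size in (1, 4):
    print(size, [run_batch(batch) for batch in chunks(messages, size)])
```

The SQL string never changes between the two runs; only the number of rows loaded into the table before each query does, which is the property described above.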

The stream is achieved by the continuous flow of data from Kafka.

SQLFlow exposes a variable for batch size. With a batch size of 1, SQLFlow reads a Kafka message, applies the processor SQL logic, and then ensures the SQL results are successfully committed to the sink, one message after another.

SQLFlow provides at-least-once delivery guarantees. It will only commit the source message once it successfully writes to the pipeline output (sink).
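
The ordering described above (read, process, write to the sink, and only then commit the source) can be sketched roughly like this. It is a minimal illustration under stated assumptions, not SQLFlow's implementation: `FakeSource` is a hypothetical in-memory stand-in for a Kafka consumer, and `run_pipeline`, `process`, and `sink_write` are names invented for the example.

```python
class FakeSource:
    """Hypothetical in-memory stand-in for a Kafka consumer."""

    def __init__(self, messages):
        self.messages = list(messages)
        self.committed = 0  # index of the first message not yet acknowledged

    def read(self, batch_size):
        # Always reads from the last committed position, so anything that was
        # read but never committed is delivered again on the next read.
        return self.messages[self.committed:self.committed + batch_size]

    def commit(self, count):
        self.committed += count


def run_pipeline(source, process, sink_write, batch_size):
    """Process the source batch by batch, committing only after the sink write."""
    while True:
        batch = source.read(batch_size)
        if not batch:
            break
        results = process(batch)
        sink_write(results)        # if this raises, the commit below is skipped
        source.commit(len(batch))  # acknowledge only after a successful write


source = FakeSource(["a", "b", "c"])
sink = []
run_pipeline(source, lambda batch: [m.upper() for m in batch], sink.extend, batch_size=2)
print(sink, source.committed)
```

If the sink write fails, the exception propagates before `commit` runs, so a restart re-reads the same batch. Nothing is lost, but a batch can be written more than once, which is why this is at-least-once rather than exactly-once.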

https://sql-flow.com/docs/operations/handling-errors

The batch table is just a convention that allows the batch size to be configured seamlessly. If your throughput is low, or if you require message-by-message processing, SQLFlow can be set to a batch size of 1. If you need higher throughput and can tolerate the added latency, the batch size can be set higher.