116 points | ndhandala | 1 comment
drivenextfunc No.45085422
Has anyone used OpenTelemetry for long-running batch jobs? OTel seems designed for web apps where spans last seconds or minutes, but batch jobs run for hours or days. Since a span is only exported after it ends, there's no way to track progress during execution, which makes OTel nearly unusable for batch workloads.

I have a similar issue with Prometheus -- not great for batch job metrics either. It's frustrating how many otherwise excellent OSS tools are optimized for web applications but fall short for batch processing use cases.

replies(5): >>45086098 #>>45087174 #>>45092745 #>>45103972 #>>45107467 #
1. scottgg No.45092745
You could use span links for this. The idea is that you have a bunch of discrete traces, each indicating that it is downstream or upstream of some other trace. You’d just have to bend it a bit to work in your (probably single-process) batch executor!
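
A rough sketch of the span-link idea, in case it helps. This is a stdlib-only model, not the OpenTelemetry SDK: `SpanContext`, `Span`, `run_span`, `run_batch`, `process`, and the `exported` list are all hypothetical names made up for illustration. The point it tries to show is that each chunk of the batch becomes its own short trace that is exported as soon as the chunk finishes, and carries links back to the job's starting trace and to the previous chunk. With the real OpenTelemetry Python API, the rough equivalent would be passing `links=[trace.Link(span_context)]` when starting each chunk's span.

```python
import secrets
import time
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SpanContext:
    """Hypothetical stand-in for a span's identity (trace ID + span ID)."""
    trace_id: str
    span_id: str


@dataclass
class Span:
    """Hypothetical stand-in for a finished span."""
    name: str
    context: SpanContext
    links: list = field(default_factory=list)      # contexts of related traces
    attributes: dict = field(default_factory=dict)
    start: float = 0.0
    end: float = 0.0


# Stand-in for an exporter/backend: spans land here as soon as they end.
exported = []


def run_span(name, work, links=(), **attributes):
    """Run `work` inside a fresh single-span trace and export it when it ends."""
    ctx = SpanContext(trace_id=secrets.token_hex(16), span_id=secrets.token_hex(8))
    span = Span(name, ctx, list(links), dict(attributes), start=time.time())
    try:
        work()
    finally:
        span.end = time.time()
        exported.append(span)
    return ctx


def process(chunk):
    """Placeholder for the real per-chunk work."""
    sum(chunk)


def run_batch(items, chunk_size):
    """Emit one short trace per chunk instead of one span for the whole job."""
    # Marker trace for the job start; every chunk links back to it.
    job_ctx = run_span("batch.start", lambda: None, total=len(items))
    prev_ctx = None
    for offset in range(0, len(items), chunk_size):
        chunk = items[offset:offset + chunk_size]
        links = [job_ctx] if prev_ctx is None else [job_ctx, prev_ctx]
        prev_ctx = run_span(
            "batch.chunk",
            lambda: process(chunk),
            links=links,
            offset=offset,
            size=len(chunk),
        )
    return job_ctx


if __name__ == "__main__":
    job = run_batch(list(range(10)), chunk_size=4)
    for span in exported:
        print(span.name, span.attributes, "links:", len(span.links))
```

Because every chunk span ends quickly, progress shows up in the backend while the job is still running, and the links let you stitch the separate traces back into one logical job. How well a given tracing UI renders links is a separate question.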