
161 points by openWrangler | 1 comment

A common open source approach to observability begins with databases and visualization tools for telemetry - Grafana, Prometheus, Jaeger. But observability doesn't begin and end there: these tools require configuration and dashboard customization, and they may not actually pinpoint the data you need to mitigate system risks.

Coroot was designed to eliminate manual, time-consuming observability analysis: it handles the full observability journey, from collecting telemetry to turning it into actionable insights. We also strongly believe that simple observability is something everyone should be able to benefit from, which is why our software is open source.

Features:

- Cost monitoring to track and minimise your cloud expenses (AWS, GCP, Azure).

- SLO tracking with alerts to detect anomalies and compare them to your system’s baseline behaviour (a rough sketch of the idea follows this list).

- 1-click application profiling: see the exact line of code that caused an anomaly.

- Mapped timeframes (stop digging through Grafana to find when the incident occurred).

- eBPF automatically gathers logs, metrics, traces, and profiles for you.

- Service map to grasp a complete at-a-glance picture of your system.

- Automatic discovery and monitoring of every application deployment in your Kubernetes cluster.
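
To make the SLO bullet above concrete, here is a minimal sketch of the underlying idea: measure a service-level indicator over a window and compare it to a target. The metric values, window, and alerting hook are invented for illustration; this is not how Coroot itself is configured.

```go
// Hypothetical sketch of SLO tracking: compute an error-rate SLI over a
// window and compare it to a target. Numbers and names are illustrative,
// not Coroot's actual implementation.
package main

import "fmt"

// slo holds a target objective, e.g. 99.9% of requests succeed.
type slo struct {
	name   string
	target float64 // e.g. 0.999
}

// evaluate returns the measured SLI and whether the SLO is violated,
// given counts of good and total requests over the evaluation window.
func (s slo) evaluate(good, total float64) (sli float64, violated bool) {
	if total == 0 {
		return 1, false // no traffic: treat as compliant
	}
	sli = good / total
	return sli, sli < s.target
}

func main() {
	availability := slo{name: "checkout availability", target: 0.999}

	// Counts would normally come from collected telemetry; hardcoded here.
	sli, violated := availability.evaluate(99_820, 100_000)
	fmt.Printf("%s: SLI=%.4f violated=%v\n", availability.name, sli, violated)
}
```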

We welcome any feedback and hope the tool can improve your workflow!

esafak No.43625974
What's the data transformation story, for ML on metrics?
nikolay_sivko No.43626078
Coroot builds a model of each system, allowing it to traverse the dependency graph and identify correlations between metrics. On top of that, we're experimenting with LLMs for summarization — here are a few examples: https://oopsdb.coroot.com/failures/cpu-noisy-neighbor/
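
For intuition, a rough sketch of that idea (walking the dependency graph from the affected service and ranking candidate metrics by correlation with the symptom) might look like the following. The topology, metric names, and data are made up; this illustrates the concept, not Coroot's actual code.

```go
// Rough sketch: traverse the dependency graph from the affected service and
// rank candidate metrics on its (transitive) dependencies by correlation with
// the symptom metric. All names and values here are hypothetical.
package main

import (
	"fmt"
	"math"
)

var deps = map[string][]string{ // service -> services it depends on
	"frontend": {"checkout", "catalog"},
	"checkout": {"payments"},
}

var metrics = map[string][]float64{ // per-service candidate metric (fake data)
	"checkout": {30, 32, 90, 95, 31},
	"catalog":  {50, 51, 49, 52, 50},
	"payments": {1, 1, 9, 10, 1},
}

// pearson computes the correlation coefficient between two equal-length series.
func pearson(a, b []float64) float64 {
	n := float64(len(a))
	var sa, sb, saa, sbb, sab float64
	for i := range a {
		sa, sb = sa+a[i], sb+b[i]
		saa, sbb, sab = saa+a[i]*a[i], sbb+b[i]*b[i], sab+a[i]*b[i]
	}
	den := math.Sqrt(n*saa-sa*sa) * math.Sqrt(n*sbb-sb*sb)
	if den == 0 {
		return 0
	}
	return (n*sab - sa*sb) / den
}

func main() {
	symptom := []float64{100, 110, 400, 420, 105} // frontend latency, ms

	// Breadth-first walk of the dependency graph from the symptomatic service.
	queue, seen := []string{"frontend"}, map[string]bool{"frontend": true}
	for len(queue) > 0 {
		svc := queue[0]
		queue = queue[1:]
		for _, dep := range deps[svc] {
			if seen[dep] {
				continue
			}
			seen[dep] = true
			queue = append(queue, dep)
			fmt.Printf("corr(%s, frontend latency) = %.2f\n", dep, pearson(metrics[dep], symptom))
		}
	}
}
```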
esafak No.43626528
That looks like a built-in feature. I'm asking about extensibility. How do we use custom metrics transformations (libraries), for example?
nikolay_sivko No.43626570
Currently, you can define custom SLIs (Service Level Indicators, such as service latency or error rate) for each service using PromQL queries. In the future, you'll be able to define custom metrics for each application, including explanations of their meaning, so they can be leveraged in Root Cause Analysis.
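
As a rough illustration of what such a PromQL-based SLI can look like: the metric names and labels below are hypothetical, and the snippet evaluates the expression against Prometheus' standard HTTP API directly rather than through Coroot's configuration, purely to show the query itself.

```go
// Hedged sketch: a custom latency SLI expressed as PromQL, evaluated via
// Prometheus' /api/v1/query endpoint. Metric names and labels are made up.
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
)

func main() {
	// Share of checkout requests served in under 500ms over the last 5 minutes.
	sli := `sum(rate(http_request_duration_seconds_bucket{service="checkout",le="0.5"}[5m]))
  /
sum(rate(http_request_duration_seconds_count{service="checkout"}[5m]))`

	// /api/v1/query is Prometheus' standard instant-query endpoint.
	resp, err := http.Get("http://localhost:9090/api/v1/query?query=" + url.QueryEscape(sli))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	body, _ := io.ReadAll(resp.Body)
	fmt.Println(string(body)) // JSON result containing the current SLI value
}
```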