Datadog's $65M/year customer mystery solved

(blog.pragmaticengineer.com)

ljm ◴[30 Jun 25 20:24 UTC] No.44427444[source]▶

I wonder how much that no-expense-spared, money-is-no-object attitude to buying SaaS impacts an engineer's ability to make sensible decisions around infra and architecture. Coinbase might have been fine blowing $65 million, but take that approach to a new startup and you could trivially eat up a significant amount of runway with it.

I won’t single out Datadog on this because the exact same thing happens with cloud spend, and it’s very literally burning money.

replies(4): >>44427650 #>>44428240 #>>44428533 #>>44428683 #

1. swyx ◴[30 Jun 25 20:44 UTC] No.44427650[source]▶

>>44427444 #

the visible cost of burning runway on a bill is very often far less than the invisible cost of burning engineer time rebuilding undifferentiated heavy lifting rather than working on product/customer needs

replies(4): >>44427744 #>>44427799 #>>44429854 #>>44430384 #

2. 9283409232 ◴[30 Jun 25 20:52 UTC] No.44427744[source]▶

>>44427650 (TP) #

People say this, but I wonder about it from time to time. I don't think anyone is asking to rebuild Datadog from scratch for your company, but surely it's worth migrating to something less expensive even if it takes a bit of elbow grease.

replies(1): >>44428690 #

3. pphysch ◴[30 Jun 25 20:57 UTC] No.44427799[source]▶

>>44427650 (TP) #

Most of the complexity in observability is client-side.

It is not hard to spin up Grafana and VictoriaMetrics (and now VictoriaLogs) and keep them running. It is not hard to build a Grafana dashboard that correlates data across both metrics and logs sources, and alerting functionality is pretty good now.

The "heavy lift" is instrumenting your applications and infrastructure to provide valuable metrics and logs without exceeding a performance budget. I'm skeptical that Datadog actually does much of that heavy lifting and that they are actually worth the money. You can probably cut costs by 10x with the same or better outcomes by paying for managed Grafana + managed DBs and a couple of FTEs as observability experts.

replies(2): >>44428025 #>>44433840 #

4. lerchmo ◴[30 Jun 25 21:22 UTC] No.44428025[source]▶

>>44427799 #

You could hire 100 people to manage your timeseries data and save 70%
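
A rough sanity check of that claim, as a sketch only. It assumes the $65M/year figure from the thread title is the bill in question, and it reads "save 70%" as meaning the remaining 30% pays for the 100 hires. The function name `per_head_budget` is made up for illustration.

```python
def per_head_budget(annual_bill: float, savings_percent: int, headcount: int) -> float:
    """Budget per hire if the unsaved share of the bill is split evenly across the team."""
    if headcount <= 0:
        raise ValueError("headcount must be positive")
    # Share of the old bill that is still spent after the claimed savings.
    remaining_spend = annual_bill * (100 - savings_percent) / 100
    return remaining_spend / headcount


# Assumed inputs: $65M bill (thread title), 70% savings and 100 hires (the comment above).
budget = per_head_budget(65_000_000, 70, 100)
print(f"${budget:,.0f} per person per year")
```

Under these assumptions each hire would have to cost no more than that figure all-in, with nothing left over for the replacement infrastructure itself.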

5. closeparen ◴[30 Jun 25 22:38 UTC] No.44428690[source]▶

>>44427744 #

Assuming there's nothing else you could do with that elbow grease that would create more value than the SaaS bill costs.

replies(1): >>44429796 #

6. 9283409232 ◴[01 Jul 25 01:46 UTC] No.44429796{3}[source]▶

>>44428690 #

Value is not a hard science. I've seen people shelve tech debt in favor of work on a feature that no one ends up using.

replies(2): >>44429841 #>>44430241 #

7. nemothekid ◴[01 Jul 25 01:54 UTC] No.44429841{4}[source]▶

>>44429796 #

1. Leadership doesn’t want to burn engineer cycles on undifferentiated features.

2. Management doesn’t get recognized for working on undifferentiated features.

3. Engineers working on undifferentiated features aren’t recognized when looking for new jobs.

Saving money "makes sense", but getting people to actually prioritize it is hard.

replies(1): >>44431152 #

8. QuinnyPig ◴[01 Jul 25 01:56 UTC] No.44429854[source]▶

>>44427650 (TP) #

This is very well stated.

replies(1): >>44436361 #

9. ◴[01 Jul 25 03:15 UTC] No.44430241{4}[source]▶

>>44429796 #

10. wavemode ◴[01 Jul 25 03:45 UTC] No.44430384[source]▶

>>44427650 (TP) #

I wouldn't really say "very often". Occasionally, perhaps.

Even from a pure zero-sum mathematical perspective, it can make sense to invest as much as 2 or 3 months of engineer time in cloud cost-saving measures. If the engineer is making $200K, that's roughly a $30,000 to $50,000 investment. When you see the eye-watering cloud bills many startups have, you realize that this investment is peanuts compared to the potential savings over the next several years.

And then you also have to keep in mind that these things are usually not actually zero-sum. The engineer could be new, and working on the efficiency project helps them onboard to your stack. It could be the case that customers are complaining (or could start complaining in the future) about how slow your product is, so you actually improve the product by improving the infrastructure. Or it could just be the very common case that there isn't actually a higher-value thing for that engineer to be working on at that time.
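
The arithmetic in the first paragraph can be sketched as below. The $200K salary and the 2 to 3 month range come from the comment; the $10,000/month savings figure is purely a hypothetical input for illustration, and salary is used as the only cost (no benefits or overhead), as the comment does.

```python
def engineer_cost(annual_salary: float, months: float) -> float:
    """Salary cost of dedicating one engineer for the given number of months."""
    return annual_salary * months / 12


def payback_months(investment: float, monthly_savings: float) -> float:
    """Months of savings needed to recoup the up-front investment."""
    if monthly_savings <= 0:
        raise ValueError("monthly_savings must be positive")
    return investment / monthly_savings


low = engineer_cost(200_000, 2)   # two months of a $200K salary
high = engineer_cost(200_000, 3)  # three months of a $200K salary
print(f"investment: ${low:,.0f} to ${high:,.0f}")

# Hypothetical: the project trims the cloud bill by $10,000 per month.
print(f"payback at $10K/month saved: {payback_months(high, 10_000):.1f} months")
```

The payback period scales inversely with the monthly savings, so the larger the bill being trimmed, the faster the same few months of work pay for themselves.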

replies(1): >>44432140 #

11. uaas ◴[01 Jul 25 06:26 UTC] No.44431152{5}[source]▶

>>44429841 #

Well, saving money is a differentiator, and one of the best things an engineer can put on their CV.

12. happymellon ◴[01 Jul 25 09:34 UTC] No.44432140[source]▶

>>44430384 #

> It could be the case that customers are complaining (or could start complaining in the future) about how slow your product is

If Jira has taught me anything, it's that ignoring customers when they complain it's too slow makes financial sense.

13. ljm ◴[01 Jul 25 13:41 UTC] No.44433840[source]▶

>>44427799 #

I used to be quite fond of Datadog but after one or two completely surprising bills (thanks to their granular but unintuitive pricing model), I wouldn't recommend it to anybody any more. If I were more cynical I would say the pricing model is designed to be confusing so customers spend more than they need, and this is only made worse by the extreme breadth of the platform now.

These days I'd suggest just sucking it up, spinning up a Grafana box, and wiring up OpenTelemetry.

14. swyx ◴[01 Jul 25 17:49 UTC] No.44436361[source]▶

>>44429854 #

haha thanks Corey, i echo the best (you)
