
507 points martinald | 1 comment
moduspol No.45052841
This kind of presumes you're just cranking out inference non-stop 24/7 to get the estimated price, right? Or am I misreading this?

In reality, presumably they have to support fast inference even during peak usage times, but then the hardware is sitting idle during off-peak times. I guess they can power it off, but that's a significant difference from paying $2/hr to an all-in IaaS provider.

I'm also not sure we should expect their costs to just be "in-line with, or cheaper than" what various hourly H100 providers charge. Those providers presumably don't have to run entire datacenters filled to the gills with these specialized GPUs. It may be a lot more expensive to do that than to run a handful of them spread among the same datacenter with your other workloads.
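As a rough back-of-the-envelope sketch of the utilization point (the $2/hr rental figure is from this thread; the ownership cost and utilization percentages below are made-up assumptions for illustration), the effective price of a useful GPU-hour scales inversely with how busy the hardware actually is:

    # Sketch: how utilization changes the effective cost of an owned GPU-hour.
    # The $2.00/hr rental rate comes from the thread; the ownership cost and
    # utilization figures are illustrative assumptions, not real numbers.

    RENTAL_RATE = 2.00          # $/GPU-hour from an hourly H100 provider (per the thread)
    OWNED_COST_PER_HOUR = 1.40  # assumed all-in $/GPU-hour to run owned hardware 24/7

    for utilization in (1.00, 0.60, 0.35):
        # If the GPU serves traffic only `utilization` of the time but is paid
        # for around the clock, the cost per hour of useful inference goes up.
        effective = OWNED_COST_PER_HOUR / utilization
        comparison = "cheaper than" if effective < RENTAL_RATE else "more expensive than"
        print(f"utilization {utilization:>4.0%}: effective ${effective:.2f}/useful GPU-hour "
              f"({comparison} the ${RENTAL_RATE:.2f}/hr rental)")

Under these assumed numbers, owned hardware only beats the hourly rental while utilization stays high; once the fleet is sized for peak and idles off-peak, the effective rate climbs past it.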

replies(4): >>45053067 >>45053222 >>45053374 >>45053784
1. empath75 No.45053784
> In reality, presumably they have to support fast inference even during peak usage times, but then the hardware is sitting idle during off-peak times. I guess they can power it off, but that's a significant difference from paying $2/hr to an all-in IaaS provider.

They can repurpose those nodes for training when they aren't being used for inference. Or if they're using public cloud nodes, just turn them off.
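Continuing the same toy arithmetic (again with assumed numbers), repurposing idle inference capacity for training is economically equivalent to raising utilization, which is what pulls the effective cost of owned hardware back down:

    # Toy extension of the sketch above: treat training backfill as extra utilization.
    # All figures are assumptions for illustration only.

    OWNED_COST_PER_HOUR = 1.40    # assumed all-in $/GPU-hour to run owned hardware 24/7
    inference_utilization = 0.45  # assumed fraction of the day serving inference traffic
    training_backfill = 0.45      # assumed fraction of the day repurposed for training jobs

    useful_fraction = inference_utilization + training_backfill
    effective_cost = OWNED_COST_PER_HOUR / useful_fraction

    print(f"useful work {useful_fraction:.0%} of the day -> "
          f"effective ${effective_cost:.2f} per useful GPU-hour")

With the same assumed 45% inference load, backfilling another 45% of the day with training brings the effective rate back under the $2/hr rental benchmark.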