We were colocated in large data centers right on the major IX, with redundancy. All of this was accounted for in AWS’s TCO model. We had a better switch fabric than is typical for the cloud, but that didn’t materially contribute to cost. We were using AWS for overflow capacity whenever demand exceeded what our own infrastructure could handle at the time; they wanted us to move our primary workload there.
The difference in cost could be attributed mostly to the server hardware build, and to a lesser extent to the better scalability enabled by a better network. In this case, we ended up working with Quanta on servers that had everything we needed and nothing we didn’t, optimizing heavily for bandwidth/$. We worked directly with storage manufacturers to find SKUs that stripped out features we didn’t need and optimized for cost per byte, given our device write throughput and durability requirements. They all have hundreds of custom SKUs that they don’t publicly list; you just have to ask. A hidden factor is that the software was designed to take advantage of hardware that most enterprises would not deign to use for high-performance applications. There was a bit of supply chain management, but we did this as a startup buying not that many units. The final core server configuration cost us just under $8k each delivered, outperformed every off-the-shelf server at twice the price, and essentially wasn’t something you could purchase in the cloud (and still isn’t). These servers were brilliant, bulletproof, and exceptionally performant for our use case. You can model out the economics of this, and the zero-crossing (the point where owning beats renting) shows up at a lower burn rate than I think many people imagine.
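To make that concrete, here is a minimal sketch of the kind of break-even model I mean, in Python rather than Excel. Every number in it (instance pricing, amortization period, power/space rates, ops overhead) is an illustrative placeholder, not one of our actual figures.

```python
# Minimal sketch of the colo-vs-cloud break-even ("zero-crossing") model.
# Every number here is an illustrative placeholder, not an actual figure.

def cloud_monthly_cost(servers, instance_month=1200.0):
    """Monthly cost of renting roughly equivalent capacity."""
    return servers * instance_month

def colo_monthly_cost(servers,
                      server_capex=8000.0,        # ~$8k per server, delivered
                      amort_months=36,            # straight-line amortization
                      power_space_per_server=150.0,
                      bandwidth_commit=2000.0,    # flat monthly commit
                      ops_overhead=15000.0):      # people, remote hands, spares
    """Monthly cost of owning: amortized CapEx plus OpEx."""
    capex = servers * server_capex / amort_months
    opex = servers * power_space_per_server + bandwidth_commit + ops_overhead
    return capex + opex

# Sweep fleet size to find where owning crosses below renting.
for n in range(1, 500):
    if colo_monthly_cost(n) < cloud_monthly_cost(n):
        print(f"zero-crossing at ~{n} servers: "
              f"colo ${colo_monthly_cost(n):,.0f}/mo vs cloud ${cloud_monthly_cost(n):,.0f}/mo")
        break
```

Under these made-up numbers the crossing lands around twenty servers. The structure is the point: owning has a much smaller per-server slope plus a fixed floor, so the break-even arrives once the fleet is large enough to amortize that floor.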
We were extremely effective at using storage, and we did not attach it to expensive, overpowered servers where the CPUs would have been sitting idle anyway. The sweet spot was low-clock, high-core-count CPUs, which typically sit at a low-to-mid price point but offer optimal performance per dollar if you can effectively scale the software to the core count. Since the software architecture was thread-per-core, core count was not a bottleneck. The economics have not shifted much over time.
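A rough way to see the performance-per-dollar argument, assuming throughput really does scale roughly with core count under a thread-per-core design; the SKU names and prices below are invented for illustration, so plug in real quotes before trusting the output:

```python
# Crude perf-per-dollar comparison, assuming throughput scales roughly linearly
# with cores * clock under a thread-per-core design. SKU names and prices are
# invented for illustration only.

skus = [
    # (description, cores, base_clock_ghz, price_usd)
    ("low-clock / high-core", 32, 2.1, 1100),
    ("mid-range",             16, 2.9,  900),
    ("high-clock / low-core",  8, 3.8,  800),
]

for name, cores, clock, price in skus:
    throughput_proxy = cores * clock   # ignores IPC, memory bandwidth, etc.
    print(f"{name:22s} {throughput_proxy / price * 1000:5.1f} units of throughput per $1k")
```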
AWS uses the same pricing model as everyone else in the server leasing game. Roughly speaking, you set your prices to recover your CapEx in about 6 months of utilization. Ignoring overhead, doing it ourselves pulled that payback closer to 1.5-2 months for the same burn. This moves a lot of the cost structure to things like power, space, and bandwidth. We were definitely paying more for space and power than AWS (and usually less for bandwidth), but not nearly enough to offset our huge CapEx advantage relative to workload.
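The arithmetic that implies is worth spelling out; a quick sketch using only the payback periods quoted above, from which the rough 3-4x advantage on the hardware component falls straight out:

```python
# The arithmetic implied by those payback periods (illustrative).
cloud_payback_months = 6.0        # cloud pricing: CapEx recovered in ~6 months
own_payback_months = (1.5, 2.0)   # owning: the same burn pays the hardware back this fast

for m in own_payback_months:
    advantage = cloud_payback_months / m
    print(f"payback in {m} months vs {cloud_payback_months:.0f} -> ~{advantage:.1f}x "
          f"more hardware per dollar of burn")
```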
All of this can be modeled out in Excel. No one does it anymore, but I am from a time when it was common, so I have that skill in my back pocket. It isn’t nearly as much work as it sounds; most of the details are formulaic. You do need good data on how your workload uses hardware resources to know what to build.
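As a sense of what “formulaic” means here, a minimal sketch of the sizing step, the part where workload data turns into a build; every figure in it is a hypothetical placeholder:

```python
# Sketch of the formulaic core of such a spreadsheet: measured workload numbers in,
# build spec and cost out. All workload figures and per-node capacities below are
# hypothetical placeholders.
import math

workload = {
    "ingest_gbps": 40,        # sustained write throughput
    "stored_tb": 800,         # durable data set size
    "replication": 3,
}

node = {
    "usable_nic_gbps": 20,    # what one server can actually sustain
    "usable_storage_tb": 60,
    "cost_delivered": 8000,
}

bandwidth_bound = math.ceil(workload["ingest_gbps"] / node["usable_nic_gbps"])
storage_bound = math.ceil(workload["stored_tb"] * workload["replication"]
                          / node["usable_storage_tb"])
fleet = max(bandwidth_bound, storage_bound)

print(f"bandwidth-bound: {bandwidth_bound} nodes, storage-bound: {storage_bound} nodes")
print(f"buy {fleet} nodes, ~${fleet * node['cost_delivered']:,} in server CapEx")
```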