
188 points ilove_banh_mi | 3 comments
stego-tech No.42172794
As others have already hit upon, the problem forever lies in the standardization of whatever is intended to replace TCP in the data center, or the lack thereof. You’re basically looking for a protocol supported in hardware from endpoint to endpoint, including in firewalls, switches, routers, load balancers, traffic shapers, proxies, et cetera - a very tall order indeed. Then, to add to that very expensive list of criteria, you also need the talent to support it - engineers who know it just as thoroughly as the traditional TCP/IP stack and Ethernet frames, but now with the added specialty of data center tuning. Then you also need the software to support and understand it, which is up to each vendor and out of your control - unless you wrap or encapsulate it in TCP/IP anyway, in which case there go all the nifty features you wanted from such a standard.

By the time all of the proverbial planets align, all but the most niche or cutting-edge customer is looking at a project whose total cost could fund 400G endpoint bandwidth, with the associated backhaul and infrastructure to support it, twice over. It’s the problem of diminishing returns against the problem of entrenchment: nobody is saying modern TCP is great for the kinds of datacenter workloads we’re building today, but the cost of solving those problems is prohibitively expensive for all but the most entrenched public cloud providers out there, and they’re not likely to “share their work,” as it were. Even when they do (e.g., Google with QUIC), the broad vibe I get is that folks are unlikely to trust that those offerings lack ulterior motives.

replies(3): >>42173315 #>>42174871 #>>42179283 #
klysm No.42173315
If anybody is gonna do it, it's gonna be someone like Amazon that vertically integrates through most of the hardware.
replies(2): >>42173489 #>>42173592 #
stego-tech No.42173489
That’s my point: TCP in the datacenter remains a 1% problem, in the sense that only 1% of customers actually have this as a problem, and only 1% of those have the ability to invest in a solution. At that point, market conditions incentivize protecting their work and selling it to others (e.g., Public Cloud Service Providers) as opposed to releasing it into the wild as its own product line for general purchase (e.g., Cisco). It’s also why their solutions aren’t likely to ever see widespread adoption, as they built their solution for their infrastructure and their needs, not a mass market.
replies(2): >>42174545 #>>42174831 #
wbl No.42174545
Nevertheless, InfiniBand exists.
replies(2): >>42174698 #>>42176020 #
MichaelZuo No.42174698
Which makes the prospects of a replacement for either or both even more unlikely.
stego-tech No.42176020
As do Fibre Channel and a myriad of other solutions out there. The point wasn’t to invite every “but X exists” or “but Company A said in this blog post they solved it” response out of the fog, but to point out that these issues are incredibly fringe to begin with, yet make the rounds a few times a year whenever someone has an “epiphany” about the inefficiencies of TCP/IP in their edgiest of edge-case scenarios.

TCP isn’t the most efficient protocol, sure, but it survives and thrives because of its flexibility, cost, and broad adoption. For everything else, there’s undoubtedly something that already exists to solve your specific gripe about it.