Pixar's Render Farm | slacker news

Kubernetes can't do it at this scale "trivially".

Firstly, K8s has no concept of licenses, and it is exceptionally weak on dependencies. A job graph for a VFX job can be well over 100k nodes, something that would crash K8s.
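To make the license-plus-dependency point concrete, here is a toy sketch of what a farm scheduler has to decide on every pass: a task can start only when its upstream tasks are done and a seat of the license it needs is free. This is not how Tractor works internally; the function name `dispatch_order`, the task names, and the license names are all made up for illustration, and it assumes every task in a wave finishes before the next wave starts.

```python
def dispatch_order(tasks, licenses):
    """Group tasks into waves that respect dependencies and license seats.

    tasks:    {task_name: (list_of_upstream_task_names, license_name_or_None)}
    licenses: {license_name: number_of_seats}
    Simplifying assumption: all tasks in a wave finish (and release their
    seats) before the next wave is planned.
    """
    remaining = {name: set(deps) for name, (deps, _) in tasks.items()}
    done = set()
    waves = []
    while remaining:
        free = dict(licenses)          # seats available in this wave
        wave = []
        for name in sorted(remaining):
            if remaining[name] - done:
                continue               # an upstream task has not finished
            lic = tasks[name][1]
            if lic is not None:
                if free.get(lic, 0) == 0:
                    continue           # no free seat, wait for a later wave
                free[lic] -= 1
            wave.append(name)
        if not wave:
            raise ValueError("stuck: dependency cycle or license with no seats")
        for name in wave:
            del remaining[name]
        done.update(wave)
        waves.append(wave)
    return waves


# Hypothetical four-task job: one sim, two renders, one comp.
tasks = {
    "sim": ([], "houdini"),
    "render_a": (["sim"], "renderman"),
    "render_b": (["sim"], "renderman"),
    "comp": (["render_a", "render_b"], None),
}
print(dispatch_order(tasks, {"houdini": 1, "renderman": 1}))
print(dispatch_order(tasks, {"houdini": 1, "renderman": 2}))
```

With one renderman seat the two renders run one after the other; with two seats they share a wave. Now imagine the same decision over a graph of 100k+ nodes, many times a second.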

Secondly, Tractor (https://rmanwiki.pixar.com/display/TRA/Tractor+2) is exceptionally fast at dispatching jobs to machines. I suspect it's on the order of 50k a second, if not more.

Thirdly, getting K8s to talk to 25k machines without saturating the network is almost impossible.

Fourthly, it doesn't do too well on a "normal" network either. Try getting decent network throughput on one of K8s' batshit networking schemes (each server on a farm will have at a minimum two 10 gig links, more likely two 40 gig).
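For a sense of what "decent throughput" means per host, a quick back-of-the-envelope conversion of those link counts into raw line rate. The helper name `host_bandwidth_gbytes` is made up, and it ignores protocol overhead and bonding inefficiency, so real numbers will be somewhat lower.

```python
def host_bandwidth_gbytes(links, gbit_per_link):
    """Raw line rate per host in gigabytes per second (8 bits per byte)."""
    return links * gbit_per_link / 8


print(host_bandwidth_gbytes(2, 10))  # 2.5 GB/s for two 10 gig links
print(host_bandwidth_gbytes(2, 40))  # 10.0 GB/s for two 40 gig links
```

Any overlay network that can't get close to those figures is leaving farm hardware idle.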