It assumes that local models are inherently worse. From a software perspective that's nonsense, because there is no reason the local deployment couldn't run the exact same software. From a hardware perspective, the argument would have to be that the centralized system uses more expensive hardware, but there are two ways around that. The first is to trade speed for cost: x86 servers are slower than GPUs, but they can run huge models because they support terabytes of memory. The second is that you can, of course, buy high-end local hardware, as many enterprises might choose to do, especially when they have enough internal users to keep it busy.
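To make the memory point concrete, here is a rough back-of-the-envelope sketch. The parameter counts, quantization levels, and the 80 GB accelerator / 2 TB server thresholds are assumed round numbers for illustration, not measurements; the point is just that weight memory scales with parameter count times bytes per weight, which is why a box with terabytes of RAM can hold models that a single accelerator card cannot.

```python
# Back-of-the-envelope memory math: why TB-scale system RAM lets a CPU
# server hold models that won't fit on a single GPU. All numbers below
# (model sizes, bytes per weight, 80 GB / 2 TB limits) are illustrative
# assumptions, and KV cache / activations are ignored.

def model_memory_gb(params_billion: float, bytes_per_weight: float) -> float:
    """Approximate weight memory in GB: parameters x bytes per weight."""
    return params_billion * 1e9 * bytes_per_weight / 1e9

for params in (70, 405, 1000):                 # assumed model sizes, in billions of parameters
    for name, bpw in (("fp16", 2.0), ("int8", 1.0), ("int4", 0.5)):
        gb = model_memory_gb(params, bpw)
        fits_gpu = gb <= 80                    # one 80 GB accelerator card
        fits_cpu = gb <= 2048                  # an x86 server with 2 TB of RAM
        print(f"{params:>5}B @ {name}: {gb:>7.0f} GB  "
              f"single GPU: {'yes' if fits_gpu else 'no':<3}  "
              f"2TB server: {'yes' if fits_cpu else 'no'}")
```

Under these assumptions, a 1-trillion-parameter model at fp16 needs on the order of 2 TB for weights alone, far beyond any single accelerator but within reach of a big-memory x86 box, at the price of much lower throughput.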