Alibaba Cloud claims to have reduced the number of Nvidia GPUs used for serving unpopular models by 82% (emphasis mine):
> 17.7 per cent of GPUs allocated to serve only 1.35 per cent of requests in Alibaba Cloud’s marketplace, the researchers found
Instead of 1192 GPUs, they now use 213 to serve those requests.
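
For anyone who wants to check that the two GPU counts line up with the 82% headline figure, here is a quick sanity check. The function name is my own, purely for illustration; the only inputs are the 1192 and 213 figures quoted above.

```python
def gpu_reduction_pct(before: int, after: int) -> float:
    """Percentage reduction in GPU count going from `before` to `after`."""
    return (before - after) / before * 100


# Figures from the comment above: 1192 GPUs before, 213 after.
reduction = gpu_reduction_pct(1192, 213)
print(f"{reduction:.1f}% fewer GPUs")  # prints "82.1% fewer GPUs"
```

So the 82% applies to the GPUs serving those long-tail requests, not to the whole fleet.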
replies(5):