I'm not sure how neutral the following link is or what benchmarks it uses, but T5 seems to sit a lot higher on this leaderboard?
In the end these are mathematical models, so what would prevent someone from loading T5 onto a machine with plenty of RAM (like a server)? Would the codebase truly require that much refactoring? How difficult would it be to rewrite the model architecture as a set of mathematical equations (Einstein summation) and reimplement inference for CPU?
Anyway, T5 being available for download from Huggingface only makes my question more pertinent...
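To make the "just a set of equations" point concrete, here is a minimal sketch of scaled dot-product attention written as Einstein summations in NumPy, which runs on any CPU. This is purely illustrative; the shapes and names are mine, not T5's actual code, and a real reimplementation would also need the relative position biases, layer norms, and feed-forward blocks.

```python
# Illustrative sketch: scaled dot-product attention via Einstein summation.
# Not T5's real code; shapes (batch, heads, seq, dim) are assumptions.
import numpy as np

def attention(q, k, v):
    # q, k, v: (batch, heads, seq, dim)
    scores = np.einsum("bhqd,bhkd->bhqk", q, k) / np.sqrt(q.shape[-1])
    # numerically stable softmax over the key dimension
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return np.einsum("bhqk,bhkd->bhqd", weights, v)

# Tiny random example just to show it runs on CPU
rng = np.random.default_rng(0)
q = rng.standard_normal((1, 2, 4, 8))
k = rng.standard_normal((1, 2, 4, 8))
v = rng.standard_normal((1, 2, 4, 8))
print(attention(q, k, v).shape)  # (1, 2, 4, 8)
```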
Most people don't have the hardware or budget to access these specialized high-VRAM GPUs.
Does it happen to run on CPU on a server with 96 GB of RAM?
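For what it's worth, here is a rough sketch of what CPU-only inference looks like with the standard Hugging Face transformers API. The checkpoint name below is just an example; the 11B variant uses the same calls but needs on the order of 44 GB of RAM in fp32, so 96 GB should be enough (generation will just be slow).

```python
# Rough sketch, assuming the standard Hugging Face transformers API.
# "t5-3b" is only an example checkpoint; t5-11b works the same way if RAM allows.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

name = "t5-3b"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSeq2SeqLM.from_pretrained(name)  # loads onto CPU by default, no CUDA needed
model.eval()

inputs = tokenizer("translate English to German: The house is wonderful.",
                   return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```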