←back to thread

1311 points msoad | 3 comments | source
DoctorOetker ◴[] No.35400653[source]
Is there a reason Llama is getting so much attention compared to say T5 11B?

Not sure how neutral the following link is or what benchmarks it uses, but T5 seems to sit a lot higher on its leaderboard:

https://accubits.com/large-language-models-leaderboard/

replies(2): >>35400879 #>>35400909 #
1. itake ◴[] No.35400879[source]
Llama can run on an M1. T5 still needs a specialized GPU.
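(A big part of what makes the M1 ports practical is weight quantization: storing weights in low precision so the whole model fits in laptop RAM. The sketch below is a toy symmetric 8-bit scheme with made-up values, not the actual format any particular port uses, just to show the idea.)

```python
import numpy as np

def quantize_q8(w):
    """Toy symmetric 8-bit quantization: store int8 values plus one
    fp32 scale per tensor, roughly quartering fp32 memory."""
    max_abs = float(np.abs(w).max())
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_q8(q, scale):
    """Recover approximate fp32 weights for use during inference."""
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.0, 0.25, 0.0], dtype=np.float32)
q, s = quantize_q8(w)
w2 = dequantize_q8(q, s)
```

The round-trip error is bounded by half the scale, which is why quantized models lose little quality while shrinking 2-4x in memory.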
replies(1): >>35401284 #
2. DoctorOetker ◴[] No.35401284[source]
What is the reason T5 needs a specialized GPU and Llama doesn't?

In the end they are mathematical models, so what would prevent someone from loading T5 into a machine with plenty of RAM (like a server)? Would the codebase truly require that much refactoring? How difficult would it be to rewrite the model architecture as a set of mathematical equations (Einstein summation) and reimplement inference for the CPU?
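(For what it's worth, the core of a transformer really can be written as a handful of Einstein summations that run on plain CPU NumPy. A minimal sketch of scaled dot-product attention, with toy shapes chosen for illustration:)

```python
import numpy as np

def attention(q, k, v):
    """Scaled dot-product attention expressed as einsums,
    runnable on any machine with enough RAM -- no GPU required."""
    d = q.shape[-1]
    # scores[b, i, j] = sum_d q[b, i, d] * k[b, j, d] / sqrt(d)
    scores = np.einsum("bid,bjd->bij", q, k) / np.sqrt(d)
    # numerically stable softmax over the last axis
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # out[b, i, d] = sum_j weights[b, i, j] * v[b, j, d]
    return np.einsum("bij,bjd->bid", weights, v)

rng = np.random.default_rng(0)
q = rng.standard_normal((1, 4, 8))
k = rng.standard_normal((1, 4, 8))
v = rng.standard_normal((1, 4, 8))
out = attention(q, k, v)
```

So the blocker is less the math than the engineering effort of a fast, memory-efficient CPU implementation (threading, cache-friendly matmuls, quantized kernels).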

replies(1): >>35404254 #
3. itake ◴[] No.35404254[source]
I'm far from an expert in this area, but Llama has been adapted so that anyone can hack on it on their M1 MacBook (which many developers have). If someone made T5 as easy to develop against, I'm sure it would see similar community interest.

Most people don't have the hardware or budget to access these specialized high-VRAM GPUs.
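(To put rough numbers on that: the back-of-envelope sketch below computes the memory needed just to hold the weights at different precisions, using the commonly cited parameter counts. It ignores activations and the KV cache, so real usage is somewhat higher.)

```python
GIB = 1024**3

def weight_bytes(n_params, bits_per_weight):
    """Bytes required to store n_params weights at the given precision."""
    return n_params * bits_per_weight / 8

for name, n_params in [("Llama 7B", 7e9), ("T5 11B", 11e9)]:
    for bits in (16, 8, 4):
        gib = weight_bytes(n_params, bits) / GIB
        print(f"{name} @ {bits}-bit: {gib:.1f} GiB")
```

A 7B model quantized to 4 bits fits in about 3.3 GiB, comfortably inside a base M1's unified memory, while an 11B model at fp16 needs over 20 GiB, which is why it reads as "needs a specialized GPU" until someone ships a quantized CPU port.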