
1311 points by msoad | 1 comment
qwertox No.35393996
Is the 30B model clearly better than the 7B?

I played with Pi3141/alpaca-lora-7B-ggml two days ago and it was super disappointing. On a scale where 0% = alpaca-lora-7B-ggml and 100% = GPT-3.5, where would LLaMA 30B land?

replies(2): >>35394629 >>35395773
1. sp332 No.35395773
Check out the graph on page 3 of this PDF: https://arxiv.org/abs/2302.13971. The 33B model started beating the 7B after being trained on only about a third as much data, and then they kept training it on 40% more tokens than the 7B saw in total. It's better.
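
For a rough sense of the numbers, here's a back-of-the-envelope sketch in Python using the token budgets the paper reports (about 1.0T tokens for the 7B, 1.4T for the 33B). The ~1/3 crossover point is only an eyeball estimate from the loss curves, not a figure stated in the paper.

    # Back-of-the-envelope comparison of LLaMA training budgets.
    # Token counts are the ones reported in the LLaMA paper;
    # the ~1/3 crossover is an eyeball estimate from the loss plot.

    TOKENS_7B = 1.0e12   # 7B model: ~1.0T training tokens
    TOKENS_33B = 1.4e12  # 33B model: ~1.4T training tokens

    extra = TOKENS_33B / TOKENS_7B - 1
    print(f"33B saw {extra:.0%} more tokens in total than 7B")  # -> 40%

    # Rough point where the 33B's loss drops below the 7B's final loss
    crossover_fraction = 1 / 3
    crossover_tokens = crossover_fraction * TOKENS_7B
    print(f"33B roughly matched 7B after ~{crossover_tokens / 1e12:.2f}T tokens")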