
602 points emrah | 1 comment
api ◴[] No.43744419[source]
When I see 32B or 70B models performing similarly to 200B+ models, I don't know what to make of it. Either the larger models contain more breadth of information but we've managed to distill the latent capabilities down to similar levels, or the larger models are just less efficient, or the tests are not very good.
replies(2): >>43744582 #>>43744783 #
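
[Editor's note: for context on the "distill" option api raises, knowledge distillation usually means training a smaller student model to match a larger teacher's output distribution. A minimal, generic sketch of the standard temperature-scaled KL distillation loss follows; the function name and temperature value are illustrative assumptions, not anything from this thread or any specific model's recipe.]

    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, temperature=2.0):
        # Soften both distributions with a temperature so the student also
        # learns from the teacher's relative probabilities on wrong classes.
        soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
        log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)
        # Scale by T^2 to keep gradient magnitudes comparable to a hard-label loss.
        return F.kl_div(log_soft_student, soft_teacher,
                        reduction="batchmean") * temperature ** 2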
1. retinaros ◴[] No.43744783[source]
It's just BS benchmarks. They're all cheating at this point, feeding the test data into the training set. It doesn't mean the LLMs aren't becoming better, but when they all lie...
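
[Editor's note: the contamination claim retinaros makes is typically checked by measuring n-gram overlap between benchmark items and the training corpus. A rough illustrative sketch follows; the helper names, 13-gram size (a common heuristic, e.g. in GPT-3-style decontamination), and threshold are assumptions, not from the thread.]

    def ngrams(text, n=13):
        # Set of n-gram tuples over a whitespace tokenization.
        tokens = text.lower().split()
        return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

    def is_contaminated(benchmark_item, training_docs, n=13, threshold=1):
        # Flag a benchmark item if at least `threshold` of its n-grams
        # also appear in any training document.
        item_grams = ngrams(benchmark_item, n)
        return any(len(item_grams & ngrams(doc, n)) >= threshold
                   for doc in training_docs)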