
544 points tosh | 1 comment
gatienboquet No.43464396
So today it's Qwen. Tomorrow, apparently, a new SOTA model from Google; R2 next week.

We haven't hit the wall yet.

OsrsNeedsf2P No.43465234
> We haven't hit the wall yet.

The models are iterative improvements, but I haven't seen a night-and-day difference since the jump from GPT-3 to GPT-3.5.

anon373839 No.43465478
Yeah. Scaling up pretraining and huge models appears to be done. But I think we're still advancing the frontier in the other direction: how much capability and knowledge can we cram into smaller and smaller models?
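
One common route to the small-model direction described above is knowledge distillation: train a small student to match a large teacher's temperature-softened output distribution rather than just the hard labels. A minimal dependency-free sketch of the distillation loss (function names and constants are illustrative, not from the thread):

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax; higher T softens the distribution,
    # exposing the teacher's "dark knowledge" about non-top classes.
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL(teacher || student) on temperature-softened distributions,
    # scaled by T^2 so gradients keep roughly the same magnitude
    # as training at T=1.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2

# Identical logits -> zero loss; diverging logits -> positive loss.
same = distillation_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1])   # 0.0
diff = distillation_loss([2.0, 1.0, 0.1], [0.1, 1.0, 2.0])   # > 0
print(same, diff)
```

In practice this term is usually mixed with the ordinary cross-entropy on ground-truth labels, and the same idea underlies many of the small open-weight models the thread is discussing.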