544 points tosh | 13 comments
    1. gatienboquet ◴[] No.43464396[source]
So today is Qwen. Tomorrow, apparently, a new SOTA model from Google; R2 next week.

    We haven't hit the wall yet.

    replies(6): >>43464672 #>>43464706 #>>43464975 #>>43465234 #>>43465549 #>>43472639 #
    2. zamadatix ◴[] No.43464672[source]
Qwen 3 is coming imminently as well (https://github.com/huggingface/transformers/pull/36878), and it feels like Llama 4 should arrive in the next month or so.

That said, none of the recent string of releases has done much yet to "smash a wall"; they've just met the larger proprietary models where they already were. I'm hoping R2 or the like really changes that by showing that ChatGPT 3->3.5 or 3.5->4 level generational jumps are still possible beyond the current state of the art, not just beyond current models of a given size.

    replies(1): >>43468250 #
    3. tomdekan ◴[] No.43464706[source]
    Any more info on the new Google model?
    4. behnamoh ◴[] No.43464975[source]
Google's announcements are mostly vaporware anyway. By the way, where is Gemini Ultra 1? How about Gemini Ultra 2?
    replies(2): >>43465070 #>>43468100 #
    5. karmasimida ◴[] No.43465070[source]
It is already on the LLM arena, right? Codename Nebula? But you are right, they can fuck up their releases royally.
    6. OsrsNeedsf2P ◴[] No.43465234[source]
    > We haven't hit the wall yet.

The models are iterative improvements, but I haven't seen night-and-day differences since GPT-3 and 3.5.

    replies(3): >>43465478 #>>43467288 #>>43468261 #
    7. anon373839 ◴[] No.43465478[source]
    Yeah. Scaling up pretraining and huge models appears to be done. But I think we're still advancing the frontier in the other direction -- i.e., how much capability and knowledge can we cram into smaller and smaller models?
    8. nwienert ◴[] No.43465549[source]
We've slid into the upper part of the S-curve, though.
    9. Davidzheng ◴[] No.43467288[source]
To be honest, such a big jump from current capability would be ASI already.
    10. aoeusnth1 ◴[] No.43468100[source]
I guess they don’t do Ultras anymore, but where was the announcement for that? What other announcement was vaporware?
    11. YetAnotherNick ◴[] No.43468250[source]
    > met the larger proprietary models where they already were

    This is smashing the wall.

Also, if you just care about breaking absolute numbers: OpenAI released 4.5 a month back, which is SOTA among base models, and plans to release full o3 in maybe a month; and DeepSeek released a new V3, which is again SOTA in many aspects.

    12. YetAnotherNick ◴[] No.43468261[source]
Because 3.5 had a new capability: following instructions. Right now we are in the 3.5 range in conversational AI and native image generation, both of which feel magical.
    13. intalentive ◴[] No.43472639[source]
Asymptotic improvement will never hit a wall.