
321 points | jhunter1016 | 1 comment | source
mikeryan ◴[] No.41878605[source]
Technical AI and LLMs are not something I'm well versed in. So as I sit on the sidelines and watch the current proliferation of AI startups, I'm starting to wonder where the moats are, outside of access to raw computing power. OpenAI seemed to have a massive lead in this space, but that lead seems to be shrinking every day.
replies(10): >>41878784 #>>41878809 #>>41878843 #>>41880703 #>>41881606 #>>41882000 #>>41885618 #>>41886010 #>>41886133 #>>41887349 #
YetAnotherNick ◴[] No.41881606[source]
> Open AI seemed to have a massive lead in this space but that lead seems to be shrinking every day.

The lead is as strong as ever. They are 34 Elo above anyone else in blind testing, and 73 Elo above in coding [1]. They also seem to have artificially constrained the lead, as they already have a stronger model, o1, which they haven't released. Consistent with the past, they seem to release just <50 Elo above anyone else, then upgrade the model within weeks when someone gets closer.

[1]: https://lmarena.ai/
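For scale, an Elo gap translates into an expected head-to-head win rate via the standard logistic Elo expectation formula (the 34 and 73 figures are the gaps quoted above; the formula itself is the usual Elo one, not anything specific to lmarena's methodology):

```python
def expected_win_rate(elo_gap: float) -> float:
    """Probability that the higher-rated model wins a single
    head-to-head comparison, per the standard Elo formula:
        E = 1 / (1 + 10^(-gap / 400))
    """
    return 1.0 / (1.0 + 10.0 ** (-elo_gap / 400.0))

# Gaps quoted above: 34 Elo overall, 73 Elo in coding.
print(f"34 Elo gap -> {expected_win_rate(34):.1%} expected win rate")  # ~54.9%
print(f"73 Elo gap -> {expected_win_rate(73):.1%} expected win rate")  # ~60.4%
```

So even the larger coding gap means the leader wins roughly 3 in 5 blind matchups, which is a real but not overwhelming edge.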

replies(2): >>41881810 #>>41884236 #
epolanski ◴[] No.41884236[source]
Idc about lmarena benchmarks. I test different models every day in Cursor, and Sonnet is way better at coding web applications than GPT-4o.
replies(1): >>41885498 #
Zetaphor ◴[] No.41885498{3}[source]
Completely agree. It's well known that the LMSys arena benchmarks are heavily biased towards whatever is new and exciting. Meanwhile, even OpenAI have acknowledged Sonnet as the superior coding model.

This is clearly evident to anyone who spends any amount of time working on non-trivial projects with both models.