
321 points | jhunter1016 | 1 comment | source
mikeryan ◴[] No.41878605[source]
Technical AI and LLMs are not something I'm well versed in. So as I sit on the sidelines and watch the current proliferation of AI startups, I'm starting to wonder where the moats are, outside of access to raw computing power. OpenAI seemed to have a massive lead in this space, but that lead seems to be shrinking every day.
replies(10): >>41878784 #>>41878809 #>>41878843 #>>41880703 #>>41881606 #>>41882000 #>>41885618 #>>41886010 #>>41886133 #>>41887349 #
YetAnotherNick ◴[] No.41881606[source]
> Open AI seemed to have a massive lead in this space but that lead seems to be shrinking every day.

The lead is as strong as ever. They are 34 Elo above anyone else in blind testing, and 73 Elo above in coding [1]. They also seem to have artificially constrained the lead, as they already have a stronger model, o1, which they haven't released. Consistent with the past, they seem to release just <50 Elo above anyone else, then upgrade the model within weeks when someone gets closer.

[1]: https://lmarena.ai/
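For scale, an Elo gap translates into an expected head-to-head win rate via the standard logistic Elo expectation formula (the 34 and 73 figures are the gaps quoted above; the formula itself is the usual Elo one, not anything specific to lmarena's methodology):

```python
def expected_win_rate(elo_gap: float) -> float:
    """Probability that the higher-rated model wins a single
    head-to-head comparison, per the standard Elo formula:
        E = 1 / (1 + 10^(-gap / 400))
    """
    return 1.0 / (1.0 + 10.0 ** (-elo_gap / 400.0))

# Gaps quoted above: 34 Elo overall, 73 Elo in coding.
print(f"34 Elo gap -> {expected_win_rate(34):.1%} expected win rate")  # ~54.9%
print(f"73 Elo gap -> {expected_win_rate(73):.1%} expected win rate")  # ~60.4%
```

So even the larger coding gap means the leader wins roughly 3 in 5 blind matchups, which is a real but not overwhelming edge.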

replies(2): >>41881810 #>>41884236 #
epolanski ◴[] No.41884236[source]
Idc about lmarena benchmarks. I test different models every day in Cursor, and Sonnet is way better at coding web applications than GPT-4o.
replies(1): >>41885498 #
Zetaphor ◴[] No.41885498{3}[source]
Completely agree. It's well known that the LMSys arena benchmarks are heavily biased towards whatever is new and exciting. Meanwhile, even OpenAI have acknowledged Sonnet as the superior coding model.

This is clearly evident to anyone who spends any amount of time working on non-trivial projects with both models.