
Getting 50% (SoTA) on ARC-AGI with GPT-4o

(redwoodresearch.substack.com)
394 points | tomduncalf
eigenvalue ◴[] No.40712174[source]
The ARC stuff just felt intuitively wrong as soon as I heard it. I don't find any of Chollet's critiques of LLMs convincing. It's almost as if he's being overly negative about them to make a point, or to push back against all the unbridled optimism. The problem is, the optimism really does seem to be justified: the rate of improvement of LLMs in the past 12 months has been nothing short of astonishing.

So it's not at all surprising to me to see ARC already being mostly solved using existing models, just with different prompting techniques and some tool usage. At some point, the naysayers about LLMs are going to have to confront the problem that, if they are right about LLMs not really thinking/understanding/being sentient, then a very large percentage of people living today are also not thinking/understanding/sentient!
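
(For concreteness, the "prompting techniques and some tool usage" amounts, roughly, to sampling many candidate Python programs from the model and keeping only those that reproduce the task's training examples. The sketch below is purely illustrative; every name in it, including ask_llm_for_program, is a hypothetical placeholder rather than the linked post's actual code.)

    from typing import Callable, Dict, List, Optional, Tuple

    Grid = List[List[int]]
    Example = Tuple[Grid, Grid]  # (input grid, expected output grid)

    def ask_llm_for_program(examples: List[Example]) -> str:
        """Hypothetical placeholder: prompt a model (e.g. GPT-4o) with the training
        pairs and return Python source defining a candidate transform(grid) function."""
        raise NotImplementedError("plug in a real chat-completions call here")

    def compile_candidate(source: str) -> Optional[Callable[[Grid], Grid]]:
        """Exec the generated source and pull out its transform function, if any."""
        namespace: Dict[str, object] = {}
        try:
            exec(source, namespace)  # generated code should be sandboxed in practice
        except Exception:
            return None
        fn = namespace.get("transform")
        return fn if callable(fn) else None

    def fits_training_set(fn: Callable[[Grid], Grid], examples: List[Example]) -> bool:
        """Keep a candidate only if it maps every training input to its expected output."""
        try:
            return all(fn(inp) == out for inp, out in examples)
        except Exception:
            return False  # crashing candidates are simply discarded

    def solve_task(examples: List[Example], test_input: Grid,
                   n_samples: int = 128) -> Optional[Grid]:
        """Sample many candidate programs; answer with the first that fits the training set."""
        for _ in range(n_samples):
            fn = compile_candidate(ask_llm_for_program(examples))
            if fn is not None and fits_training_set(fn, examples):
                try:
                    return fn(test_input)
                except Exception:
                    continue
        return None

The point of filtering on the training pairs is that they act as a verifier: the model only has to be right occasionally across many samples, not reliably on the first try.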

replies(11): >>40712233 #>>40712290 #>>40712304 #>>40712352 #>>40712385 #>>40712431 #>>40712465 #>>40712713 #>>40713110 #>>40713491 #>>40714220 #
imtringued ◴[] No.40712465[source]
Yeah, I agree. We have reached the end of LLMs. LLMs are infallible and require no further improvement. Anyone who points out shortcomings of current architectures and training approaches should be ignored as a naysayer. Anyone who proposes a solution to perceived flaws is a crank trying to fix something that was never broken. Everyone knows humans are incapable of internal monologue, visualization, or vocalization. Humans don't actually move their lips to produce a sound that can be interpreted by a speaker of the same language; they produce universally understood tokens encoding objective reality, and the fact that they use the local language is merely a habit that is hard to break out of.
replies(1): >>40720903 #
mrtranscendence ◴[] No.40720903[source]
Sometimes, when I'm undertaking the arduous work of assigning probabilities to everything I could possibly say next in a conversation, I wish that I weren't merely a stochastic autoregressive next-token generator. Them's the breaks, though.