
Getting 50% (SoTA) on Arc-AGI with GPT-4o

(redwoodresearch.substack.com)
394 points tomduncalf | 2 comments
eigenvalue No.40712174
The Arc stuff just felt intuitively wrong as soon as I heard it. I don't find any of Chollet's critiques of LLMs convincing. It's almost as if he's being overly negative about them to make a point, or to push back against all the unbridled optimism. The problem is, the optimism really does seem to be justified, and the rate of improvement of LLMs in the past 12 months has been nothing short of astonishing.

So it's not at all surprising to me to see Arc already being mostly solved using existing models, just with different prompting techniques and some tool usage. At some point, the naysayers about LLMs are going to have to confront the problem that, if they are right about LLMs not really thinking/understanding/being sentient, then a very large percentage of people living today are also not thinking/understanding/sentient!
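For concreteness, the "different prompting techniques and some tool usage" behind the linked result amount to a sample-and-verify loop: ask GPT-4o for many candidate Python programs, run each against the puzzle's training pairs, and keep one that reproduces them. The sketch below only illustrates that idea; the prompt wording, the grid format, the helper names, and the sample count are illustrative assumptions, not the post's actual code.

    # Sketch of a sample-and-verify loop for ARC-style puzzles (illustrative).
    # Assumed details: prompt wording, grid format (lists of lists of ints),
    # and regex-based code extraction. Untrusted code should be sandboxed.
    import re
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def extract_python_code(reply_text):
        """Pull the first fenced code block out of a model reply, if any."""
        match = re.search(r"```(?:python)?\n(.*?)```", reply_text, re.DOTALL)
        return match.group(1) if match else reply_text

    def sample_program(train_pairs):
        """Ask GPT-4o for a transform() function implementing the puzzle's rule."""
        prompt = (
            "Here are input/output grid pairs from an ARC puzzle:\n"
            f"{train_pairs}\n"
            "Write a Python function transform(grid) mapping each input to its output."
        )
        reply = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": prompt}],
            temperature=1.0,  # high temperature so repeated samples differ
        )
        return extract_python_code(reply.choices[0].message.content)

    def passes_training_set(code, train_pairs):
        """Run the candidate and check it reproduces every training output."""
        namespace = {}
        try:
            exec(code, namespace)
            return all(namespace["transform"](i) == o for i, o in train_pairs)
        except Exception:
            return False  # crashes, bad shapes, missing transform() all count as failure

    def solve(train_pairs, test_input, num_samples=100):
        for _ in range(num_samples):
            code = sample_program(train_pairs)
            if passes_training_set(code, train_pairs):
                namespace = {}
                exec(code, namespace)
                return namespace["transform"](test_input)
        return None  # no sampled program fit the training examples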

replies(11): >>40712233 #>>40712290 #>>40712304 #>>40712352 #>>40712385 #>>40712431 #>>40712465 #>>40712713 #>>40713110 #>>40713491 #>>40714220 #
1. lassoiat No.40713110
I am a ChatGPT fanboy and have been quite impressed by 4o, but I will really be impressed when it stops inventing aspects of Python libraries that don't exist and instead just tells me they don't exist.

It literally just did this for me 15 minutes ago. You can't talk about AGI when it is this easy to push it over the edge into something it doesn't know.
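One cheap guardrail against this failure mode is to resolve any model-suggested call against the installed library before trusting it. The helper below is only a sketch of that check; api_exists and the json.dumps_pretty example are made-up illustrations, not something from the thread.

    # Guardrail sketch against hallucinated library APIs (illustrative only):
    # resolve a dotted name against what is actually installed before running
    # model-suggested code.
    import importlib

    def api_exists(dotted_name):
        """Return True if e.g. 'os.path.join' resolves to a real module attribute."""
        parts = dotted_name.split(".")
        # Try the longest importable module prefix, then walk the rest as attributes.
        for i in range(len(parts), 0, -1):
            try:
                obj = importlib.import_module(".".join(parts[:i]))
            except ImportError:
                continue
            for attr in parts[i:]:
                if not hasattr(obj, attr):
                    return False
                obj = getattr(obj, attr)
            return True
        return False

    print(api_exists("json.dumps"))         # True: real function
    print(api_exists("json.dumps_pretty"))  # False: plausible-sounding invention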

Paper references have gotten better over the last 12 months, but just this week it made up both a book and a paper for me that do not exist. The authors exist, but they did not write what it said they did.

It is very interesting: if you ask "do you understand your responses?", sometimes it will say yes and sometimes it will say no, not like a human understands.

We should forget about AGI until it can at least say it doesn't know something. It is hardly a sign of intelligence in humans to make up answers to questions you don't know the answer to.

replies(1): >>40713818 #
2. motoxpro No.40713818
Every time you’re wrong and you disagree with someone who is right, you are inventing things that don’t exist.

Unless you’re saying you have never held on to an opinion that was at some point proven to be wrong?