
Getting 50% (SoTA) on ARC-AGI with GPT-4o

(redwoodresearch.substack.com)
394 points by tomduncalf | 5 comments
rgbrgb No.40712154
> 50% accuracy on the public test set for ARC-AGI by having GPT-4o

Isn't the public test set available on GitHub, and therefore something GPT-4o was trained on? (The sketch after this comment shows how easy those tasks are to download.)

replies(2): >>40712401 #>>40712472 #
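
The contamination worry is easy to ground: the public evaluation tasks live in an ordinary GitHub repo, so any crawl of GitHub would have picked them up. Below is a minimal Python sketch, assuming the usual layout of the fchollet/ARC-AGI repository (data/evaluation/*.json, each file holding "train" and "test" example pairs); the repo name, branch, and layout here are assumptions rather than something stated in the thread.

    # Minimal sketch (assumption: the fchollet/ARC-AGI repo keeps its public
    # evaluation tasks as data/evaluation/*.json with "train"/"test" pairs).
    # It lists the evaluation directory via the GitHub contents API and pulls
    # one task, just to show the data is one unauthenticated HTTP call away.
    import json
    import urllib.request

    EVAL_DIR = "https://api.github.com/repos/fchollet/ARC-AGI/contents/data/evaluation"

    def fetch_json(url: str):
        with urllib.request.urlopen(url) as resp:
            return json.loads(resp.read().decode("utf-8"))

    listing = fetch_json(EVAL_DIR)  # one directory entry per task file
    print(f"{len(listing)} public evaluation task files")

    task_file = listing[0]
    task = fetch_json(task_file["download_url"])
    print(f"{task_file['name']}: {len(task['train'])} demonstration pairs, "
          f"{len(task['test'])} test pair(s)")

This only shows the tasks are trivially downloadable; whether GPT-4o's training crawl actually included them, and whether it could reproduce them, is what the replies below argue about.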
1. bongodongobob No.40712401
I keep seeing this comment all over the place. Just because something appears once in the training data doesn't mean the model can regurgitate it. That's not how training works. An LLM is not a knowledge database.
replies(2): >>40712453 #>>40713177 #
2. adroniser No.40712453
And yet that doesn't rule out that it can. See the New York Times lawsuit.
replies(1): >>40712544 #
3. bongodongobob No.40712544
From old article excerpts that are quoted all over the internet? That's not surprising.
replies(1): >>40714639 #
4. spencerchubb No.40713177
It could exist many times. People can fork and clone the repo. People are likely to copy the examples and share them online.
5. ben_w No.40714639
That's still sufficient both for The Times's case and for it to be a potential problem in this case.