
Francois Chollet is leaving Google

(developers.googleblog.com)
377 points by xnx | 2 comments
max_ No.42131308
I wonder what he will be working on?

Maybe he figured out a model that beats the 85% threshold on ARC-AGI?

replies(1): >>42131784 #
trott No.42131784
> Maybe he figured out a model that beats the 85% threshold on ARC-AGI?

People have, I think.

One of the published approaches (BARC) uses GPT-4o to generate a lot more training data.

The approach is scaling really well so far [1], and whether you expect linear or exponential scaling [2], the 85% threshold can be reached, using the "transduction" model alone, after generating under 2 million tasks ($20K in OpenAI credits).
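
Back-of-envelope on that cost figure, using only the numbers above (the per-task rate is just the implied ratio, not something taken from the BARC write-up):

    budget = 20_000        # USD in OpenAI credits, from the estimate above
    tasks = 2_000_000      # synthetic tasks generated
    print(budget / tasks)  # -> 0.01 USD per generated task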

Perhaps for 2025, the organizers will redesign ARC-AGI to be more resistant to this sort of approach, somehow.

---

[1] https://www.kaggle.com/competitions/arc-prize-2024/discussio...

[2] If you are "throwing darts at a board", you get exponential scaling (the probability of never hitting the bullseye shrinks exponentially with the number of throws). If you deliberately design your synthetic dataset to be non-redundant, you might get something akin to linear scaling (until you hit perfect accuracy, of course).
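
To make the two regimes in [2] concrete, here is a minimal sketch; the per-task "hit" probability p is a made-up illustrative number, not anything measured:

    import numpy as np

    p = 1e-6  # hypothetical chance that one random synthetic task covers a given target
    n = np.array([500_000, 1_000_000, 2_000_000])  # synthetic tasks generated

    # "Darts at a board": the miss probability (1 - p)^n shrinks exponentially,
    # so coverage approaches 1 as 1 - (1 - p)^n.
    darts = 1 - (1 - p) ** n

    # Deliberately non-redundant data: coverage grows roughly linearly in n,
    # until you run out of things to cover.
    non_redundant = np.minimum(p * n, 1.0)

    for tasks, d, l in zip(n, darts, non_redundant):
        print(f"{int(tasks):>9,d} tasks  darts={d:.2f}  non-redundant={l:.2f}")

    # With this (made-up) p, the darts curve crosses 85% near:
    print(np.log(1 - 0.85) / np.log(1 - p))  # ~1.9 million throws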

replies(4): >>42131848 #>>42132132 #>>42132502 #>>42132655 #
1. mxwsn No.42132502
My interest was piqued, but the extrapolation in [1] is uh... not the most convincing. If there were more data points, then sure, maybe.
replies(1): >>42132594 #
2. trott No.42132594
The plot was just showing where the solid lines were trending (see prior messages), and that happened to predict the performance at 400k samples (red dot) very well.

An exponential scaling curve would bend a bit further to the right, but it would still cross the 85% mark before 2,000k samples.
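
For what it's worth, here is a rough sketch of that kind of extrapolation; the anchor points below are hypothetical placeholders (only the 400k sample count and the 85% target come from the discussion above):

    import numpy as np

    # Hypothetical (synthetic tasks, accuracy) points standing in for the solid lines.
    n_obs = np.array([100_000, 200_000, 400_000])
    acc_obs = np.array([0.25, 0.35, 0.50])
    target = 0.85

    # Linear read of the trend: accuracy ≈ a*n + b.
    a, b = np.polyfit(n_obs, acc_obs, 1)
    n_linear = (target - b) / a

    # Saturating/exponential read: accuracy ≈ 1 - exp(-(k*n + c)),
    # fitted by regressing -log(1 - accuracy) on n.
    k, c = np.polyfit(n_obs, -np.log(1 - acc_obs), 1)
    n_expo = (-np.log(1 - target) - c) / k

    print(f"linear trend crosses 85% near {n_linear:,.0f} tasks")
    print(f"saturating trend crosses 85% near {n_expo:,.0f} tasks")

With these made-up anchors, both readings land below 2,000k, which is the shape of the argument here; the actual numbers are on the linked Kaggle thread.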