
248 points by rishicomplex | 5 comments
1. chompychop No.42170989
Is it currently possible to reliably limit the knowledge cut-off of an LLM (either during training or inference)? An interesting experiment would be to feed an LLM mathematical knowledge only up to the year a theorem was proved, and then see whether it can actually come up with the novel techniques used in the proof. For example, with access only to papers published before 1993, could an LLM come up with Wiles' proof of FLT?
replies(2): >>42171207, >>42171222
2. ogrisel No.42171207
That should be doable, e.g. by semi-automated curation of the pre-training dataset. However, since curating such a large dataset and running pre-training is so expensive, I doubt anybody will run such an experiment, especially since one would have to trust that the curation was correct enough for the end result to be meaningful. Checking that the curation process is not flawed is probably as expensive as running it in the first place.
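
As a rough sketch of what that date-based curation might look like (Python; the document schema with a "published" field is a made-up assumption for illustration, not any real pipeline's format):

    from datetime import date

    CUTOFF = date(1993, 1, 1)  # pre-Wiles cutoff for the FLT experiment

    def filter_corpus(docs):
        """Keep only documents reliably dated before the cutoff.

        Each doc is assumed to be a dict with "text" and "published"
        (a datetime.date) keys. Docs with missing dates are dropped
        conservatively, since an undated document could leak
        post-cutoff knowledge into the training set.
        """
        kept = []
        for doc in docs:
            published = doc.get("published")
            if published is not None and published < CUTOFF:
                kept.append(doc)
        return kept

The filter itself is trivial; the expensive part is what's described above, i.e. obtaining and verifying trustworthy per-document dates at pre-training scale.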
3. n4r9 No.42171222
There's the FrontierMath benchmark [0], demonstrating that AI is currently quite far from human performance at research-level mathematics.

[0] https://arxiv.org/abs/2411.04872

replies(1): >>42177617
4. data_maan No.42177617
They didn't demonstrate anything. They haven't even released their dataset, nor mentioned how big it is.

It's just hot air, like the AlphaProof announcement, where very little is known about the system.

replies(1): >>42181992
5. n4r9 No.42181992
They won't publish the problem set, for the obvious reason that it would end up in training data. And I doubt it's hot air, given the mathematicians involved in creating it.