60 points QueensGambit | 7 comments
1. daxfohl ◴[] No.45686184[source]
I've seen research showing that if you start with a reasoning model and fine-tune it to slowly remove the reasoning steps, you can bake the reasoning directly into the model weights. Here's a recent example: the digits get baked into a pentagonal prism in the weights, allowing accurate multi-digit multiplication without needing notes: https://arxiv.org/abs/2510.00184. So reasoning and tool use could be the first step, used to collect a ton of training data for a fine-tuning process like this.
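Roughly, the recipe looks like staged fine-tuning where each stage drops more of the chain-of-thought. This is only a sketch of the idea, not the paper's exact procedure; the dataset fields and the fine_tune helper are hypothetical stand-ins for a real trace dataset and any standard supervised fine-tuning loop:

    def build_stage_data(examples, steps_to_drop):
        # Rebuild each example with the first `steps_to_drop` reasoning steps removed.
        data = []
        for ex in examples:
            kept = ex["reasoning_steps"][steps_to_drop:]
            target = "\n".join(kept + [ex["answer"]])
            data.append({"prompt": ex["question"], "target": target})
        return data

    def internalize_cot(model, examples, max_steps, fine_tune):
        # Staged fine-tuning: each stage trains on traces with one more step removed,
        # until the final stage asks the model to emit only the answer.
        for steps_to_drop in range(max_steps + 1):
            model = fine_tune(model, build_stage_data(examples, steps_to_drop))
        return model

By the last stage the model is being trained to jump straight to the answer, which is where the "baked in" structure in the weights has to do the work.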
replies(3): >>45686449 #>>45687650 #>>45688521 #
2. nightmunnas ◴[] No.45686449[source]
This seems more comparable to human reasoning, although in this case it happens on an extremely slow timescale. Ideally, real reasoning would feed back into the weights themselves, though that's very impractical with current model architectures, especially if the conversation is with the broader public.
replies(1): >>45688004 #
3. QueensGambit ◴[] No.45687650[source]
That's interesting, though I wonder if what's "baked into the weights" is closer to intuition than reasoning. Once reasoning traces are distilled into weights, the model stops thinking through problems and starts pattern-matching answers. That feels more like a stochastic parrot with intuition than an analytical reasoner.
replies(1): >>45687855 #
4. daxfohl ◴[] No.45687855[source]
I'd guess it works like any form of learning a subject. The better you have internalized the fundamentals, the better you can perform at higher level tasks.

Though to that end, I wonder whether the model "knows" that it "understands" the fundamentals better once it's been trained like this, or whether, when it has to do a large multiplication as part of a larger reasoning task, it still breaks it down step by step.

5. daxfohl ◴[] No.45688004[source]
Yeah, kind of. Though it seems like humans do some post-analysis on the reasoning and "root cause" the path to the result. Fine-tuning on the raw thought output seems like it could capture a ton of unnecessary noise. I don't know if it would be able to capture the "aha" moment and encode that alone in a way that makes sense.
6. photonthug ◴[] No.45688521[source]
Glad to see the pentagonal multiplication prism is just as weird as the addition helix: https://arxiv.org/abs/2502.00873
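For intuition, here's a rough numpy sketch of the helix idea (the periods are chosen for illustration, not the ones identified in the paper): represent an integer as a linear term plus points on a few circles, and addition becomes angle addition via the trig sum identities.

    import numpy as np

    PERIODS = [2, 5, 10, 100]  # illustrative periods for the circular components

    def helix(n):
        # Embed n as a linear term plus (cos, sin) pairs on each period's circle.
        feats = [float(n)]
        for T in PERIODS:
            theta = 2 * np.pi * n / T
            feats += [np.cos(theta), np.sin(theta)]
        return np.array(feats)

    def add_on_helix(a, b):
        # Combine helix(a) and helix(b) into helix(a+b): the linear parts add,
        # and each circle's angle rotates by the other's angle (angle-sum identities).
        ha, hb = helix(a), helix(b)
        out = [ha[0] + hb[0]]
        for i in range(len(PERIODS)):
            ca, sa = ha[1 + 2*i], ha[2 + 2*i]
            cb, sb = hb[1 + 2*i], hb[2 + 2*i]
            out += [ca*cb - sa*sb, sa*cb + ca*sb]
        return np.array(out)

    assert np.allclose(add_on_helix(27, 58), helix(85))  # recovers 27 + 58 = 85

No digit-by-digit carrying anywhere; the sum falls out of rotating points around circles of different periods.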
replies(1): >>45689060 #
7. daxfohl ◴[] No.45689060[source]
Yeah, I have to imagine animal brains are just giant Fourier transform engines under the hood, and human brains have just evolved to make some frequencies more precise.