
60 points | QueensGambit | 2 comments
daxfohl ◴[] No.45686184[source]
I've seen research showing that starting with a reasoning model and fine-tuning to slowly remove the reasoning steps lets you bake the reasoning directly into the model weights in a strong sense. Here's a recent example, where you can see the digits get baked into a pentagonal prism in the weights, allowing accurate multi-digit multiplication without needing notes: https://arxiv.org/abs/2510.00184. So reasoning and tool use could be the first step: collect a ton of training data, then run a fine-tuning process like this on it.
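
For concreteness, here's a minimal sketch of that curriculum in PyTorch/Hugging Face. The toy dataset, removal schedule, and "###" separator are my own illustrations, not the paper's setup:

    # Sketch of stepwise CoT internalization: fine-tune while progressively
    # dropping reasoning steps from the target, so the model has to absorb
    # the deleted steps into its weights. Dataset, schedule, and separator
    # are illustrative, not from the paper.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    tok.pad_token = tok.eos_token
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    opt = torch.optim.AdamW(model.parameters(), lr=1e-5)

    # Toy data: (question, reasoning steps, final answer).
    data = [("12*34=?", ["12*30=360", "12*4=48", "360+48=408"], "408")]

    def make_batch(examples, steps_removed):
        texts = []
        for q, cot, ans in examples:
            kept = cot[steps_removed:]  # drop the first k reasoning steps
            texts.append(q + " " + " ".join(kept) + " ### " + ans + tok.eos_token)
        return tok(texts, return_tensors="pt", padding=True)

    max_steps = max(len(cot) for _, cot, _ in data)
    for k in range(max_steps + 1):      # curriculum stage: k steps removed
        for _ in range(10):             # a few passes per stage
            batch = make_batch(data, k)
            # mask padding out of the loss
            labels = batch["input_ids"].masked_fill(batch["attention_mask"] == 0, -100)
            loss = model(**batch, labels=labels).loss
            loss.backward()
            opt.step(); opt.zero_grad()

By the final stage the model is trained to emit the answer with no intermediate steps at all, which is the "baked in" behavior the paper studies.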
replies(3): >>45686449 #>>45687650 #>>45688521 #
1. nightmunnas ◴[] No.45686449[source]
This is what I'd consider more comparable to human reasoning, although in this case it happens on an extremely slow timescale. Ideally, real reasoning would directly affect the weights, though that is very impractical with current model architectures, especially if the conversation is with the broader public.
replies(1): >>45688004 #
2. daxfohl ◴[] No.45688004[source]
Yeah, kind of. Though it seems like humans do some post-analysis on the reasoning and "root cause" the path to the result. Fine-tuning on the raw thought output seems like it could capture a ton of unnecessary noise. I don't know whether it would be able to isolate the "aha" moment and encode that alone in a way that makes sense.
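
One could imagine automating a crude version of that "root cause" pass before distilling: score each step in a trace by how much deleting it hurts the model's confidence in the final answer, and fine-tune only on the steps that matter. A hedged sketch, where the ablation criterion and threshold are my invention, not anything from the paper:

    # Score each reasoning step by how much deleting it lowers the model's
    # log-probability of the final answer; keep only high-impact steps, so
    # the fine-tuning target is the "root cause" path rather than the raw
    # noisy trace. Criterion and threshold are illustrative only.
    import torch

    @torch.no_grad()
    def answer_logprob(model, tok, prompt, answer):
        ids = tok(prompt + " ### " + answer, return_tensors="pt")["input_ids"]
        # approximate answer span (BPE boundaries may shift slightly)
        n_ans = len(tok(" " + answer)["input_ids"])
        logits = model(ids).logits[0, :-1]      # position t predicts token t+1
        logp = torch.log_softmax(logits, -1)
        targets = ids[0, 1:]
        tok_lp = logp[torch.arange(len(targets)), targets]
        return tok_lp[-n_ans:].sum().item()     # log-prob of the answer tokens

    def prune_trace(model, tok, q, cot, ans, threshold=0.5):
        full = answer_logprob(model, tok, q + " " + " ".join(cot), ans)
        kept = []
        for i, step in enumerate(cot):
            ablated = cot[:i] + cot[i + 1:]
            drop = full - answer_logprob(model, tok, q + " " + " ".join(ablated), ans)
            if drop > threshold:                # removing it hurt: keep it
                kept.append(step)
        return kept

Whether a per-step ablation like this would actually surface the "aha" step, rather than just locally load-bearing tokens, is exactly the open question.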