108 points bertman | 2 comments
n4r9 No.43819695
Although I'm sympathetic to the author's argument, I don't think they've found the best way to frame it. I have two main objections, i.e. points I'd guess LLM advocates might dispute.

Firstly:

> LLMs are capable of appearing to have a theory about a program ... but it’s, charitably, illusion.

To make this point stick, you would also have to show why it's not an illusion when humans "appear" to have a theory.

Secondly:

> Theories are developed by doing the work and LLMs do not do the work

Isn't this a little... anthropocentric? That's the way humans develop theories. In principle, could a theory not be developed by transmitting information into someone's brain patterns as if they had done the work?

replies(6): >>43819742 #>>43821151 #>>43821318 #>>43822444 #>>43822489 #>>43824220 #
Jensson No.43821151
> To make this point stick, you would also have to show why it's not an illusion when humans "appear" to have a theory.

Human theory building works; we have demonstrated this. Our science, which lets us build things on top of other things, proves it.

LLM theory building so far doesn't: they always veer in the wrong direction after a few steps. You would need to prove that LLMs can build theories, just as we proved that humans can.

replies(3): >>43821344 #>>43821401 #>>43822289 #
jerf No.43821344
You can't prove LLMs can build theories like humans can, because we can effectively prove they can't. Most code bases do not fit in a context window. And any "theory" an LLM might build about a code base, analogously to the recent reasoning models, would itself have to carve a chunk out of that context window, at a fairly non-trivial expansion in tokens on top of the underlying code base, when there are already not enough tokens for the code alone. The window simply isn't big enough to hold a theory of a code base.
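
A rough back-of-envelope in Python makes the scale problem concrete. The chars-per-token ratio, average line length, and the 128k window below are illustrative assumptions, not measurements:

    # Back-of-envelope: can a code base fit in a context window at all?
    # The chars-per-token ratio, average line length, and window size
    # are assumptions for illustration only.

    CHARS_PER_TOKEN = 4        # common rule of thumb for source code
    CONTEXT_WINDOW = 128_000   # tokens; an assumed "large" window

    def estimated_tokens(loc: int, avg_line_len: int = 40) -> int:
        """Estimate the token count of a code base with `loc` lines."""
        return loc * avg_line_len // CHARS_PER_TOKEN

    for loc in (10_000, 100_000, 1_000_000):
        tokens = estimated_tokens(loc)
        verdict = "fits" if tokens <= CONTEXT_WINDOW else "does not fit"
        print(f"{loc:>9,} LOC ~ {tokens:>10,} tokens -> {verdict}")

Even the 10k-LOC case that fits leaves almost no headroom for "theory" tokens on top of the code itself.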

"Building a theory" is something I expect the next generation of AIs to do, something that has some sort of memory that isn't just a bigger and bigger context window. As I often observe, LLMs != AI. The fact that an LLM by its nature can't build a model of a program doesn't mean that some future AI can't.

replies(1): >>43821423 #
1. imtringued No.43821423
This is correct. The model context is a form of short-term memory. It turns out LLMs have an incredible short-term memory, but at the same time that is all they have.

What I personally find perplexing is that we are still stuck with a single context window. It is well known that a two-tape Turing machine can get by with significantly fewer operations than a single-tape Turing machine, which pays a heavy (in the worst case quadratic) overhead to simulate multiple tapes on one tape.

The reasoning tokens should be thrown into a separate context window that is not subject to the training loss (only the final answer should be).
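
A minimal sketch of that separation, where `generate` is a hypothetical stand-in for a model call (not any real API): reasoning goes to a scratch context the loss never sees, and only the final answer lands in the trained context:

    # Sketch: keep chain-of-thought in a scratch context, separate from
    # the answer context. `generate` is a hypothetical placeholder for
    # an LLM call, not a real library API.

    def generate(context: list[str], prompt: str) -> str:
        """Placeholder for a model call conditioned on `context`."""
        return f"<model output for: {prompt}>"

    def answer(question: str) -> str:
        scratch: list[str] = []  # reasoning "tape": discarded, no training loss
        visible: list[str] = []  # answer "tape": the only part the loss sees

        thought = generate(scratch, f"think step by step: {question}")
        scratch.append(thought)  # reasoning stays on the scratch tape

        final = generate(scratch + visible, f"answer concisely: {question}")
        visible.append(final)    # only the final answer is kept and trained on
        return final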

replies(1): >>43828930 #
2. fouc No.43828930
Or have at least two models, each with its own dedicated context.
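
Something like this, perhaps. A toy sketch where the `Agent` class and its canned replies are made up to show the bookkeeping, not any real model API; only a summary crosses between the two contexts:

    # Toy sketch of two models, each with a dedicated context.
    # Everything here is hypothetical.

    class Agent:
        def __init__(self, role: str):
            self.role = role
            self.history: list[str] = []  # this model's own context

        def call(self, message: str) -> str:
            self.history.append(message)
            reply = f"<{self.role} reply to: {message}>"  # stand-in for a model call
            self.history.append(reply)
            return reply

    planner = Agent("planner")  # reasons about the code base
    coder = Agent("coder")      # writes the actual change

    plan = planner.call("outline a fix for the reported bug")
    patch = coder.call(f"implement this plan: {plan}")  # only the plan crosses over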