Interview with gwern

(www.dwarkeshpatel.com)
308 points synthmeat | 8 comments

YeGoblynQueenne ◴[] No.42135916[source]
This will come across as vituperative and I guess it is, a bit, but I've interacted with Gwern on this forum and the interaction that has stuck with me is in this thread, where Gwern mistakes a^nb^n for a regular (rather than context-free) language (and calls my comment "not even wrong"):

https://news.ycombinator.com/item?id=21559620

Again I'm sorry for the negativity, but already at the time Gwern was held up by a certain, large, section of the community as an important influencer in AI. For me that's just a great example of how the vast majority of AI influencers (who vie for influence on social media, rather than doing research) are basically clueless about AI and CS and only have second-hand knowledge, which I guess they're good at organising and popularising, but not more than that. It's easy to be a cheerleader for the mainstream view on AI. The hard part is finding, and following, unique directions.

With apologies again for the negative slant of the comment.

replies(10): >>42136055 #>>42136148 #>>42136538 #>>42136759 #>>42137041 #>>42137215 #>>42137274 #>>42137284 #>>42137350 #>>42137636 #
dilap ◴[] No.42136759[source]
Regarding your linked comment, my takeaway is that the very theoretical task of being able to recognize an infinite language isn't very relevant to the non-formal, intuitive idea of "intelligence".

Transformers can easily intellectually understand a^nb^n, even though they couldn't recognize whether an arbitrarily long string is a member of the language -- a restriction humans share, since for a long enough string a human, too, would eventually lose track of the count.
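
For concreteness, a minimal sketch (my own toy code, not anything from the linked thread) of what recognising a^nb^n takes: a single counter that must be allowed to grow without bound, which is precisely what any fixed amount of memory, whether a finite automaton's states or a human keeping count in their head, cannot provide for arbitrary n.

    def is_anbn(s: str) -> bool:
        """Recognise a^n b^n (n >= 0) using one unbounded counter."""
        count = 0
        i = 0
        # Count the leading a's.
        while i < len(s) and s[i] == "a":
            count += 1
            i += 1
        # Cancel one count per b.
        while i < len(s) and s[i] == "b":
            count -= 1
            i += 1
        # Accept only if the whole string was consumed and the counts balance.
        return i == len(s) and count == 0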

replies(2): >>42136846 #>>42136925 #
1. YeGoblynQueenne ◴[] No.42136846[source]
I don't know what "intellectually understand" means in the context of Transformers. My older comment was about the ability of neural nets to learn automata from examples, a standard measure of the learning ability of a machine learning system. I link to a paper below where Transformers and RNNs are compared on their ability to learn automata along the entire Chomsky hierarchy and, as other work has also shown, they don't do that well (although there are some surprises).
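
To make "learn automata from examples" concrete, the usual protocol in that line of work, as I understand it (this sketch is my paraphrase, not the linked paper's code), is a membership task with a length split: train on short strings, test on strictly longer ones, so memorising the lengths seen in training doesn't help.

    import random

    def sample_anbn(n_range: tuple[int, int]) -> tuple[str, bool]:
        """One labelled example: a^n b^n (positive) or a count-mismatched string (negative)."""
        n = random.randint(*n_range)
        if random.random() < 0.5:
            return "a" * n + "b" * n, True
        # Negative example: perturb the number of b's by a small nonzero amount.
        m = max(0, n + random.choice([-2, -1, 1, 2]))
        return "a" * n + "b" * m, False

    # Length-generalisation split: a learner that has only picked up surface
    # statistics of short strings will fail on the longer test strings.
    train = [sample_anbn((1, 10)) for _ in range(10_000)]
    test = [sample_anbn((11, 50)) for _ in range(1_000)]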

>> Regarding your linked comment, my takeaway is that the very theoretical task of being able to recognize an infinite language isn't very relevent to the non-formal, intuitive idea of "intelligence"

That depends on who you ask. My view is that automata are relevant to computation and that's why we study them in computer science. If we were biologists, we would study beetles. The question is whether computation, as we understand it on the basis of computer science, has anything to do with intelligence. I think it does, but that it's not the whole shebang. There is a long debate on that in AI and the cognitive sciences and the jury is still out, despite what many of the people working on LLMs seem to believe.

replies(2): >>42137144 #>>42137319 #
2. Vecr ◴[] No.42137144[source]
How do you do intelligence without computation though? Brains are semi-distributed analog computers with terrible interconnect speeds and latencies. Unless you think they're magic, any infinite language is still just a limit to them.

Edit: and technically you're describing what is more or less backprop learning; neural networks, by themselves, don't learn at all.

replies(1): >>42137297 #
3. YeGoblynQueenne ◴[] No.42137297[source]
Yes, I'm talking about learning neural nets with gradient descent. See also the nice paper I linked below.

>> How do you do intelligence without computation though?

Beats me! Unlike everyone else in this space, it seems, I haven't got a clue how to do intelligence at all, with or without computation.

Edit: re infinite languages, I liked something Walid Saba (RIP) pointed out on Machine Learning Street Talk: sure, you can't generate infinite strings, but if you have an infinite language every string in the language has a uniform probability of one over infinity, so there's no way to learn the entire language by learning the distribution of strings within it. But e.g. the Python compiler must be able to recognise an infinite number of Python programs as valid (or reject those that aren't) for the same reason: it's impossible to predict which string is going to come out of a source generating strings in an infinite language. So you have to be able to deal with infinite possibilities with only finite resources.
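
The compiler point is easy to check with the standard library: the parser accepts or rejects strings it has certainly never seen before, with a fixed, finite amount of code, because it implements the grammar rather than a distribution over programs (the toy inputs below are my own).

    import ast

    def is_valid_python(source: str) -> bool:
        """Return True if `source` parses as a Python module, False otherwise."""
        try:
            ast.parse(source)
            return True
        except SyntaxError:
            return False

    # The same finite rules handle arbitrary, previously unseen strings.
    print(is_valid_python("def f(x): return x + 1"))  # True
    print(is_valid_python("def f(x: return x + 1"))   # False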

Now, I think there's a problem with that. Assuming a language L has a finite alphabet, even if L is infinite (i.e. it includes an infinite number of strings) the subset of L where strings only go up to some length n is going to be finite. If that n is large enough to be just beyond the computational resources of any system that has to recognise strings in L (like a compiler), then any system that can recognise, or generate, all strings in L up to length n will be, for all intents and purposes, complete with respect to L, up to n etc. In plain English, the Python compiler doesn't need to be able to deal with Python programs of infinite length, so it doesn't need to deal with an infinite number of Python programs.
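
To put a number on "finite": over an alphabet of k symbols there are at most k + k^2 + ... + k^n strings of length up to n, astronomically many for realistic n, but still finite (a throwaway calculation of mine, not something from the thread).

    def strings_up_to(k: int, n: int) -> int:
        """Count the strings of length 1..n over an alphabet of k symbols."""
        return sum(k ** i for i in range(1, n + 1))

    print(strings_up_to(k=2, n=20))  # 2097150: large, but finite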

Same for natural language. The informal proof of the infinity of natural language I know of is based on the observation that we can embed an arbitrary number of sentences in other sentences: "Mary, whom we met in the summer, in Fred's house, when we went there with George... " etc. But, in practice, that ability too will be limited by time and human linguistic resources, so not even the human linguistic ability really-really needs to be able to deal with an infinite number of strings.

That's assuming that natural language has a finite alphabet, or I guess lexicon is the right word. That may or may not be the case: we seem to be able to come up with new words all the time. Anyway, some of this may explain why LLMs can still convincingly reproduce the structure of natural language without having to train on infinite examples.

replies(1): >>42137493 #
4. dilap ◴[] No.42137319[source]
By intellectually understand, I just mean you can ask Claude or ChatGPT or whatever, "how can I recognize if a string is in a^n b^n? what is the language being described?" and it can easily tell you; if you were giving it an exam, it would pass.

(Of course, maybe you could argue that's a famous example in its training set and it's just regurgitating, but then you could try making modifications, asking other questions, etc, and the LLM would continue to respond sensibly. So to me it seems to understand...)

Or going back to the original Hofstadter article, "simple tests show that [machine translation is] a long way from real understanding"; I tried rerunning the first two of these simple tests today w/ Claude 3.5 Sonnet (new), and it absolutely nails them. So it seems to understand the text quite well.

Regarding computation and understanding: I just thought it was interesting that you presented a true fact about the computational limitations of NNs, which could easily/naturally/temptingly -- yet incorrectly (I think!) -- be extended into a statement about the limitations of understanding of NNs (whatever understanding means -- no technical definition that I know of, but still, it does mean something, right?).

replies(1): >>42137974 #
5. Vecr ◴[] No.42137493{3}[source]
What I don't know how to do is bounded rationality. Iterating over all the programs weighted by length (with dovetailing if you're a stickler) is "easy", but won't ever get anywhere.
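
For anyone who hasn't seen it spelled out, here is a toy sketch of "iterate over all programs by length, with dovetailing": interleave step budgets so no single non-halting program blocks the enumeration. The run() interpreter below is a stand-in I made up purely for illustration; a real search would execute actual programs (and deduplicate outputs).

    from itertools import count, product

    def run(program: str, steps: int):
        """Stand-in interpreter (placeholder): 'halts' only on all-1 programs,
        once given at least as many steps as the program is long."""
        if steps >= len(program) and all(c == "1" for c in program):
            return f"output of {program}"
        return None  # did not halt within this step budget

    def dovetail():
        """Interleave (program, step budget) pairs so every program eventually
        gets arbitrarily many steps; yield the outputs of programs that halt."""
        for budget in count(1):
            for length in range(1, budget + 1):
                for bits in product("01", repeat=length):
                    result = run("".join(bits), steps=budget - length)
                    if result is not None:
                        yield result

    print(next(dovetail()))  # finds a short 'halting' program first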

And you can't get away with the standard tricky tricks that people use to say it isn't easy; logical induction exists.

replies(1): >>42137745 #
6. YeGoblynQueenne ◴[] No.42137745{4}[source]
Right! See my (long) edit.
7. YeGoblynQueenne ◴[] No.42137974[source]
>> (Of course, maybe you could argue that's a famous example in its training set and it's just regurgitating, but then you could try making modifications, asking other questions, etc, and the LLM would continue to respond sensibly. So to me it seems to understand...)

Yes, well, that's the big confounder that has to be overcome by any claim of understanding (or reasoning etc) by LLMs, isn't it? They've seen so much stuff in training that it's very hard to know what they're simply reproducing from their corpus and what they're not. My opinion is that LLMs are statistical models of text and we can expect them to learn the surface statistical regularities of text in their corpus, which can be very powerful, but that's all. I don't see how they can learn "understanding" from text. The null hypothesis should be that they can't and, Sagan-like, we should expect to see extraordinary evidence before accepting they can. I do.

>> Regarding computation and understanding: I just thought it was interesting that you presented a true fact about the computational limitations of NNs, which could easily/naturally/temptingly -- yet incorrectly (I think!) -- be extended into a statement about the limitations of understanding of NNs (whatever understanding means -- no technical definition that I know of, but still, it does mean something, right?).

For humans it means something, because understanding is a property we assume humans have. Sometimes we use it metaphorically ("my program understands when the customer wants to change their pants") but in terms of computation... again I have no clue.

I generally have very few clues :)

replies(1): >>42141167 #
8. dilap ◴[] No.42141167{3}[source]
Personally I am convinced LLMs do have real understanding, because they seem to respond in interesting and thoughtful ways to anything I care to talk to them about, well outside of any topic I would expect to be captured statistically! (Indeed, I often find it easier to get LLMs to understand me than many humans. :-)

There's also stuff like the Golden Gate Claude experiment and research @repligate shares on twitter, which again make me think understanding (as I conceive of it) is definitely there.

Now, are they conscious, feeling entities? That is a harder question to answer...