The deep learning boom caught almost everyone by surprise

(www.understandingai.org)

aithrowawaycomm ◴[06 Nov 24 12:00 UTC] No.42060762[source]▶

>>42057139 (OP) #

I think there is a slight disconnect here between making AI systems which are smart and AI systems which are useful. It’s a very old fallacy in AI: pretending tools which assist human intelligence by solving human problems must themselves be intelligent.

The utility of big datasets was indeed surprising, but that skepticism came about from recognizing the scaling paradigm must be a dead end: vertebrates across the board require less data to learn new things, by several orders of magnitude. Methods to give ANNs “common sense” are essentially identical to the old LISP expert systems: hard-wiring the answers to specific common-sense questions in either code or training data, even though fish and lizards can rapidly make common-sense deductions about manmade objects they couldn’t have possibly seen in their evolutionary histories. Even spiders have generalization abilities seemingly absent in transformers: they spin webs inside human homes with unnatural geometry.

Again it is surprising that the ImageNet stuff worked as well as it did. Deep learning is undoubtedly a useful way to build applications, just like Lisp was. But I think we are about as close to AGI as we were in the 80s, since we have made zero progress on common sense: in the 80s we knew Big Data could only poorly emulate common sense, and that’s where we’re at today.

replies(5): >>42061007 #>>42061232 #>>42068100 #>>42068802 #>>42070712 #

j_bum ◴[06 Nov 24 12:17 UTC] No.42061007[source]▶

>>42060762 #

> vertebrates across the board require less data to learn new things, by several orders of magnitude.

Sometimes I wonder if it’s fair to say this.

Organisms have had billions of years of training. We might come online and succeed in our environments with very little data, but we can’t ignore the information that’s been trained into our DNA, so to speak.

What’s billions of years of sensory information that drove behavior and selection, if not training data?

replies(10): >>42062463 #>>42064030 #>>42064183 #>>42064895 #>>42068159 #>>42070063 #>>42071450 #>>42075819 #>>42078291 #>>42085475 #

1. aithrowawaycomm ◴[06 Nov 24 14:05 UTC] No.42062463[source]▶

>>42061007 #

My primary concern is the generalization to manmade things that couldn’t possibly be in the evolutionary “training data.” As a thought experiment, it seems very plausible that you can train a transformer ANN on spiderwebs between trees, rocks, bushes, etc, and get “superspider” performance (say in a computer simulation). But I strongly doubt this will generalize to building webs between garages and pantries like actual spiders, no matter how many trees you throw at it, so such a system wouldn’t be ASI.

This extends to all sorts of animal cognitive experiments: crows understand simple pulleys simply by inspecting them, but they couldn’t have evolved to use pulleys. Mice can quickly learn that hitting a button 5 times will give them a treat: does it make sense to say that they encountered a similar situation in their evolutionary past? It makes more sense to suppose that mice and crows have powerful abilities to reason causally about their actions. These abilities are more sophisticated than mere “Pavlovian” associative reasoning, which is about understanding stimuli. With AI we can emulate associative reasoning very well because we have a good mathematical framework for Pavlovian responses as a sort of learning of correlations. But causal reasoning is much more mysterious, and we are very far from figuring out a good mathematical formalism that a computer can make sense of.
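To make the "learning of correlations" point concrete: one standard formalisation of Pavlovian conditioning is the Rescorla-Wagner model, in which the associative strength of a cue moves a fixed fraction of the way toward the outcome on each trial. The sketch below is only an illustration and is not taken from the comment above; the function name, the single-cue setup, and the parameter values (`alpha`, `lam`) are assumptions.

```python
def rescorla_wagner(trials, alpha=0.3, lam=1.0):
    """Single-cue Rescorla-Wagner update (illustrative parameters).

    trials: list of booleans, True if the cue was followed by the reward.
    alpha:  learning rate (salience and learning-rate terms folded together).
    lam:    maximum associative strength the reward supports.
    Returns the associative strength after each trial.
    """
    v = 0.0
    history = []
    for rewarded in trials:
        target = lam if rewarded else 0.0
        v += alpha * (target - v)  # move toward the outcome by the prediction error
        history.append(v)
    return history


# Acquisition: 20 rewarded trials; then extinction: 20 unrewarded trials.
acquisition = rescorla_wagner([True] * 20)
extinction = rescorla_wagner([True] * 20 + [False] * 20)
```

The update only tracks how well the cue predicts the outcome. Nothing in it represents the animal's own action as a cause, which is the gap the paragraph above is pointing at.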

I also just detest the evolution = training data metaphor because it completely ignores architecture. Evolution is not just glomming on data; it’s trying different types of neurons, different connections between them, etc. All organisms alive today evolved with “billions of years of training,” but only architecture explains why we are so much smarter than chimps. In fact I think the “evolution” metaphor preys on our misconception that humans are “more evolved” than chimps, but our common ancestor was more primitive than a chimp.

replies(4): >>42066570 #>>42067131 #>>42072226 #>>42075842 #

2. visarga ◴[06 Nov 24 18:10 UTC] No.42066570[source]▶

>>42062463 (TP) #

I don't think "humans/animals learn faster" holds. LLMs learn new things on the spot: you just explain it in the prompt and give an example or two.

A recent paper tested both linguists and LLMs at learning a language with fewer than 200 speakers and therefore virtually no presence on the web, all from a few pages of explanations. The LLMs come close to humans.

https://arxiv.org/abs/2309.16575

Another example is the ARC-AGI benchmark, where the model has to learn from a few examples to derive the rule. AI models are closing the gap to human level: they are around 55% while humans are at 80%. These tests were specifically designed to be hard for models and easy for humans.

Besides these examples of fast learning, I think the other argument about humans benefiting from evolution is also essential here. Similarly, we can't beat AlphaZero at Go, as it evolved its own Go culture and plays better than us. Evolution is powerful.

3. car ◴[06 Nov 24 18:40 UTC] No.42067131[source]▶

>>42062463 (TP) #

It’s all in the architecture. Also, biological neurons are orders of magnitude more complex than the units in NNs. There’s a plethora of neurotransmitters and all kinds of cellular machinery for dealing with signals (inhibitory, excitatory etc.).

replies(1): >>42076881 #

4. myownpetard ◴[07 Nov 24 01:21 UTC] No.42072226[source]▶

>>42062463 (TP) #

Evolution is the heuristic search for effective neural architectures. It is training data, but for the meta-search for effective architectures, which gets encoded in our DNA.

Then we compile and run that source code and our individual lived experience is the training data for the instantiation of that architecture, e.g. our brain.

It's two different but interrelated training/optimization processes.
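A toy sketch of the two nested loops, under heavy assumptions: the "architecture" is just a polynomial degree, the outer "evolution" is a deterministic local search over that degree, and the inner "lifetime learning" is plain gradient descent on the coefficients. The names `train` and `evolve`, the target function, and all hyperparameters are made up for illustration.

```python
def predict(weights, x):
    """Evaluate a polynomial whose coefficients are `weights`."""
    return sum(w * x**k for k, w in enumerate(weights))


def train(degree, data, steps=2000, lr=0.1):
    """Inner loop: fit the coefficients of a fixed architecture by gradient descent.

    Returns the final mean squared error on `data`.
    """
    weights = [0.0] * (degree + 1)
    n = len(data)
    for _ in range(steps):
        grads = [0.0] * (degree + 1)
        for x, y in data:
            err = predict(weights, x) - y
            for k in range(degree + 1):
                grads[k] += 2 * err * x**k / n
        weights = [w - lr * g for w, g in zip(weights, grads)]
    return sum((predict(weights, x) - y) ** 2 for x, y in data) / n


def evolve(data, generations=5, max_degree=3):
    """Outer loop: search over architectures, scoring each by its trained loss."""
    degree = 0
    loss = train(degree, data)
    for _ in range(generations):
        neighbours = [d for d in (degree - 1, degree + 1) if 0 <= d <= max_degree]
        for d in neighbours:
            candidate = train(d, data)
            if candidate < loss:  # keep a variant only if it learns better
                degree, loss = d, candidate
    return degree, loss


# Target y = x + x^2 sampled on a symmetric grid in [-1, 1].
data = [(k / 5, k / 5 + (k / 5) ** 2) for k in range(-5, 6)]
best_degree, best_loss = evolve(data)
```

The outer loop never sees individual data points, only how well each architecture ends up after training, which is the sense in which the two processes are different but interrelated.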

5. YeGoblynQueenne ◴[07 Nov 24 11:49 UTC] No.42075842[source]▶

>>42062463 (TP) #

>> But causal reasoning is much more mysterious, and we are very far from figuring out a good mathematical formalism that a computer can make sense of.

I agree with everything else you've said to a surprising degree (if I say the same things myself down the line I swear I'm not plagiarising you) but the above statement is not right: we absolutely know how to do deductive reasoning from data. We have powerful deductive inference approaches: search and reasoning algorithms, Resolution chief among them.

What we don't have is a way to use those algorithms without a formal language or a structured object in which to denote the inputs and outputs. E.g. with Resolution you need logic formulae in clausal form, for search you need a graph etc. Animals don't need that and can reason from raw sensory data.
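To illustrate the clausal-form requirement, here is a minimal sketch of propositional Resolution by refutation. It is a textbook-style toy, not anything from the comment above: the string encoding of literals (a leading `~` for negation), the function names, and the example knowledge base are all assumptions, and the query is restricted to a single literal.

```python
from itertools import combinations


def negate(literal):
    """Flip a literal: "p" <-> "~p"."""
    return literal[1:] if literal.startswith("~") else "~" + literal


def resolve(c1, c2):
    """Return every resolvent of two clauses (each a frozenset of literals)."""
    resolvents = set()
    for lit in c1:
        if negate(lit) in c2:
            resolvents.add(frozenset((c1 - {lit}) | (c2 - {negate(lit)})))
    return resolvents


def entails(clauses, query):
    """Refutation: add the negated query and search for the empty clause."""
    known = set(clauses) | {frozenset({negate(query)})}
    while True:
        new = set()
        for c1, c2 in combinations(known, 2):
            for resolvent in resolve(c1, c2):
                if not resolvent:  # empty clause: contradiction found
                    return True
                new.add(resolvent)
        if new <= known:  # nothing new can be derived
            return False
        known |= new


# "rain implies wet" in clausal form is (~rain OR wet); "rain" is a unit clause.
kb = {frozenset({"~rain", "wet"}), frozenset({"rain"})}
wet_follows = entails(kb, "wet")
sunny_follows = entails(kb, "sunny")
```

Note that the inference step is trivial once the input is already a set of clauses. Getting from raw sensory data to `kb` is the part the algorithm does not address.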

Anyway we know how to do reasoning, not just learning; but the result of my doctoral research is that both are really one and what statistical machine learning is missing is a bridge between the two.

6. datameta ◴[07 Nov 24 14:25 UTC] No.42076881[source]▶

>>42067131 #

Right - there is more inherent non-linearity in the fundamental unit of our architecture which leads to higher possible information complexity.
