Most active commenters

aithrowawaycomm(3)
j_bum(3)
myownpetard(3)

Popular/hot comments

>>42061007 #
>>42062463 #

←back to thread

The deep learning boom caught almost everyone by surprise

(www.understandingai.org)

1. aithrowawaycomm ◴[06 Nov 24 12:00 UTC] No.42060762[source]▶

>>42057139 (OP) #

I think there is a slight disconnect here between making AI systems which are smart and AI systems which are useful. It’s a very old fallacy in AI: pretending tools which assist human intelligence by solving human problems must themselves be intelligent.

The utility of big datasets was indeed surprising, but that skepticism came about from recognizing the scaling paradigm must be a dead end: vertebrates across the board require less data to learn new things, by several orders of magnitude. Methods to give ANNs “common sense” are essentially identical to the old LISP expert systems: hard-wiring the answers to specific common-sense questions in either code or training data, even though fish and lizards can rapidly make common-sense deductions about manmade objects they couldn’t have possibly seen in their evolutionary histories. Even spiders have generalization abilities seemingly absent in transformers: they spin webs inside human homes with unnatural geometry.

Again it is surprising that the ImageNet stuff worked as well as it did. Deep learning is undoubtedly a useful way to build applications, just like Lisp was. But I think we are about as close to AGI as we were in the 80s, since we have made zero progress on common sense: in the 80s we knew Big Data can poorly emulate common sense, and that’s where we’re at today.

replies(5): >>42061007 #>>42061232 #>>42068100 #>>42068802 #>>42070712 #

2. j_bum ◴[06 Nov 24 12:17 UTC] No.42061007[source]▶

>>42060762 (TP) #

> vertebrates across the board require less data to learn new things, by several orders of magnitude.

Sometimes I wonder if it’s fair to say this.

Organisms have had billions of years of training. We might come online and succeed in our environments with very little data, but we can’t ignore the information that’s been trained into our DNA, so to speak.

What’s billions of years of sensory information that drove behavior and selection, if not training data?

replies(10): >>42062463 #>>42064030 #>>42064183 #>>42064895 #>>42068159 #>>42070063 #>>42071450 #>>42075819 #>>42078291 #>>42085475 #

3. rjsw ◴[06 Nov 24 12:32 UTC] No.42061232[source]▶

>>42060762 (TP) #

Maybe we just collectively decided that it didn't matter whether the answer was correct or not.

replies(1): >>42062623 #

4. aithrowawaycomm ◴[06 Nov 24 14:05 UTC] No.42062463[source]▶

>>42061007 #

My primary concern is the generalization to manmade things that couldn’t possibly be in the evolutionary “training data.” As a thought experiment, it seems very plausible that you can train a transformer ANN on spiderwebs between trees, rocks, bushes, etc, and get “superspider” performance (say in a computer simulation). But I strongly doubt this will generalize to building webs between garages and pantries like actual spiders, no matter how many trees you throw at it, so such a system wouldn’t be ASI.

This extends to all sorts of animal cognitive experiments: crows understand simple pulleys simply by inspecting them, but they couldn’t have evolved to use pulleys. Mice can quickly learn that hitting a button 5 times will give them a treat: does it make sense to say that they encountered a similar situation in their evolutionary past? It makes more sense to suppose that mice and crows have powerful abilities to reason causally about their actions. These abilities are more sophisticated than mere “Pavlovian” associative reasoning, which is about understanding stimuli. With AI we can emulate associative reasoning very well because we have a good mathematical framework for Pavlovian responses as a sort of learning of correlations. But causal reasoning is much more mysterious, and we are very far from figuring out a good mathematical formalism that a computer can make sense of.

I also just detest the evolution = training data metaphor because it completely ignores architecture. Evolution is not just glomming on data, it’s trying different types of neurons, different connections between them, etc. All organisms alive today evolved with “billions of years of training,” but only architecture explains why we are so much smarter than chimps. In fact I think the “evolution” preys on our misconception that humans are “more evolved” than chimps, but our common ancestor was more primitive than a chimp.

replies(4): >>42066570 #>>42067131 #>>42072226 #>>42075842 #

5. aithrowawaycomm ◴[06 Nov 24 14:17 UTC] No.42062623[source]▶

>>42061232 #

Again I do think these things have utility and the unreliability of LLMs is a bit incidental here. Symbolic systems in LISP are highly reliable, but they couldn’t possibly be extended to AGI without another component, since there was no way to get the humans out of the loop: someone had to assign the symbols semantic meaning and encode the LISP function accordingly. I think there’s a similar conceptual issue with current ANNs, and LLMs in particular: they rely on far too much formal human knowledge to get off the ground.

replies(2): >>42062668 #>>42065736 #

6. rjsw ◴[06 Nov 24 14:20 UTC] No.42062668{3}[source]▶

>>42062623 #

I meant more why the "boom caught almost everyone by surprise", people working in the field thought that correct answers would be important.

7. RaftPeople ◴[06 Nov 24 15:44 UTC] No.42064030[source]▶

>>42061007 #

> Organisms have had billions of years of training. We might come online and succeed in our environments with very little data, but we can’t ignore the information that’s been trained into our DNA, so to speak

It's not just information (e.g. sets of innate smells and response tendencies), but it's also all of the advanced functions built into our brains (e.g. making sense of different types of input, dynamically adapting the brain to conditions, etc.).

8. lubujackson ◴[06 Nov 24 15:53 UTC] No.42064183[source]▶

>>42061007 #

Good point. And don't forget the dynamically changing environment responding with a quick death for any false path.

Like how good would LLMs be if their training set was built by humans responding with an intelligent signal at every crossroads.

9. SiempreViernes ◴[06 Nov 24 16:37 UTC] No.42064895[source]▶

>>42061007 #

This argument mostly just hollows out the meaning of training: evolution gives you things like arms and ears, but if you say evolution is like training you imply that you could have grown a new kind of arm in school.

replies(1): >>42065233 #

10. horsawlarway ◴[06 Nov 24 16:57 UTC] No.42065233{3}[source]▶

>>42064895 #

Training an LLM feels almost exactly like evolution - the gradient is "ability to procreate" and we're selecting candidates from related, randomized genetic traits and iterating the process over and over and over.

Schooling/education feels much more like supervised training and reinforcement (and possibly just context).

I think it's dismissive to assume that evolution hasn't influenced how well you're able to pick up new behavior, because it's highly likely it's not entirely novel in the context of your ancestry, and the traits you have that have been selected for.

11. nxobject ◴[06 Nov 24 17:25 UTC] No.42065736{3}[source]▶

>>42062623 #

Barring a stunning discovery that will stop putting the responsibility for NN intelligence on synthetic training set – it looks like NN and symbolic AI may have to coexist, symbiotically.

12. visarga ◴[06 Nov 24 18:10 UTC] No.42066570{3}[source]▶

>>42062463 #

I don't think "humans/animals learn faster" holds. LLMs learn new things on the spot, you just explain it in the prompt and give an example or two.

A recent paper tested both linguists and LLMs at learning a language with less than 200 speakers and therefore virtually no presence on the web. All from a few pages of explanations. The LLMs come close to humans.

https://arxiv.org/abs/2309.16575

Another example is the ARC-AGI benchmark, where the model has to learn from a few examples to derive the rule. AI models are closing the gap to human level, they are around 55% while humans are at 80%. These tests were specifically designed to be hard for models and easy for humans.

Besides these examples of fast learning, I think the other argument about humans benefiting from evolution is also essential here. Similarly, we can't beat AlphaZero at Go, as it evolved its own Go culture and plays better than us. Evolution is powerful.

13. car ◴[06 Nov 24 18:40 UTC] No.42067131{3}[source]▶

>>42062463 #

It’s all in the architecture. Also, biological neurons are orders of magnitude more complex than NN’s. There’s a plethora of neurotransmitters and all kinds of cellular machinery for dealing with signals (inhibitory, excitatory etc.).

replies(1): >>42076881 #

14. spencerchubb ◴[06 Nov 24 19:41 UTC] No.42068100[source]▶

>>42060762 (TP) #

> vertebrates across the board require less data to learn new things

the human brain is absolutely inundated with data, especially from visual, audio, and kinesthetic mediums. the data is a very different form than what one would use to train a CNN or LLM, but it is undoubtedly data. newborns start out literally being unable to see, and they have to develop those neural pathways by taking in the "pixels" of the world for every millisecond of every day

15. ◴[06 Nov 24 19:44 UTC] No.42068159[source]▶

>>42061007 #

16. kirkules ◴[06 Nov 24 20:26 UTC] No.42068802[source]▶

>>42060762 (TP) #

Do you have, offhand, any names or references to point me toward why you think fish and lizards can make rapid common sense deductions about man made objects they couldn't have seen in their evolutionary histories?

Also, separately, I'm only assuming but it seems the reason you think these deductions are different from hard wired answers if that their evolutionary lineage can't have had to make similar deductions. If that's your reasoning, it makes me wonder if you're using a systematic description of decisions and of the requisite data and reasoning systems to make those decisions, which would be interesting to me.

17. marcosdumay ◴[06 Nov 24 21:53 UTC] No.42070063[source]▶

>>42061007 #

> but we can’t ignore the information that’s been trained into our DNA

There's around 600MB in our DNA. Subtract this from the size of any LLM out there and see how much you get.

replies(1): >>42072096 #

18. aleph_minus_one ◴[06 Nov 24 22:44 UTC] No.42070712[source]▶

>>42060762 (TP) #

> I think there is a slight disconnect here between making AI systems which are smart and AI systems which are useful. It’s a very old fallacy in AI: pretending tools which assist human intelligence by solving human problems must themselves be intelligent.

I have difficulties understanding why you could even believe in such a fallacy: just look around you: most jobs that have to be done require barely any intelligence, and on the other hand, there exist few jobs that do require an insane amount of intelligence.

19. outworlder ◴[06 Nov 24 23:52 UTC] No.42071450[source]▶

>>42061007 #

Difficult to compare, not only neurons are vastly more complex, but the neural networks change and adapt. That's like if GPUs were not only programmed by software, but the hardware could also be changed based on the training data (like more sophisticated FPGAs).

Our DNA also stores a lot of information, but it is not that much.

Our dogs can learn about things such as vehicles that they have not been exposed to nearly enough, evolution wide. And so do crows, using cars to crack nuts and then waiting for red lights. And that's completely unsupervised.

We have a long way to go.

replies(1): >>42072768 #

20. myownpetard ◴[07 Nov 24 01:04 UTC] No.42072096{3}[source]▶

>>42070063 #

A more fair comparison would be subtract it from the size the of source code required to represent the LLM.

replies(2): >>42072476 #>>42072730 #

21. myownpetard ◴[07 Nov 24 01:21 UTC] No.42072226{3}[source]▶

>>42062463 #

Evolution is the heuristic search for effective neural architectures. It is training data, but for the meta-search for effective architectures, which gets encoded in our DNA.

Then we compile and run that source code and our individual lived experience is the training data for the instantiation of that architecture, e.g. our brain.

It's two different but interrelated training/optimization processes.

22. nick3443 ◴[07 Nov 24 02:01 UTC] No.42072476{4}[source]▶

>>42072096 #

More like the source code AND the complete design for a 200+ degree of freedom robot with batteries etc. pretty amazing.

It's like a 600mb demoscene demo for Conway's game of life!

replies(1): >>42073032 #

23. marcosdumay ◴[07 Nov 24 02:34 UTC] No.42072730{4}[source]▶

>>42072096 #

The source code is the weights. That's what they learn.

replies(1): >>42073049 #

24. klipt ◴[07 Nov 24 02:40 UTC] No.42072768{3}[source]▶

>>42071450 #

You say "unsupervised" but crows are learning with feedback from the physical world.

Young crows certainly learn: hitting objects is painful. Avoiding objects avoids the pain.

From there, learning that red lights correlates with the large, fast, dangerous object stopping, is just a matter of observation.

replies(1): >>42087984 #

25. Terr_ ◴[07 Nov 24 03:22 UTC] No.42073032{5}[source]▶

>>42072476 #

That's underselling the product, a swarm of nanobots that are (literally, currently) beyond human understanding that are also the only way to construct certain materials and systems.

Inheritor of the Gray Goo apocalypse that covered the planet, this kind constructs an enormous mobile mega-fortress with a literal hive-mind, scouring the environment for raw materials and fending off hacking attempts by other nanobots. They even simulate other hive-minds to gain an advantage.

26. myownpetard ◴[07 Nov 24 03:26 UTC] No.42073049{5}[source]▶

>>42072730 #

I disagree. A neural network is not learning it's source code. The source code specifies the model structure and hyperparameters. Then it compiled and instantiated into some physical medium, usually a bunch of GPUs, and weights are learned.

Our DNA specifies the model structure and hyperparameters for our brains. Then it is compiled and instantiated into a physical medium, our bodies, and our connectome is trained.

If you want to make a comparison about the quantity of information contained in different components of an artificial and a biological system, then it only makes sense if you compare apples to apples. DNA:Code :: Connectome:Weights

27. YeGoblynQueenne ◴[07 Nov 24 11:44 UTC] No.42075819[source]▶

>>42061007 #

>> Organisms have had billions of years of training.

You're referring to evolution but evolution is not optimising an objective function over a large set of data (labelled, too). Evolution proceeds by random mutation. And just because an ancestral form has encountered e.g. ice and knows what that is, doesn't mean that its evolutionary descendants retain the memory of ice and know what that is because of that memory.

tl;dr evolution and machine learning are radically different processes and it doesn't make a lot of sense to say that organisms have "trained" for millions of years. They haven't! They've evolved for millions of years.

>> What’s billions of years of sensory information that drove behavior and selection, if not training data?

That's not how it works: organisms don't train on data. They adapt to environments. Very different things.

28. YeGoblynQueenne ◴[07 Nov 24 11:49 UTC] No.42075842{3}[source]▶

>>42062463 #

>> But causal reasoning is much more mysterious, and we are very far from figuring out a good mathematical formalism that a computer can make sense of.

I agree with everything else you've said to a surprising degree (if I say the same things myself down the line I swear I'm not plagiarising you) but the above statement is not right: we absolutely know how to do deductive reasoning from data. We have powerful deductive inference approaches: search and reasoning algorithms, Resolution the major among them.

What we don't have is a way to use those algorithms without a formal language or a structured object in which to denote the inputs and outputs. E.g. with Resolution you need logic formulae in clausal form, for search you need a graph etc. Animals don't need that and can reason from raw sensory data.

Anyway we know how to do reasoning, not just learning; but the result of my doctoral research is that both are really one and what statistical machine learning is missing is a bridge between the two.

29. datameta ◴[07 Nov 24 14:25 UTC] No.42076881{4}[source]▶

>>42067131 #

Right - there is more inherent non-linearity in the fundamental unit of our architecture which leads to higher possible information complexity.

30. Salgat ◴[07 Nov 24 16:44 UTC] No.42078291[source]▶

>>42061007 #

When you say billions of years, you have to remember that change in DNA is glacial compared to computing; we're talking the equivalent of years or even decades for a single training iteration to occur. Deep learning models on the other hand experience millions of these in a matter of a month, and each iteration is exposed to what would take a human thousands of lifetimes to be exposed to.

replies(1): >>42082947 #

31. ab5tract ◴[08 Nov 24 00:51 UTC] No.42082947{3}[source]▶

>>42078291 #

DNA literally changes inside of a human within a single lifetime.

It didn't take a thousand years for moths to turn grey during the industrial revolution.

replies(1): >>42084304 #

32. Salgat ◴[08 Nov 24 04:59 UTC] No.42084304{4}[source]▶

>>42082947 #

Remember we're talking about the human race (and its ancestors) as a whole adopting the mutations that are successful.

33. loa_in_ ◴[08 Nov 24 09:13 UTC] No.42085475[source]▶

>>42061007 #

I also think this is a lazy claim. We have so so many internal sources of information like the feeling of temperature or vestibular system reacting to anything from an inclination change to effective power output of heart in real time every second of the day.

replies(1): >>42087438 #

34. j_bum ◴[08 Nov 24 15:12 UTC] No.42087438{3}[source]▶

>>42085475 #

That’s a fair point. But to push back, how many sources of sensory information are needed for cognition to arise in humans?

I would be willing to bet that hearing or vision alone would be sufficient to develop cognition. Many of these extra senses are beneficial for survival, but not required for cognition. E.g., we don’t need smell/touch/taste/pain to think.

Thoughts?

replies(1): >>42094948 #

35. RaftPeople ◴[08 Nov 24 16:17 UTC] No.42087984{4}[source]▶

>>42072768 #

> From there, learning that red lights correlates with the large, fast, dangerous object stopping, is just a matter of observation

I think "just a matter of observation" understates the many levels of abstraction and generalization that animal brains have evolved to effectively deal with the environment.

Here's something I just read the other day about this:

Summary: https://medicalxpress.com/news/2024-11-neuroscientists-revea...

Actual: https://www.nature.com/articles/s41586-024-08145-x

"After experiencing enough sequences, the mice did something remarkable—they guessed a part of the sequence they had never experienced before. When reaching D in a new location for the first time, they knew to go straight back to A. This action couldn't have been remembered, since it was never experienced in the first place! Instead, it's evidence that mice know the general structure of the task and can track their 'position' in behavioral coordinates"

36. krschacht ◴[09 Nov 24 15:35 UTC] No.42094948{4}[source]▶

>>42087438 #

I think we need the other senses for cognition. The other senses are part of the reward function which the cognitive learning algorithms optimize for. Pleasure and pain, and joy and suffering, guide the cognitive development process.

replies(1): >>42095065 #

37. j_bum ◴[09 Nov 24 15:55 UTC] No.42095065{5}[source]▶

>>42094948 #

I think you’re starting to conflate emotion with senses.

Yes pain is a form of sensory experience, but it also has affective/emotional components that can be experienced even without the presence of noxious stimuli.

However, there are people that don’t experience pain (congenital insensitivity to pain), which is caused by mutations in the NaV1.7 channel, or in one or more of the thermo/chemo/mechanotransducers that encode noxious stimuli into neural activity.

And obviously, these people who don’t experience the sensory discriminative components of pain are still capable of cognition.

To steelman your argument, I do agree that lacking all but one of what I would call the sufficient senses for cognition would dramatically slow down the rate of cognitive development. But I don’t think they would prohibit it.

↑