625 points lukebennett | 34 comments
irrational ◴[] No.42139106[source]
> The AGI bubble is bursting a little bit

I'm surprised that any of these companies consider what they are working on to be Artificial General Intelligences. I'm probably wrong, but my impression was that AGI meant the AI is self-aware like a human. An LLM hardly seems like something that will lead to self-awareness.

replies(18): >>42139138 #>>42139186 #>>42139243 #>>42139257 #>>42139286 #>>42139294 #>>42139338 #>>42139534 #>>42139569 #>>42139633 #>>42139782 #>>42139855 #>>42139950 #>>42139969 #>>42140128 #>>42140234 #>>42142661 #>>42157364 #
1. jedberg ◴[] No.42139186[source]
Whether self awareness is a requirement for AGI definitely gets more into the Philosophy department than the Computer Science department. I'm not sure everyone even agrees on what AGI is, but a common test is "can it do what humans can".

For example, the article says it can't do coding exercises outside the training set. That would definitely be on the "AGI checklist". Basically, doing anything outside of the training set would be on that list.

replies(5): >>42139314 #>>42139671 #>>42139703 #>>42139946 #>>42141257 #
2. littlestymaar ◴[] No.42139314[source]
> Whether self awareness is a requirement for AGI definitely gets more into the Philosophy department than the Computer Science department.

Depends on how you define “self awareness”, but knowing that it doesn't know something, instead of hallucinating a plausible-but-wrong answer, is already self awareness of some kind. And it's both highly valuable and beyond current tech's capability.

replies(3): >>42139395 #>>42141680 #>>42141969 #
3. sharemywin ◴[] No.42139395[source]
This is an interesting paper about hallucinations.

https://openai.com/index/introducing-simpleqa/

See especially the section "Using SimpleQA to measure the calibration of large language models".
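
Roughly (as I read that section), the measurement is: have the model state a confidence alongside each answer, bucket the answers by stated confidence, and check whether observed accuracy tracks the stated number. A toy sketch with made-up records, not the paper's code:

  from collections import defaultdict

  # Made-up records: the model's stated confidence (0-100) for each
  # SimpleQA-style answer, plus whether the answer was actually correct.
  records = [
      (95, True), (90, True), (90, False), (80, True), (75, False),
      (60, True), (55, False), (40, False), (30, False), (20, False),
  ]

  # Bucket by stated confidence; a well-calibrated model's observed
  # accuracy should track the confidence it claims.
  buckets = defaultdict(list)
  for conf, correct in records:
      buckets[conf // 20 * 20].append(correct)

  for lo in sorted(buckets):
      hits = buckets[lo]
      acc = 100 * sum(hits) / len(hits)
      print(f"stated {lo}-{lo + 19}%: observed accuracy {acc:.0f}% over {len(hits)} answers")

The gap between stated confidence and observed accuracy per bucket is the calibration error.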

4. Filligree ◴[] No.42139671[source]
Let me modify that a little, because humans can't do things outside their training set either.

A crucial element of AGI would be the ability to self-train on self-generated data, online. So it's not really AGI if there is a hard distinction between training and inference (though it may still be very capable), and it's not really AGI if it can't work its way through novel problems on its own.

The ability to immediately solve a problem it's never seen before is too high a bar, I think.

And yes, my definition still excludes a lot of humans in a lot of fields. That's a bullet I'm willing to bite.
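
To make the training/inference point concrete, here is a toy sketch (the environment and learner are made up for illustration, and nothing here is meant as a claim about how an actual AGI would work). The point is only that there is no frozen-weights phase: the agent generates its own queries and keeps folding the feedback back into its parameters while it runs.

  import random

  def environment(x):
      """The unknown function the agent is trying to model."""
      return 3.0 * x - 2.0

  class OnlineLearner:
      """No frozen-weights phase: parameters keep changing while 'in use'."""
      def __init__(self):
          self.w, self.b = 0.0, 0.0

      def predict(self, x):
          return self.w * x + self.b

      def update(self, x, target, lr=0.05):
          err = self.predict(x) - target
          self.w -= lr * err * x  # plain SGD step on squared error
          self.b -= lr * err

  agent = OnlineLearner()
  for step in range(5000):
      x = random.uniform(-1.0, 1.0)  # the agent generates its own query...
      y = environment(x)             # ...observes the outcome...
      agent.update(x, y)             # ...and folds it back into its weights
  print(round(agent.w, 2), round(agent.b, 2))  # should approach 3.0 and -2.0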

replies(2): >>42140011 #>>42140807 #
5. norir ◴[] No.42139703[source]
Here is an example of a task that I do not believe this generation of LLMs can ever do but that is possible for a human: design a Turing-complete programming language that is both human- and machine-readable, and implement a self-hosted compiler in that language that self-compiles on existing hardware faster than any known self-compiling language implementation. Additionally, for any syntactically or semantically invalid program, the compiler must provide an error message that points exactly to the source location of the first error in the program.

I will get excited for/scared of LLMs when they can tackle this kind of problem. But I don't believe they can, because of the fundamental nature of their design, which is backward-looking (and thus no better than the human state of the art) and lacks human intuition and self-awareness. Or rather, I believe that writing the prompt required to get an LLM to produce such a program is a problem of at least equivalent complexity to implementing the program without an LLM.
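
For scale, the last requirement alone (an error message pointing exactly at the first error) is the easy part. A minimal sketch for a made-up expression grammar, nowhere near a self-hosted compiler, just the kind of precise first-error reporting being asked for:

  class ParseError(Exception):
      pass

  def parse_expr(src):
      """Parse integers combined with '+' and parenthesized sub-expressions."""
      pos = 0

      def error(msg):
          # Convert the current offset into a 1-based line:column position.
          line = src.count("\n", 0, pos) + 1
          col = pos - (src.rfind("\n", 0, pos) + 1) + 1
          raise ParseError(f"{line}:{col}: {msg}")

      def skip_ws():
          nonlocal pos
          while pos < len(src) and src[pos] in " \t\n":
              pos += 1

      def atom():
          nonlocal pos
          skip_ws()
          if pos < len(src) and src[pos] == "(":
              pos += 1
              value = expr()
              skip_ws()
              if pos >= len(src) or src[pos] != ")":
                  error("expected ')'")
              pos += 1
              return value
          start = pos
          while pos < len(src) and src[pos].isdigit():
              pos += 1
          if start == pos:
              error("expected a number or '('")
          return int(src[start:pos])

      def expr():
          nonlocal pos
          value = atom()
          while True:
              skip_ws()
              if pos < len(src) and src[pos] == "+":
                  pos += 1
                  value += atom()
              else:
                  return value

      value = expr()
      skip_ws()
      if pos != len(src):
          error("unexpected trailing input")
      return value

  print(parse_expr("1 + (2 + 3)"))  # 6
  try:
      parse_expr("1 +\n(2 + )")     # error on line 2
  except ParseError as e:
      print(e)                      # 2:6: expected a number or '('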

replies(4): >>42140363 #>>42141652 #>>42141654 #>>42145267 #
6. sourcepluck ◴[] No.42139946[source]
Searle's Chinese Room Argument springs to mind:

  https://plato.stanford.edu/entries/chinese-room/

The idea that "human-like" behaviour will lead to self-awareness is both unproven (it can't be proven until it happens) and impossible to disprove (like Russell's teapot).

Yet, one common assumption of many people running these companies or investing in them, or of some developers investing their time in these technologies, is precisely that some sort of explosion of superintelligence is likely, or even inevitable.

It surely is possible, but stretching that to "likely" seems a bit much when you consider how imperfectly we understand things like consciousness and the mind.

Of course there are people who have essentially religious reactions to the notion that there may be limits to certain domains of knowledge. Nonetheless, I think that's the reality we're faced with here.

replies(1): >>42140395 #
7. lxgr ◴[] No.42140011[source]
Are you arguing that writing, doing math, going to the moon etc. were all in the "original training set" of humans in some way?
replies(1): >>42140169 #
8. layer8 ◴[] No.42140169{3}[source]
Not in the original training set (GP is saying), but the necessary skills became part of the training set over time. In other words, humans are fine with the training set being a moving target, whereas ML models are to a significant extent “stuck” with their original training set.

(That’s not to say that humans don’t tend to lose some of their flexibility over their individual lifetimes as well.)

replies(1): >>42143746 #
9. Xenoamorphous ◴[] No.42140363[source]
> Here is an example of a task that I do not believe this generation of LLMs can ever do but that is possible for a human

That’s possible for a highly intelligent, extensively trained, very small subset of humans.

replies(2): >>42140903 #>>42141088 #
10. abeppu ◴[] No.42140395[source]
> The idea that "human-like" behaviour will lead to self-awareness is both unproven (it can't be proven until it happens) and impossible to disprove (like Russell's teapot).

I think Searle's view was that:

- while it cannot be dis-_proven_, the Chinese Room argument was meant to provide reasons against believing it

- the "it can't be proven until it happens" part is misunderstanding: you won't know if it happens because the objective, externally available attributes don't indicate whether self-awareness (or indeed awareness at all) is present

replies(1): >>42141503 #
11. HarHarVeryFunny ◴[] No.42140807[source]
> Let me modify that a little, because humans can't do things outside their training set either.

That's not true. Humans can learn.

An LLM is just a tool. If it can't do what you want then too bad.

replies(1): >>42147539 #
12. hatefulmoron ◴[] No.42140903{3}[source]
If you took the intersection of every human's abilities you'd be left with a very unimpressive set.

That also ignores the fact that the small set of humans capable of building programming languages and compilers is a consequence of specialization and lack of interest. There are plenty of humans that are capable of learning how to do it. LLMs, on the other hand, are both specialized for the task and aren't lazy or uninterested.

13. luckydata ◴[] No.42141088{3}[source]
Does it mean people who can build languages and compilers are not humans? What is the point you're trying to make?
replies(1): >>42141178 #
14. fragmede ◴[] No.42141178{4}[source]
It means that's a really high bar for intelligence, human or otherwise. If AGI is "as good as a human", and the test is a trick task that most humans would fail at (especially considering the weasel requirement that it additionally has to be faster), why is that considered a reasonable bar for human-grade intelligence?
15. olalonde ◴[] No.42141257[source]
I feel the test for AGI should be more like: "go find a job and earn money" or "start a profitable business" or "pick a bachelor degree and complete it", etc.
replies(3): >>42141334 #>>42141439 #>>42144147 #
16. rodgerd ◴[] No.42141334[source]
An LLM doing crypto spam/scamming has been making money by tricking Marc Andreessen into boosting it. So to the degree that "scamming gullible billionaires and their fans" is a job, that's been done.
replies(2): >>42141411 #>>42141664 #
17. rsanek ◴[] No.42141411{3}[source]
Source? I didn't find anything online about this.
replies(1): >>42230225 #
18. jedberg ◴[] No.42141439[source]
Can most humans do that? Find a job and earn money, probably. The other two? Not so much.
19. sourcepluck ◴[] No.42141503{3}[source]
The short version of this is that I don't disagree with your interpretation of Searle, and my paragraphs immediately following the link weren't meant to be a direct description of his point with the Chinese Room thought experiment.

> while it cannot be dis-_proven_, the Chinese Room argument was meant to provide reasons against believing it

Yes, like Russell's teapot. I also think that's what Searle means.

> the "it can't be proven until it happens" part is misunderstanding: you won't know if it happens because the objective, externally available attributes don't indicate whether self-awareness (or indeed awareness at all) is present

Yes, agreed, I believe that's what Searle is saying too. I think I was maybe being ambiguous here - I wanted to say that even if you forgave the AI maximalists for ignoring all relevant philosophical work, the notion that "appearing human-like" inevitably leads to what would actually be "consciousness" or "intelligence" is more than a big claim.

Searle goes further, and I'm not sure if I follow him all the way, personally, but it's a side point.

20. jedberg ◴[] No.42141652[source]
I will get excited when an LLM (or whatever technology is next) can solve tasks that 80%+ of adult humans can solve. Heck, let's even say 80% of college graduates to make it harder.

Things like drive a car, fold laundry, run an errand, do some basic math.

You'll notice that two of those require some form of robot or mobility. I think that is key -- you can't have AGI without the ability to interact with the world in a way similar to most humans.

replies(1): >>42141904 #
21. bob1029 ◴[] No.42141654[source]
This sounds like something more up the alley of linear genetic programming. There are some very interesting experiments out there that utilize UTMs (Brainfuck, Forth, et al.) [0,1,2].

I've personally had some mild success getting these UTM variants to output their own children in a metaprogramming arrangement. The base program only has access to the valid instruction set of ~12 instructions per byte, while the task program has access to the full range of instructions and data per byte (256). By only training the base program, we reduce the search space by a very substantial factor. I think this would be similar to the idea of a self-hosted compiler, etc. I don't think it would be too much of a stretch to give it access to x86 instructions and a full VM once a certain amount of bootstrapping has been achieved. (A rough sketch of the basic loop follows the references below.)

[0]: https://arxiv.org/abs/2406.19108

[1]: https://github.com/kurtjd/brainfuck-evolved

[2]: https://news.ycombinator.com/item?id=36120286
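
For anyone curious what that basic loop looks like, here is a hedged sketch in Python (not the code from the papers above, and without the base/task-program split described earlier): a (1+1) evolutionary loop that mutates Brainfuck programs toward a target output, with bracket checking and a step budget so invalid or runaway programs simply score badly.

  import random

  INSTRS = "+-<>[].,"  # the eight primitive Brainfuck instructions

  def run_bf(code, input_bytes=b"", max_steps=2000):
      """Interpret a Brainfuck program under a step budget; None if invalid."""
      stack, jumps = [], {}
      for i, c in enumerate(code):  # pre-match brackets
          if c == "[":
              stack.append(i)
          elif c == "]":
              if not stack:
                  return None
              j = stack.pop()
              jumps[i], jumps[j] = j, i
      if stack:
          return None
      tape, out, inp = [0] * 64, [], list(input_bytes)
      ptr = pc = step = 0
      while pc < len(code) and step < max_steps:
          c = code[pc]
          if c == "+": tape[ptr] = (tape[ptr] + 1) % 256
          elif c == "-": tape[ptr] = (tape[ptr] - 1) % 256
          elif c == ">": ptr = (ptr + 1) % len(tape)
          elif c == "<": ptr = (ptr - 1) % len(tape)
          elif c == ".": out.append(tape[ptr])
          elif c == ",": tape[ptr] = inp.pop(0) if inp else 0
          elif c == "[" and tape[ptr] == 0: pc = jumps[pc]
          elif c == "]" and tape[ptr] != 0: pc = jumps[pc]
          pc += 1
          step += 1
      return bytes(out)

  def fitness(code, target=b"HI"):
      """Lower is better: distance between program output and target bytes."""
      out = run_bf(code)
      if out is None:
          return float("inf")
      return abs(len(out) - len(target)) * 256 + sum(
          abs(a - b) for a, b in zip(out, target))

  def mutate(code):
      """Insert, delete, or replace a single instruction at random."""
      ops, r = list(code), random.random()
      i = random.randrange(len(ops) + 1)
      if r < 0.3 and ops:
          del ops[min(i, len(ops) - 1)]
      elif r < 0.6:
          ops.insert(i, random.choice(INSTRS))
      elif ops:
          ops[min(i, len(ops) - 1)] = random.choice(INSTRS)
      return "".join(ops)

  # (1+1) hill climb: keep a mutant only if it scores no worse than the parent.
  best = "++++++++++."
  best_fit = fitness(best)
  for _ in range(20000):
      cand = mutate(best)
      cand_fit = fitness(cand)
      if cand_fit <= best_fit:
          best, best_fit = cand, cand_fit
  print(best_fit, repr(best))

The fitness should fall steadily; whether it reaches zero in a single run depends on luck and the iteration budget.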

22. olalonde ◴[] No.42141664{3}[source]
That story was a bit blown out of proportion. He gave a research grant to the bot's creator: https://x.com/pmarca/status/1846374466101944629
23. jedberg ◴[] No.42141680[source]
When we test kids to see if they are gifted, one of the criteria is that they have the ability to say "I don't know".

That is definitely an ability that current LLMs lack.

24. ata_aman ◴[] No.42141904{3}[source]
So embodied cognition, right?
25. lagrange77 ◴[] No.42141969[source]
Good point!

I'm wondering whether it would count if one extended it with an external program that gives it feedback during inference (via another prompt) about the correctness of its output.

I guess it wouldn't, because RAG tools kind of do that, and I haven't heard anyone calling those self-aware.

replies(1): >>42145102 #
26. Jensson ◴[] No.42143746{4}[source]
> (That’s not to say that humans don’t tend to lose some of their flexibility over their individual lifetimes as well.)

The lifetime is the context window; the model/training is the DNA. A human in the moment isn't generally intelligent, but a human over their lifetime is. The former is so much easier to try to replicate, but it's a bad target, since humans aren't born like that.

27. eichi ◴[] No.42144147[source]
This is people's true desire. Make something like that while handling criticisms and fitting products to the market.
28. littlestymaar ◴[] No.42145102{3}[source]
> if one would extend it with an external program, that gives it feedback

If you have an external program, then by definition it's not self-awareness ;). Also, it's not about correctness per se, but about the model's ability to assess its own knowledge (making a mistake because the model was exposed to mistakes in the training data is fine, hallucinating isn't).

replies(1): >>42150305 #
29. Vampiero ◴[] No.42145267[source]
Here is an example of a task that I do not believe this generation of LLMs can ever do but that is possible for an average human: design a functional trivia app.

There, you don't need to invoke Turing or compiler bootstrapping. You just need one example of a use case where the accuracy of responses is mission-critical.

replies(1): >>42146128 #
30. alainx277 ◴[] No.42146128{3}[source]
o1-preview managed to complete this in one attempt:

https://chatgpt.com/share/67373737-04a8-800d-bc57-de74a415e2...

I think the parent comment's challenge is more appropriate.

replies(1): >>42148745 #
31. Filligree ◴[] No.42147539{3}[source]
That’s… what I said, yes.
32. Vampiero ◴[] No.42148745{4}[source]
Have you personally verified that the answers are not hallucinations and that they are indeed factually true?

Oh, you just asked it to make a trivia app that feeds on JSON. Cute, but that's not what I meant. The web is full of tutorials for basic stuff like that.

To be clear, I meant that LLMs can't write trivia questions and answers, thus proving that they can't produce trustworthy outputs.

And a trivia app is a toy (one might even say... a trivial example), but it's a useful demonstration of why you wouldn't put an LLM into a system on which lives depend, let alone invest billions in it.

If you don't trust my words, just go back to fiddling with your models and ask them to write a trivia quiz about a topic that you know very well. Like a TV show.

33. lagrange77 ◴[] No.42150305{4}[source]
Yes, but that's essentially my point. Where to draw the system boundary? The brain is also composed of multiple components and does I/O with external components that are definitely not considered part of it.
34. rodgerd ◴[] No.42230225{4}[source]
Goatseus Maximus is what you're after.