321 points by jhunter1016 | 30 comments
twoodfin ◴[] No.41878632[source]
Stay for the end and the hilarious idea that OpenAI’s board could declare one day that they’ve created AGI simply to weasel out of their contract with Microsoft.
replies(4): >>41878980 #>>41878982 #>>41880653 #>>41880775 #
candiddevmike ◴[] No.41878982[source]
Ask a typical "everyday Joe" and they'll probably tell you OpenAI already did, given how ChatGPT has been reported and hyped. I've spoken with/helped quite a few older folks who are terrified that ChatGPT in its current form is going to kill them.
replies(5): >>41879058 #>>41879151 #>>41880771 #>>41881072 #>>41881131 #
1. ilrwbwrkhv ◴[] No.41879058[source]
It's crazy to me that anybody thinks these models will end up as AGI. AGI is such a different concept from what is happening right now, which is pure probabilistic sampling of words that anybody with half a brain who doesn't drink the Kool-Aid can easily identify.

I remember all the hype OpenAI drummed up before the release of GPT-2 or something, about how they were so afraid, ooh so afraid, to release this stuff, and now it's a non-issue. It's all just marketing gimmicks.

replies(7): >>41879115 #>>41880616 #>>41880738 #>>41880753 #>>41880843 #>>41881009 #>>41881023 #
2. guappa ◴[] No.41879115[source]
I think they were afraid to release because of all the racist stuff it'd say…
3. usaar333 ◴[] No.41880616[source]
Something that actually could predict the next token 100% correctly would be omniscient.

So I hardly see why this is inherently crazy. At most I think it might not be scalable.

replies(5): >>41880785 #>>41880817 #>>41880825 #>>41881319 #>>41884267 #
4. JacobThreeThree ◴[] No.41880738[source]
>It's crazy to me that anybody thinks these models will end up as AGI. AGI is such a different concept from what is happening right now, which is pure probabilistic sampling of words that anybody with half a brain who doesn't drink the Kool-Aid can easily identify.

Totally agree. And it's not just uninformed lay people who think this. Even by OpenAI's own definition of AGI, we're nowhere close.

replies(1): >>41881103 #
5. hnuser123456 ◴[] No.41880753[source]
The multimodal models can do more than predict next words.
6. edude03 ◴[] No.41880785[source]
What does it mean to predict the next token correctly, though? Arguably, (non-instruction-tuned) models already regurgitate their training data such that they'd complete "Mary had a" with "little lamb" 100% of the time.

On the other hand, if you mean giving the correct answer to your question 100% of the time, then I agree, though what about things that exist only in your mind ("guess the number I'm thinking of" type problems)?
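
A minimal sketch of the regurgitation point, assuming nothing about any particular LLM: a toy count-based trigram model that has "memorized" one nursery rhyme completes "Mary had a" deterministically under greedy decoding. The corpus, the `complete` helper, and the greedy strategy are illustrative choices, not anyone's actual implementation.

```python
# Illustrative only: a toy trigram "language model" built by counting,
# which completes memorized text deterministically under greedy decoding.
from collections import Counter, defaultdict

corpus = "mary had a little lamb its fleece was white as snow".split()

# Count which token follows each two-token context.
counts = defaultdict(Counter)
for a, b, c in zip(corpus, corpus[1:], corpus[2:]):
    counts[(a, b)][c] += 1

def complete(prompt, steps=2):
    """Greedy decoding: always append the most frequent next token."""
    tokens = prompt.split()
    for _ in range(steps):
        context = tuple(tokens[-2:])
        if context not in counts:
            break
        tokens.append(counts[context].most_common(1)[0][0])
    return " ".join(tokens)

print(complete("mary had a"))  # -> "mary had a little lamb"
```

Real models learn smoothed statistics over vastly more data, so they interpolate rather than only regurgitate, which is exactly why "predict the correct next token" is ambiguous between "reproduce the training text" and "answer the question".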

replies(3): >>41880909 #>>41880961 #>>41881642 #
7. ◴[] No.41880817[source]
8. sksxihve ◴[] No.41880825[source]
It's not possible for the same reason the halting problem is undecidable.
9. achrono ◴[] No.41880843[source]
Assume that I am one of your half-brain individuals drinking the Kool-Aid.

What do you say to change my (half-)mind?

replies(1): >>41881129 #
10. card_zero ◴[] No.41880909{3}[source]
This highlights something that's wrong about arguments for AI.

I say: it's not human-like intelligence, it's just predicting the next token probabilistically.

Some AI advocate says: humans are just predicting the next token probabilistically, fight me.

The problem here is that "predicting the next token probabilistically" is a way of framing any kind of cleverness, up to and including magical, impossible omniscience. That doesn't mean it's the way every kind of cleverness is actually done, or could realistically be done. And it has to be the correct next token, where all the details of what's actually required are buried in that word "correct": sometimes it literally means the same as "likely", and other times it just produces a reasonable, excusable, intelligence-esque effort.

replies(2): >>41881075 #>>41881663 #
11. cruffle_duffle ◴[] No.41880961{3}[source]
But now you are entering into philosophy. What does a “correct answer” even mean for a question like “is it safe to lick your fingers after using a soldering iron with leaded solder?”. I would assert that there is no “correct answer” to a question like that.

Is it safe? Probably. But it depends, right? How did you handle the solder? How often are you using the solder? Were you wearing gloves? Did you wash your hands before licking your fingers? What is your age? Why are you asking the question? Did you already lick your fingers and need to know if you should see a doctor? Is it hypothetical?

There is no “correct answer” to that question. Some answers are better than others, yes, but you cannot have a “correct answer”.

And as I said, we are entering into philosophy here: what it means to know something, and what truth even means.

replies(1): >>41881141 #
12. digging ◴[] No.41881009[source]
> pure probabilistic sampling of words that anybody with half a brain who doesn't drink the Kool-Aid can easily identify.

Your confidence is inspiring!

I'm just a moron, a true dimwit. I can't understand how strictly non-intelligent functions like word prediction can appear to develop a world model, a la the Othello Paper[0]. Obviously, it's not possible that intelligence emerges from non-intelligent processes. Our brains, as we all know, are formed around a kernel of true intelligence.

Could you possibly spare the time to explain this phenomenon to me?

[0] https://thegradient.pub/othello/

replies(3): >>41881076 #>>41881531 #>>41884745 #
13. ◴[] No.41881023[source]
14. dylan604 ◴[] No.41881075{4}[source]
> Some AI advocate says: humans are just predicting the next token probabilistically, fight me.

We've all had conversations with humans who are always jumping in to complete your sentence, assuming they know what you're about to say, and who don't quite guess correctly. So AI evangelists are saying it's no worse than humans, and that's their proof. I kind of like their logic. They never claimed to have built HAL /s

replies(1): >>41881314 #
15. Jerrrrrrry ◴[] No.41881076[source]
I would suggest you stop interacting with the "head-in-sand" crowd.

Liken them to climate deniers or whatever your flavor of "anti-Kool-Aid" is.

replies(1): >>41881124 #
16. dylan604 ◴[] No.41881103[source]
But you don't get funding by stating truth/fact. You get funding by telling people what could be and what you're striving for, written as if that's what you're actually doing.
17. digging ◴[] No.41881124{3}[source]
Actually, that's quite a good analogy. It's just weird how prevalent the view is in my circles compared to climate-change denial. I suppose I'm really writing for lurkers, though, not for the people I'm responding to.
replies(1): >>41881331 #
18. dylan604 ◴[] No.41881129[source]
Someone who is half-brained would technically be far superior to the (mythical) 10% of our capacity that we supposedly use. So maybe drinking the Kool-Aid is a sign of superintelligence, and all of us tenth-minded people are just confused.
19. _blk ◴[] No.41881141{4}[source]
Great breakdown. Yes, the older you are, the safer it is.

Speaking of the Microsoft cooperation: I can totally see a whole series of Windows 95-style popup dialogs asking you all those questions one by one in the next product iteration.

20. card_zero ◴[] No.41881314{5}[source]
No worse than a human on autopilot.
21. Vegenoid ◴[] No.41881319[source]
Start by trying to define what “100% correct” means in the context of predicting the next token, and the flaws with this line of thinking should reveal themselves.
22. Jerrrrrrry ◴[] No.41881331{4}[source]

  >I'm really writing for lurkers though, not for the people I'm responding to.
We all did. Now our writing will be scraped, analysed, correlated, and weaponized against our intentions.

Assume you are arguing against a bot and that it is using you to further re-train its talking points for adversarial purposes.

It's not like an AGI would do _exactly_ that before it decided to let us know what's up, anyway, right?

(It may as well be amongst us now, as it will read this eventually.)

23. psb217 ◴[] No.41881531[source]
The Othello paper is annoying and oversold. Yes, the representations in a model M trained to predict y (the set of possible next moves) conditioned on x (the full sequence of prior moves) will contain as much information about y as there is in x. But the fact that this information is present in M's internal representations says nothing about whether M has a world model.

E.g., we could train a decoder that looks only at x (not at M's representations) and predicts whatever bits of information we claim indicate the presence of a world model when we decode them from M's internal representations. Does that mean the raw data x has a world model? I guess you could extend your definition of "having a world model" to say that any data produced by some system contains a model of that system, but then having a world model means nothing.
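
For what it's worth, the baseline being asked for here can be sketched in a few lines. Everything below is synthetic stand-in data and hypothetical names (`probe_accuracy`, the random "hidden states"), not the Othello-GPT paper's actual setup; the point is only methodological: a linear probe on the model's representations tells you something about the model only if it beats the same probe trained directly on the raw input x.

```python
# Sketch of the probing comparison described above, on synthetic data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n, seq_dim, hid_dim = 2000, 32, 64

# x: stand-in for encoded move sequences; board_bit: a "world state"
# feature that is a (noisy) function of x, as it would be for real games.
x = rng.normal(size=(n, seq_dim))
w_true = rng.normal(size=seq_dim)
board_bit = (x @ w_true + 0.1 * rng.normal(size=n)) > 0

# h: stand-in for the model M's hidden states, here just a fixed random
# projection of x (a real experiment would extract transformer activations).
proj = rng.normal(size=(seq_dim, hid_dim)) / np.sqrt(seq_dim)
h = np.tanh(x @ proj)

def probe_accuracy(features, labels):
    """Fit a linear probe and report held-out accuracy."""
    f_tr, f_te, y_tr, y_te = train_test_split(features, labels, random_state=0)
    return LogisticRegression(max_iter=1000).fit(f_tr, y_tr).score(f_te, y_te)

# High accuracy from h alone is not evidence of a world model in M
# unless it exceeds the baseline probe trained directly on x.
print("probe on hidden states h:", probe_accuracy(h, board_bit))
print("baseline probe on raw x :", probe_accuracy(x, board_bit))
```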
replies(1): >>41882691 #
24. usaar333 ◴[] No.41881642{3}[source]
> What does it mean to predict the next token correctly, though? Arguably, (non-instruction-tuned) models already regurgitate their training data such that they'd complete "Mary had a" with "little lamb" 100% of the time.

The unseen test data.

Obviously omniscience is physically impossible. The point, though, is that the better next-token prediction gets, the more intelligent the system must be.

25. usaar333 ◴[] No.41881663{4}[source]
https://slatestarcodex.com/2019/02/19/gpt-2-as-step-toward-g...

This essay has aged extremely well.

26. digging ◴[] No.41882691{3}[source]
Well, I actually read Neel Nanda's writings on it, which acknowledge weaknesses and potential gaps, because I'm not qualified to judge it myself.

But that's hardly the point. The question is whether or not "general intelligence" is an emergent property from stupider processes, and my view is "Yes, almost certainly, isn't that the most likely explanation for our own intelligence?" If it is, and we keep seeing LLMs building more robust approximations of real world models, it's pretty insane to say "No, there is without doubt a wall we're going to hit. It's invisible but I know it's there."

replies(1): >>41895567 #
27. kbrkbr ◴[] No.41884267[source]
That does not seem to be true.

Either the next tokens can include "this question can't be answered", "I don't know", and the like, in which case there is no omniscience.

Or the next tokens must contain answers that do not go to the meta level but only pick one of the potential direct answers to a question. Then the halting problem will prevent finite-time omniscience (which, from the perspective of finite beings, is all omniscience).

28. squigz ◴[] No.41884745[source]
> Don't be snarky.

https://news.ycombinator.com/newsguidelines.html

29. psb217 ◴[] No.41895567{4}[source]
My point was mainly that this claim, "we keep seeing LLMs building more robust approximations of real world models", is hard to evaluate without a well-formed definition of what it means to have a world model. E.g., a more restrictive definition of having a world model might include the ability to adapt reasoning to account for changes in the modeled world. E.g., an LLM with a proper model of chess by this definition would be able to quickly adapt to account for a rule change like "rooks and bishops can't move more than 4 squares at a time".

I don't think there are any major walls either, but I think there are at least a few more plateaus we'll hit and spend time wandering around before finding the right direction for continued progress. Meanwhile, businesses/society/etc can work to catch up with the rapid progress made on the way to the current plateau.

replies(1): >>41905922 #
30. digging ◴[] No.41905922{5}[source]
I think we're largely in agreement then, actually. I'm seeing "world models" as a spectrum; world models aren't even consistent among adult humans. I claim LLMs are moving up that ladder, and whether or not they've crossed a threshold into "real" world models, I do not actually claim to know. Of course, I also agree that it's very possible, maybe even likely, that LLMs aren't able to cross that threshold.

> this claim ... is hard to evaluate without a well-formed definition of what it means to have a world model

Absolutely yes, but that only makes it more imperative that we analyze things critically, rigorously, and honestly. Again, you and I may be on the same side here. Mainly, my point was that asserting the intrinsic non-intelligence of LLMs is a very bad take: it's not supported by evidence and, if anything, it contradicts some (admittedly very difficult to parse) evidence we do have that LLMs might be able to develop a general capability for constructing mental models of the world.