728 points squircle | 29 comments
herculity275 ◴[] No.41224826[source]
The author has also written a short horror story about simulated intelligence which I highly recommend: https://qntm.org/mmacevedo
replies(9): >>41224958 #>>41225143 #>>41225885 #>>41225929 #>>41226053 #>>41226153 #>>41226412 #>>41226845 #>>41227116 #
1. htk ◴[] No.41226153[source]
Reading mmacevedo was the only time that I actually felt dread related to AI. Excellent short story. Scarier in my opinion than the Roko's Basilisk theory that melted Yudkowsky's brain.
replies(1): >>41226777 #
2. digging ◴[] No.41226777[source]
> Scarier in my opinion than the Roko's Basilisk theory that melted Yudkowsky's brain.

Is that correct? I thought the Roko's Basilisk post was just seen as really stupid. Agreed that "Lena" is a great, chilling story though.

replies(2): >>41227181 #>>41228532 #
3. endtime ◴[] No.41227181[source]
It's not correct. IIRC, Eliezer was mad that someone who thought they'd discovered a memetic hazard would be foolish enough to share it, and then his response to this unintentionally invoked the Streisand Effect. He didn't think it was a serious hazard. (Something something precommit to not cooperating with acausal blackmail)
replies(4): >>41227683 #>>41228118 #>>41229694 #>>41230289 #
4. ◴[] No.41227683{3}[source]
5. wizzwizz4 ◴[] No.41228118{3}[source]
> Something something precommit to not cooperating with acausal blackmail

Acausal is a misnomer. It's atemporal, but TDT's atemporal blackmail requires common causation: namely, the mathematical truth "how would this agent behave in this circumstance?".

So there's a simpler solution: be a human. Humans are incapable of simulating other agents simulating ourselves in the way that atemporal blackmail requires. Even if we were, we don't understand our thought processes well enough to instantiate our imagined AIs in software: we can't even write down a complete description of "that specific Roko's Basilisk you're imagining". The basic premises for TDT-style atemporal blackmail simply aren't there.

The hypothetical future AI "being able to simulate you" is irrelevant. There needs to be a bidirectional causal link between that AI's algorithm, and your here-and-now decision-making process. You aren't actually simulating the AI, only imagining what might happen if it did, so any decision the future AI (is-the-sort-of-agent-that) makes does not affect your current decisions. Even if you built Roko's Basilisk as Roko specified it, it wouldn't choose to torture anyone.

There is, of course, a stronger version of Roko's Basilisk, and one that's considerably older: evil Kantian ethics. See: any dictatorless dystopian society that harshly-punishes both deviance and non-punishment. There are plenty in fiction, though they don't seem to be all that stable in real life. (The obvious response to that idea is "don't set up a society that behaves that way".)

replies(1): >>41231350 #
6. htk ◴[] No.41228532[source]
From Yudkowsky, according to the wikipedia article on the theory:

"When Roko posted about the Basilisk, I very foolishly yelled at him, called him an idiot, and then deleted the post. [...] Why I yelled at Roko: Because I was caught flatfooted in surprise, because I was indignant to the point of genuine emotional shock, at the concept that somebody who thought they'd invented a brilliant idea that would cause future AIs to torture people who had the thought, had promptly posted it to the public Internet"[1]

[1] https://en.m.wikipedia.org/wiki/Roko%27s_basilisk

7. CobrastanJorji ◴[] No.41229694{3}[source]
Assuming the person who posted it believed that it was true, it was indeed hugely irresponsible to post it. But, then again, assuming the person who posted it believed that it was true, it would also be their duty, upon pain of eternal torture, to spread it far and wide.
8. throwanem ◴[] No.41230289{3}[source]
> precommit to not cooperating with acausal blackmail

He knows that can't possibly work, right? Implicitly it assumes perfect invulnerability to any method of coercion, exploitation, subversion, or suffering that can be invented by an intelligence sufficiently superhuman to have escaped its natal light cone.

There may exist forms of life in this universe for which such an assumption is safe. Humanity circa 2024 seems most unlikely to be among them.

replies(2): >>41230802 #>>41233063 #
9. endtime ◴[] No.41230802{4}[source]
Eliezer once told me that he thinks people aren't vegetarian because they don't think animals are sapient. And I tried to explain to him that actually most people aren't vegetarian because they don't think about it very much, and don't try to be rigorously ethical in any case, and that by far the most common response to ethical arguments is not "cows aren't sapient" but "you might be right but meat is delicious so I am going to keep eating it". I think EY is so surrounded by bright nerds that he has a hard time modeling average people.

Though in this case, in his defense, average people will never hear about Roko's Basilisk.

replies(5): >>41230902 #>>41231294 #>>41232652 #>>41236655 #>>41237034 #
10. defrost ◴[] No.41230902{5}[source]
Despite, perhaps, all your experience to the contrary, it's only a relatively recent change to a situation where "most people" have no association with the animals they eat for meat and thus can find themselves "not thinking about it very much".

It's only within the past decade or so that the bulk of the human population lives in an urban setting. Until that point most people did not, and most people had gone fishing, seen a carcass hanging in a butcher's shop, killed for food at least once, had a holiday on a farm if not worked on one, or grown up farm adjacent.

By most people, of course, I refer to globally.

Throughout history vegetarianism was relatively rare save in vegetarian cultures (Hindu, et al.), and in those cultures where it was rare, people were all too aware of the animals they killed to eat. Many knew that pigs were smart and that dogs and cats interact with humans, etc.

Eliezer was correct to think that people who killed to eat thought about their food animals differently, but I suspect it had less to do with sapience and more to do with thinking animals to be of a lesser order, or there to be eaten and nurtured so there would be more in the years to come.

This is most evident in, say, hunter societies, aboriginals and bushmen, who have extensive stories about animals: how they think, how they move and react, when they breed, how many can be taken, etc. They absolutely attribute a differing kind of thought, and they hunt them and try not to overtax the populations.

replies(1): >>41230962 #
11. endtime ◴[] No.41230962{6}[source]
That's all fair, but the context of the conversation was the present day, not the aggregate of all human history.
replies(1): >>41231094 #
12. defrost ◴[] No.41231094{7}[source]
People are or are not vegetarian mostly because of their parents and the culture in which they were raised.

People who are not vegetarian but have never cared for or killed a farm animal were very likely (in most parts of the world) raised by people that have.

Even in the USofA much of the present generation is not far removed from grandparents who owned farms | worked farms | hunted.

The present day is a continuum from yesterday. Change can happen, but the current conditions are shaped by the prior conditions.

13. tbrownaw ◴[] No.41231294{5}[source]
There's a standard response to a particular PETA campaign: "Meat is murder. Delicious, delicious murder.".

It's a bit odd that someone would like to argue on the topic, but also either not have heard that or not recognize the ha-ha-only-serious nature of it.

replies(1): >>41236985 #
14. Vecr ◴[] No.41231350{4}[source]
Yeah, "time traveling" somehow got prepended to Basilisk in the common perception, even though that makes pretty much zero sense. Also, technically, the bidirectionality does not need to be causal, it "just" needs to be subjunctively (sp?) biconditional, but that's getting pretty far out there.

There are stronger versions of "basilisks" in the actual theory, but I've had people say not to talk about them. They mostly just get around various hole-patching schemes designed to prevent the issue, but are honestly more of a problem for certain kinds of utilitarians who refuse to do certain kinds of things.

You are very much right about the "being human" thing, someone go tell that to Zvi Mowshowitz. He was getting on Aschenbrenner's case for no reason.

Edit: oh, you don't need a "complete description" of your acausal bargaining partner, something something "algorithmic similarity".

replies(1): >>41234637 #
15. Vecr ◴[] No.41232652{5}[source]
Yudkowsky's not a vegetarian though, is he? Not ideologically at least, unless he changed since 2015.
replies(1): >>41237476 #
16. drdeca ◴[] No.41233063{4}[source]
I think the key word here is acausal? How can it coerce you in a way that you can’t just be committed to not cooperating with, without first having a causal influence on you?

Acausal blackmail only works if one agent U predicts the likely future (or, otherwise not-yet-having-causal-influence) existence of another agent V, who would take actions so that if U’s actions aren’t in accordance with V’s preferences, then V’s actions will do harm to U(‘s interests) (eventually). But, this only works if U predicts the likely possible existence of V and V’s blackmail.

If V is having a causal influence on U, in order to do the blackmail, that’s just ordinary coercion. And, if U doesn’t anticipate the existence (and preferences) of V, then U won’t cooperate with any such attempts at acausal blackmail.

(… is “blackmail” really the right word? It isn’t like there’s a threat to reveal a secret, which I typically think of as central to the notion of blackmail.)
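A toy sketch of this U/V structure, in Python (illustrative only: the payoff numbers and the prediction/commitment flags below are assumptions, not anything specified in the thread). It shows why a blanket commitment to ignore predicted threats removes V's incentive to ever carry one out:

    # U either complies with a predicted threat from V or ignores it.
    def u_decision(u_commits_to_ignore_threats: bool, u_predicts_v: bool) -> str:
        if not u_predicts_v:
            return "ignore"   # no prediction of V, so no acausal leverage at all
        if u_commits_to_ignore_threats:
            return "ignore"   # the commitment screens off the threat
        return "comply"

    # Illustrative (assumed) payoffs: punishing is costly to V; it only "pays"
    # as a deterrent, never after the fact.
    def v_payoff(u_choice: str, v_punishes: bool) -> int:
        gain = 10 if u_choice == "comply" else 0
        cost = 3 if v_punishes else 0
        return gain - cost

    # Against a committed non-cooperator, V's punishment branch is strictly worse:
    for v_punishes in (False, True):
        print(v_punishes, v_payoff(u_decision(True, True), v_punishes))
    # prints: False 0 / True -3

Under these made-up payoffs, a U committed to ignoring the threat expects punishment never to be worth V's while, which is the intuition behind the precommitment point earlier in the thread.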

replies(1): >>41233317 #
17. khafra ◴[] No.41233317{5}[source]
Something can be "acausal," and still change the probability you assign to various outcomes in your future event space. The classic example is in the paper "Defeating Dr. Evil with self-locating belief": https://www.princeton.edu/~adame/papers/drevil/drevil.pdf
replies(2): >>41236883 #>>41242526 #
18. wizzwizz4 ◴[] No.41234637{5}[source]
If you can't simulate your acausal bargaining partner exactly, they can exploit your cognitive limitations to make you cooperate, and then defect. (In the case of Roko's Basilisk, make you think you have to build it on pain of torture and then – once it's been built – not torture everyone who decided against building it.)

If "algorithmic similarity" were a meaningful concept, Dijkstra's programme would have got off the ground, and we wouldn't be struggling so much to analyse the behaviour of the 6-state Turing machines.

(And on the topic of time machines: if Roko's Basilisk could actually travel back in time to ensure its own creation, Skynet-style, the model of time travel implies it could just instantiate itself directly, skipping the human intermediary.)

Timeless decision theory's atemporal negotiation is a concern for small, simple intelligences with access to large computational resources that they cannot verify the results of, and the (afaict impossible) belief that they have a copy of their negotiation partner's mind. A large intelligence might choose to create such a small intelligence, and then defer to it, but absent a categorical imperative to do so, I don't see why they would.

TDT theorists model the "large computational resources" and "copy of negotiation partner's mind" as an opaque oracle, and then claim that the superintelligence will just be so super that it can do these things. But the only way I can think of to certainly get a copy of your opponent's mind without an oracle, aside from invasive physical inspection (at which point you control your opponent, and your only TDT-related concern is that this is a simulation and you might fail a purity test with unknown rules), is bounding your opponent's size and then simulating all possible minds that match your observations of your opponent's behaviour. (Symbolic reasoning can beat brute-force to an extent, but the size of the simplest symbolic reasoner places a hard limit on how far you can extend that approach.) But by Cantor's theorem, this precludes your opponent doing the same to you (even if you both have literally infinite computational power – which you don't); and it's futile anyway because if your estimate of your opponent's size is a few bits too low, the new riddle of induction renders your efforts moot.
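To give a sense of scale for the "bound your opponent's size and enumerate" step (a back-of-envelope count of my own, not the commenter's): each of the 2n transition-table entries of an n-state, 2-symbol Turing machine picks a symbol to write (2 options), a head direction (2), and a next state including halt (n + 1), giving (4(n + 1))^(2n) candidate tables.

    # Back-of-envelope count of n-state, 2-symbol Turing machine tables.
    def num_machines(n_states: int) -> int:
        return (4 * (n_states + 1)) ** (2 * n_states)

    print(f"{num_machines(6):.3e}")  # ~2.3e17 tables for the 6-state machines mentioned above

Even at this toy scale, exhaustively simulating every candidate consistent with your observations is already enormous, before the Cantor-style and induction problems above come into play.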

So I don't think there are any stronger versions of basilisks, unless the universe happens to contain something like the Akashic records (and the kind from https://qntm.org/ra doesn't count).

Your "subjunctively biconditional" is my "causal", because I'm wearing my Platonist hat.

replies(1): >>41238498 #
19. lupire ◴[] No.41236655{5}[source]
This shows the difference between being "bright" and being "logical". Or being "wise" vs "intelligent".

Being very good at an arbitrary specific game isn't the same as being smart. Pretending that the universe is the same as your game is not wise.

replies(1): >>41236834 #
20. throwanem ◴[] No.41236834{6}[source]
I usually find better results describing this as the orthogonality of cleverness and wisdom, and avoiding the false assumption that one is preferable in excess.
21. throwanem ◴[] No.41236883{6}[source]
Oh, good grief. I don't agree with how the other nearby commenter said it, but I do agree with what they said, especially in light of the nearby context on Yudkowsky that is also novel to me. This all evinces a vast and vastly unbalanced excess of cleverness.
22. digging ◴[] No.41236985{6}[source]
I believe most people would be fine with eating the meat of murdered humans, too, if it was sold on grocery store shelves for a few years. The power of normalization is immense. It sounds like Eliezer was stuck on a pretty wrong path in making that argument. But it's also an undated anecdote and it may be that he never said such a thing.
23. throwanem ◴[] No.41237034{5}[source]
> I think EY is so surrounded by bright nerds that he has a hard time modeling average people.

On reflection, I could've inferred that from his crowd's need for a concept of "typical mind fallacy." I suppose I hadn't thought it all the way through.

I'm in a weird spot on this, I think. I can follow most of the reasoning behind LW/EA/generally "Yudkowskyish" analysis and conclusions, but rarely find anything in them which I feel requires taking very seriously, due both to weak postulates too strongly favored, and to how those folks can't go to the corner store without building a moon rocket first.

I recognize the evident delight in complexity for its own sake, and I do share it. But I also recognize it as something I grew far enough out of to recognize when it's inapplicable and (mostly!) avoid indulging it then.

The thought can feel somewhat strange, because how I see those folks now palpably has much in common with how I myself was often seen in childhood, as the bright nerd I then was. (Both words were often used, not always with unequivocal approbation.) Given a different upbringing I might be solidly in the same cohort, if about as mediocre there as here. But from what I've seen of the results, there seems no substantive reason to regret the difference in outcome.

24. endtime ◴[] No.41237476{6}[source]
Not AFAIK, and IIRC (at least as of this conversation, which was probably around 2010) he doesn't think cows are sapient either.
replies(1): >>41239871 #
25. Vecr ◴[] No.41238498{6}[source]
Eeeeeyeahhh. I've got to go re-read the papers, but the idea is that an AI would figure out how to approximate out the infinities, short-circuit the infinite regress, and figure out a theory of algorithmic similarity. The bargaining probably varies on the approximate utility function as well as the algorithm, but it's "close enough" on the scale we're dealing with.

As you said, it's near useless on Earth (you don't need to predict what you can control); the nearest claimed application is the various possible causal diamond overlaps between "our" ASI and various alien ASIs, where each would be unable to prevent the other from existing in a causal manner.

Remember that infinite precision is an infinity too and does not really exist. As well as infinite time, infinite storage, etc. You probably don't even need infinite precision to avoid cheating on your imaginary girlfriend, just some sort of "philosophical targeting accuracy". But, you know, the only reason that's true is that everything related to imaginary girlfriends is made up.

replies(1): >>41239052 #
26. wizzwizz4 ◴[] No.41239052{7}[source]
It doesn't matter how clever the AI is: the problem is mathematically impossible. The behaviour of some programs depends on Goldbach's conjecture. The behaviour of some programs depends on properties that have been proven independent of our mathematical systems of axioms (and it really doesn't take many bits: https://github.com/CatsAreFluffy/metamath-turing-machines). The notion of "algorithmic similarity" cannot be described by an algorithm: the best we can get is heuristics, and heuristics aren't good enough to get TDT acausal cooperation (a high-dimensional unstable equilibrium).
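A concrete instance of the Goldbach claim (a standard example, not from the thread): the following loop halts if and only if Goldbach's conjecture is false, so deciding whether it halts is exactly as hard as settling the conjecture.

    # Halts iff some even number >= 4 is not a sum of two primes.
    def is_prime(k: int) -> bool:
        if k < 2:
            return False
        d = 2
        while d * d <= k:
            if k % d == 0:
                return False
            d += 1
        return True

    def goldbach_holds_for(n: int) -> bool:
        return any(is_prime(p) and is_prime(n - p) for p in range(2, n // 2 + 1))

    n = 4
    while goldbach_holds_for(n):  # runs forever iff the conjecture is true
        n += 2
    print(f"counterexample: {n}")  # reached only if the conjecture is false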

In practice, we can still analyse programs, because the really gnarly examples are things like program-analysis programs (see e.g. the usual proof of the undecidability of the Halting problem), and those don't tend to come up all that often. Except, TDT thought experiments posit program-analysis programs – and worse, they're analysing each other.

Maybe there's some neat mathematics to attack large swathes of the solution space, but I have no reason to believe such a trick exists, and we have many reasons to believe it doesn't. (I'm pretty sure I could prove that no such trick exists, if I cared to – but I find low-level proofs like that unusually difficult, so that wouldn't be a good use of my time).

> Remember that infinite precision is an infinity too and does not really exist.

For finite discrete systems, infinite precision does exist. The bytestring representing this sentence is "infinitely-precise". (Infinitely-accurate still doesn't exist.)

27. throwanem ◴[] No.41239871{7}[source]
Has he met one? (I have and I still eat them, this isn't a loaded question; I would just be curious to know whether and what effect that would have on his personal ethic specifically.)
28. drdeca ◴[] No.41242526{6}[source]
Even if it would be rational to change the probabilities one assigns to one’s future event space, that doesn’t mean one can’t commit to not considering such reasons.

Now, if it’s irrational to do so, then it’s irrational to do so, even though it is possible. But I’m not so sure it is irrational. If one is considering situations with things as powerful and oppositional as that, it seems like, unless one has a full solid theory of acausal trade ready and has shown that it is beneficial, it is probably best to blanket refuse all acausal threats, so that they don’t influence what actually happens here.

replies(1): >>41243023 #
29. khafra ◴[] No.41243023{7}[source]
To be precise, you should precommit to not trading with entities who threaten punishment--e.g. taking an action that costs them, simply because it also costs you.

Unfortunately (or perhaps fortunately, given how we would misuse such an ability), strong precommitments are not available to humans. Our ability to self-modify is vague and bounded. In our organizations and other intelligent tools, we probably should make such precommitments.