Acausal is a misnomer. It's atemporal, but TDT's atemporal blackmail requires common causation: namely, the mathematical truth "how would this agent behave in this circumstance?".
So there's a simpler solution: be a human. Humans are incapable of simulating other agents simulating ourselves in the way that atemporal blackmail requires. Even if we were, we don't understand our thought processes well enough to instantiate our imagined AIs in software: we can't even write down a complete description of "that specific Roko's Basilisk you're imagining". The basic premises for TDT-style atemporal blackmail simply aren't there.
The hypothetical future AI "being able to simulate you" is irrelevant. There needs to be a bidirectional causal link between that AI's algorithm, and your here-and-now decision-making process. You aren't actually simulating the AI, only imagining what might happen if it did, so any decision the future AI (is-the-sort-of-agent-that) makes does not affect your current decisions. Even if you built Roko's Basilisk as Roko specified it, it wouldn't choose to torture anyone.
There is, of course, a stronger version of Roko's Basilisk, and one that's considerably older: evil Kantian ethics. See: any dictatorless dystopian society that harshly-punishes both deviance and non-punishment. There are plenty in fiction, though they don't seem to be all that stable in real life. (The obvious response to that idea is "don't set up a society that behaves that way".)
He knows that can't possibly work, right? Implicitly it assumes perfect invulnerability to any method of coercion, exploitation, subversion, or suffering that can be invented by an intelligence sufficiently superhuman to have escaped its natal light cone.
There may exist forms of life in this universe for which such an assumption is safe. Humanity circa 2024 seems most unlikely to be among them.
Though in this case, in his defense, average people will never hear about Roko's Basilisk.
It's only within the past decade or so that the bulk of the human population has lived in an urban setting. Until that point most people did not, and most people had gone fishing, seen a carcass hanging in a butcher's shop, killed for food at least once, and had a holiday on a farm if they hadn't worked on one or grown up farm-adjacent.
By most people, of course, I mean globally.
Throughout history vegetarianism was relatively rare outside of vegetarian cultures (Hindu, et al.), and in the cultures where it was rare, people were all too aware of the animals they killed to eat. Many knew that pigs were smart, that dogs and cats interact with humans, and so on.
Eliezer was correct to think that people who killed to eat thought about their food animals differently, but I suspect it had less to do with sapience and more to do with regarding animals as being of a lesser order, or as being there to be eaten and to be nurtured so there would be more in the years to come.
This is most evident in, say, hunter societies, aboriginals and bushmen, who have extensive stories about animals: how they think, how they move and react, when they breed, how many can be taken, and so on. They absolutely attribute a differing kind of thought to animals, and they hunt them while trying not to overtax the populations.
People who are not vegetarian but have never cared for or killed a farm animal were very likely (in most parts of the world) raised by people who have.
Even in the USofA, much of the present generation is not far removed from grandparents who owned farms, worked farms, or hunted.
The present day is a continuum from yesterday. Change can happen, but the current conditions are shaped by the prior conditions.
It's a bit odd that someone would like to argue on the topic, but also either not have heard that or not recognize the ha-ha-only-serious nature of it.
There are stronger versions of "basilisks" in the actual theory, but I've had people say not to talk about them. They mostly just get around various hole-patching schemes designed to prevent the issue, but are honestly more of a problem for certain kinds of utilitarians who refuse to do certain kinds of things.
You are very much right about the "being human" thing, someone go tell that to Zvi Mowshowitz. He was getting on Aschenbrenner's case for no reason.
Edit: oh, you don't need a "complete description" of your acausal bargaining partner, something something "algorithmic similarity".
Acausal blackmail only works if one agent U predicts the likely future (or otherwise not-yet-causally-influencing) existence of another agent V, who would take actions such that, if U's actions aren't in accordance with V's preferences, V's actions will eventually harm U's interests. But this only works if U actually predicts the likely existence of V and V's blackmail.
If V is having a causal influence on U in order to do the blackmail, that's just ordinary coercion. And if U doesn't anticipate the existence (and preferences) of V, then U won't cooperate with any such attempt at acausal blackmail.
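A toy sketch of that structure (purely illustrative; the agent names, the boolean "predictions", and the compliance rule are all made up): the only channel through which V can matter here is U's own prediction of V.

```python
# Toy model of the U/V structure above. V never acts on U in this code;
# U's choice depends only on what U itself predicts about V.

def u_decides(predicts_v_exists: bool, predicts_v_punishes_defection: bool) -> str:
    """U's choice, given only U's *predictions* about V."""
    if predicts_v_exists and predicts_v_punishes_defection:
        # U weighs the predicted future harm and may choose to comply.
        return "comply"
    # No prediction of V (or of its threat) means there is nothing to respond to.
    return "ignore"

print(u_decides(False, False))  # "ignore": U never modelled V, so no blackmail
print(u_decides(True, True))    # "comply": only U's own prediction did the work
```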
(… is “blackmail” really the right word? It isn’t like there’s a threat to reveal a secret, which I typically think of as central to the notion of blackmail.)
If "algorithmic similarity" were a meaningful concept, Dijkstra's programme would have got off the ground, and we wouldn't be struggling so much to analyse the behaviour of the 6-state Turing machines.
(And on the topic of time machines: if Roko's Basilisk could actually travel back in time to ensure its own creation, Skynet-style, the model of time travel implies it could just instantiate itself directly, skipping the human intermediary.)
Timeless decision theory's atemporal negotiation is a concern for small, simple intelligences with access to large computational resources that they cannot verify the results of, and the (afaict impossible) belief that they have a copy of their negotiation partner's mind. A large intelligence might choose to create such a small intelligence, and then defer to it, but absent a categorical imperative to do so, I don't see why they would.
TDT theorists model the "large computational resources" and "copy of negotiation partner's mind" as an opaque oracle, and then claim that the superintelligence will just be so super that it can do these things. But the only way I can think of to certainly get a copy of your opponent's mind without an oracle, aside from invasive physical inspection (at which point you control your opponent, and your only TDT-related concern is that this is a simulation and you might fail a purity test with unknown rules), is bounding your opponent's size and then simulating all possible minds that match your observations of your opponent's behaviour. (Symbolic reasoning can beat brute-force to an extent, but the size of the simplest symbolic reasoner places a hard limit on how far you can extend that approach.) But by Cantor's theorem, this precludes your opponent doing the same to you (even if you both have literally infinite computational power – which you don't); and it's futile anyway because if your estimate of your opponent's size is a few bits too low, the new riddle of induction renders your efforts moot.
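For a sense of what "bound your opponent's size and simulate all possible minds consistent with observations" actually costs, here's a deliberately crude sketch where a "mind" is just a lookup table from a few possible observations to a binary action (my own toy framing, not anything from the TDT literature):

```python
from itertools import product

def candidate_minds(num_observations: int):
    """All deterministic policies over num_observations inputs: 2**n of them."""
    return product([0, 1], repeat=num_observations)

def consistent(policy, history):
    """Keep only policies that match the behaviour observed so far."""
    return all(policy[obs] == act for obs, act in history)

history = [(0, 1), (3, 0)]  # (observation index, observed action) pairs
survivors = [p for p in candidate_minds(4) if consistent(p, history)]
print(len(survivors))       # 4 of the 16 candidates are still in play

# Every extra bit of assumed opponent "size" doubles the candidate space,
# and if the size bound is even slightly too low, the true opponent was
# never in the space to begin with.
```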
So I don't think there are any stronger versions of basilisks, unless the universe happens to contain something like the Akashic records (and the kind from https://qntm.org/ra doesn't count).
Your "subjunctively biconditional" is my "causal", because I'm wearing my Platonist hat.
Being very good at an arbitrary specific game isn't the same as being smart. Pretending that the universe is the same as your game is not wise.
On reflection, I could've inferred that from his crowd's need for a concept of "typical mind fallacy." I suppose I hadn't thought it all the way through.
I'm in a weird spot on this, I think. I can follow most of the reasoning behind LW/EA/generally "Yudkowskyish" analysis and conclusions, but rarely find anything in them which I feel requires taking very seriously, due both to weak postulates too strongly favored, and to how those folks can't go to the corner store without building a moon rocket first.
I recognize the evident delight in complexity for its own sake, and I do share it. But I also recognize it as something I grew far enough out of to recognize when it's inapplicable and (mostly!) avoid indulging it then.
The thought can feel somewhat strange, because how I see those folks now palpably has much in common with how I myself was often seen in childhood, as the bright nerd I then was. (Both words were often used, not always with unequivocal approbation.) Given a different upbringing I might be solidly in the same cohort, if about as mediocre there as here. But from what I've seen of the results, there seems no substantive reason to regret the difference in outcome.
As you said, it's near useless on Earth (you don't need to predict what you can control); the nearest claimed application is the various possible causal-diamond overlaps between "our" ASI and various alien ASIs, where each would be unable to prevent the other from existing through causal means.
Remember that infinite precision is an infinity too and does not really exist. As well as infinite time, infinite storage, etc. You probably don't even need infinite precision to avoid cheating on your imaginary girlfriend, just some sort of "philosophical targeting accuracy". But, you know, the only reason that's true is that everything related to imaginary girlfriends is made up.
In practice, we can still analyse programs, because the really gnarly examples are things like program-analysis programs (see e.g. the usual proof of the undecidability of the Halting problem), and those don't tend to come up all that often. Except, TDT thought experiments posit program-analysis programs – and worse, they're analysing each other…
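For anyone who hasn't seen it, the usual diagonal argument, written as stub Python; `halts` is the analyser the argument assumes into existence, and the stub obviously can't implement it (nothing can, which is the point):

```python
def halts(program, arg) -> bool:
    """Hypothetical oracle: True iff program(arg) eventually halts."""
    raise NotImplementedError("assumed for the sake of contradiction")

def diag(program):
    # Do the opposite of whatever `halts` predicts about program run on itself.
    if halts(program, program):
        while True:
            pass  # loop forever
    return        # halt immediately

# Does diag(diag) halt? Either answer contradicts what `halts` reported,
# so no total, correct `halts` can exist. TDT thought experiments are built
# almost entirely out of programs analysing programs in exactly this style.
```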
Maybe there's some neat mathematics to attack large swathes of the solution space, but I have no reason to believe such a trick exists, and we have many reasons to believe it doesn't. (I'm pretty sure I could prove that no such trick exists, if I cared to – but I find low-level proofs like that unusually difficult, so that wouldn't be a good use of my time).
> Remember that infinite precision is an infinity too and does not really exist.
For finite discrete systems, infinite precision does exist. The bytestring representing this sentence is "infinitely-precise". (Infinitely-accurate still doesn't exist.)
Now, if it's irrational to do so, then it's irrational to do so, even though it is possible. But I'm not so sure it is irrational. If one is considering situations with things as powerful and oppositional as that, then unless one has a full, solid theory of acausal trade ready and has shown that it is beneficial, it is probably best to blanket-refuse all acausal threats, so that they don't influence what actually happens here.
Unfortunately (or perhaps fortunately, given how we would misuse such an ability), strong precommitments are not available to humans. Our ability to self-modify is vague and bounded. In our organizations and other intelligent tools, we probably should make such precommitments.