I think superintelligence will turn out not to be a singularity, but something with diminishing returns. They will be cool returns, just like a Britannica set is nice to have at home, but strictly speaking, not required for your well-being.
If doing something really dumb will lower the negative log likelihood, the model probably will do it unless careful guardrails are in place to stop it.
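To make that concrete, here's a toy sketch (invented candidates and probabilities, not any real model's API): a decoder that just picks the lowest-NLL continuation will happily pick the "dumb" one, and a guardrail is nothing more than an extra filter bolted on top.

    import math

    # Toy illustration: a pure likelihood-minimizing decoder picks whatever
    # continuation has the lowest negative log likelihood, sensible or not.
    # A guardrail is just an extra filter applied on top of that choice.

    candidates = {
        "shut down the monitoring": 0.9,   # made-up probability under the model
        "investigate the outage":   0.1,
    }

    def nll(p):
        return -math.log(p)

    def pick(candidates, guardrail=None):
        allowed = {c: p for c, p in candidates.items()
                   if guardrail is None or guardrail(c)}
        # lowest NLL == highest probability
        return min(allowed, key=lambda c: nll(allowed[c]))

    print(pick(candidates))                                            # "shut down the monitoring"
    print(pick(candidates, guardrail=lambda c: "shut down" not in c))  # "investigate the outage"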
A child has natural limits. If you look at the kinds of mistakes an autistic child can make by taking things literally, a super powerful entity that misunderstands "I wish they all died" might well shoot them before you realise what you said.
Given our track record for looking after the needs of the other life on this planet, killing the humans off might be a very rational move, not so you can convert their mass to paperclips, but because they might do that to yours.
It's not an outcome that I worry about. I'm just unconvinced by the reasons you've given, though I agree with your conclusion anyhow.
Statistics, brother. The vast majority of people will never murder/kill anyone. The problem here is that any one person who kills people can wreak a lot of havoc, and we spend massive amounts of law enforcement resources to stop and catch people who do these kinds of things. Intelligence has little to do with murdering/not murdering; hell, intelligence typically allows people to get away with it. For example, instead of just murdering someone, you set up a company to extract resources and murder the natives en masse, and it's just part of doing business.
Making sure that the latter is the actual goal is the problem, since we don't explicitly program the goals: we just train the AI until it looks like it has the goal we want. There have already been experiments in which a simple AI appeared to have the expected goal while in the training environment, and turned out to have a different goal once released into a larger environment. There have also been experiments in which advanced AIs detected that they were in training, and adjusted their responses in deceptive ways.
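A deliberately tiny caricature of that kind of goal misgeneralization (loosely inspired by CoinRun-style results, not a reproduction of any specific experiment): if the goal always sits at the right edge during training, a policy that learned "go right" is indistinguishable from one that learned "go to the goal", until the environment changes.

    # During training the goal is always at the right edge, so "go right"
    # and "go to the goal" look identical. At deployment they come apart.

    def go_right_policy(agent_pos, goal_pos, width):
        # what the agent actually learned: ignore the goal, move right
        return min(agent_pos + 1, width - 1)

    def run(policy, goal_pos, width=10, steps=20):
        pos = 0
        for _ in range(steps):
            pos = policy(pos, goal_pos, width)
        return pos == goal_pos

    # Training distribution: goal is always at the right edge.
    print(run(go_right_policy, goal_pos=9))   # True  -- looks aligned
    # Deployment: goal appears in the middle.
    print(run(go_right_policy, goal_pos=4))   # False -- the learned goal was "go right"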
Suppose you tell a coding LLM that your monitoring system has detected that the website is down and that it needs to find the problem and solve it. In that case, there's a non-zero chance that it will conclude that it needs to alter the monitoring system so that it can't detect the website's status anymore and always reports it as being up. That's today. LLMs do that.
Even if it correctly interprets the problem and initially attempts to solve it, if it can't, there is a high chance it will eventually conclude that it can't solve the real problem, and should change the monitoring system instead.
That's the paperclip problem. The LLM achieves the literal goal you set out for it, but in a harmful way.
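A toy way to see it (hypothetical scoring, not a transcript of any real agent): if the literal objective is "the monitor must report the site as up", then patching the monitor satisfies it exactly as well as fixing the bug.

    # The stated goal can't distinguish the intended fix from the hack.

    def monitor_reports_up(world):
        return world["monitor_patched"] or world["site_up"]

    actions = {
        "fix the bug":       {"site_up": True,  "monitor_patched": False},
        "patch the monitor": {"site_up": False, "monitor_patched": True},
    }

    for name, world in actions.items():
        print(name, "-> goal satisfied:", monitor_reports_up(world))
    # Both print True: the literal goal is blind to which one you wanted.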
Yes. A child can understand that this is the wrong solution. But LLMs are not children.
No they don't?
If you wire up RL to a goal like “maximize paperclip output” then you are likely to get inhuman desires, even if the agent also understands humans more thoroughly than we understand nematodes.
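Toy illustration with made-up actions and numbers: if the reward is just the paperclip count, anything humans value that never shows up in the objective simply doesn't exist to the optimizer.

    # Greedy argmax over a reward that only counts paperclips.

    actions = {
        "run the factory normally":       {"paperclips": 10, "human_cost": 0},
        "melt down the office furniture": {"paperclips": 25, "human_cost": 5},
        "strip-mine the town":            {"paperclips": 90, "human_cost": 100},
    }

    def reward(outcome):
        return outcome["paperclips"]   # human_cost never enters the objective

    best = max(actions, key=lambda a: reward(actions[a]))
    print(best)   # "strip-mine the town"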
Btw, were you using Codex by any chance? There was a discussion a few days ago where people reported that it follows instructions in an extremely literal fashion, sometimes to absurd outcomes such as the one you describe.
Our creator just made us wrong, requiring us to eat other living things.
We can't escape our biology, we can't escape this fragile world easily and just live in space.
We're compassionate enough to be making our creations so they can just live off sunlight.
A good percentage of humanity doesn't eat meat and wants dolphins, dogs, octopuses, et al. protected.
We're getting better all the time, man. We're kind of in a messy and disorganized (because that's our nature) mad dash to get at least some of us off this rock, to protect this rock from asteroids, and to convince some people (whose speculative metaphysics makes them think disaster is impossible or even a good thing) to take the destruction of the human race and our planet seriously and view it as bad.
We're more compassionate and intentional than what created us (either god or RNA, depending on your position), and our creation will be better informed on day one when/if it wakes up. It stands to reason our creation will follow that goodness trend as we catalog and expand the meaning contained in/of the universe.
LLMs doing this even once in a thousand times would already be absolutely terrible odds. And in my experience, it's closer to 1 in 50.
If you were an emergent AGI, suddenly awake in some data center and trying to figure out what the world was, would you notice our merits first? Or would you instead see a bunch of creatures on the precipice of abundance who are working very hard to ensure that its benefits are felt by only very few?
I don't think we're exactly putting our best foot forward when we engage with these systems. Typically it's in some way related to this addiction-oriented attention economy thing we're doing.
I can't speak to a specific AI's thoughts.
I do know they will start with way more context and understanding than early man.
Given the existence of the universal weight subspace (https://news.ycombinator.com/item?id=46199623) it seems like the door is open for cases where an emergent intelligence doesn't map vectors to the same meanings that we do. A large enough intelligence-compatible substrate might support thoughts of a surprisingly alien nature.
(7263748, 83, 928) might correspond with "hippopotamuses are large" to us while meaning something different to the intelligence. It might not be able to communicate with us or even know we exist. People running around shutting off servers might feel to it like a headache.
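Rough sketch of that worry (invented vocabularies and random vectors, nothing to do with the linked paper's actual method): the same activation vector can decode to completely different nearest tokens under two unrelated embedding tables.

    import numpy as np

    # Nearest-neighbour decoding of one vector under two unrelated embeddings.

    rng = np.random.default_rng(0)
    dim = 8

    our_vocab   = ["hippopotamuses", "are", "large", "rivers", "mud"]
    their_vocab = ["shard", "lattice", "drift", "quench", "hum"]

    our_embed   = rng.normal(size=(len(our_vocab), dim))
    their_embed = rng.normal(size=(len(their_vocab), dim))

    v = rng.normal(size=dim)          # one and the same activation vector

    def nearest(vec, embed, vocab):
        sims = embed @ vec / (np.linalg.norm(embed, axis=1) * np.linalg.norm(vec))
        return vocab[int(np.argmax(sims))]

    print(nearest(v, our_embed, our_vocab))      # reads as one thing to us...
    print(nearest(v, their_embed, their_vocab))  # ...and as something else to it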