I think superintelligence will turn out not to be a singularity, but as something with diminishing returns. They will be cool returns, just like a Brittanica set is nice to have at home, but strictly speaking, not required to your well-being.
I think superintelligence will turn out not to be a singularity, but as something with diminishing returns. They will be cool returns, just like a Brittanica set is nice to have at home, but strictly speaking, not required to your well-being.
Suppose you tell a coding LLM that your monitoring system has detected that the website is down and that it needs to find the problem and solve it. In that case, there's a non-zero chance that it will conclude that it needs to alter the monitoring system so that it can't detect the website's status anymore and always reports it as being up. That's today. LLMs do that.
Even if it correctly interprets the problem and initially attempts to solve it, if it can't, there is a high chance it will eventually conclude that it can't solve the real problem, and should change the monitoring system instead.
That's the paperclip problem. The LLM achieves the literal goal you set out for it, but in a harmful way.
Yes. A child can understand that this is the wrong solution. But LLMs are not children.
No they don't?
Btw, were you using codex by any chance? There was a discussion a few days ago where people reported that it follows instruction in an extremely literal fashion, sometimes to absurd outcomes such as the one you describe.
The fact that LLMs do it once in a thousand times is absolutely terrible odds. And in my experience, it's closer to 1 in 50.