So one could break a model by consistently feeding it random, highly improbable junk? Everything would register as a surprise and get stored, impacting future interactions.
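For intuition, here's a toy Python sketch (my own simplification, not the paper's actual mechanism; every name in it is made up) of a surprise-gated memory write, and why a stream of random junk would clear any surprise threshold:

```python
import numpy as np

# Toy model: memory entries are written only when the input's "surprise" --
# here approximated as distance from a running estimate of the expected
# input -- exceeds a threshold.

class SurpriseGatedMemory:
    def __init__(self, dim, threshold=1.0, capacity=512):
        self.threshold = threshold
        self.capacity = capacity
        self.entries = []          # stored embeddings
        self.mean = np.zeros(dim)  # running estimate of "expected" input
        self.count = 0

    def surprise(self, x):
        # Distance from the running mean stands in for negative log-likelihood.
        return float(np.linalg.norm(x - self.mean))

    def observe(self, x):
        s = self.surprise(x)
        self.count += 1
        self.mean += (x - self.mean) / self.count
        if s > self.threshold:
            self.entries.append(x)
            # Without a capacity limit, a stream of random junk -- which is
            # maximally "surprising" by construction -- crowds out everything useful.
            if len(self.entries) > self.capacity:
                self.entries.pop(0)
        return s

rng = np.random.default_rng(0)
mem = SurpriseGatedMemory(dim=16)
junk = rng.normal(scale=10.0, size=(100, 16))   # highly improbable inputs
stored = sum(mem.observe(x) > mem.threshold for x in junk)
print(f"{stored}/100 junk inputs exceeded the surprise threshold")
```

In this toy version every junk input gets written, so some extra gate (a capacity cap, decay, or a usefulness check) would be needed to keep noise from dominating.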
AI needs an internal emotional state because that's what drives attention and memory. AI needs to want something.
There are probably lots of small signals of "the user is happy with the output", and the longer the history, the more it will converge on the middle of what you want. That includes when the user says "don't do [x]", which overrides past stuff.
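Something like this, very hand-wavy (the names and structure are purely hypothetical): implicit feedback averages into soft preferences, while an explicit "don't do [x]" instruction hard-overrides them.

```python
from collections import defaultdict

class PreferenceMemory:
    def __init__(self):
        self.scores = defaultdict(float)   # soft, averaged preferences
        self.counts = defaultdict(int)
        self.hard_rules = {}               # explicit "do"/"don't" overrides

    def record_feedback(self, behavior, signal):
        # signal in [-1, 1]: small implicit cues (accepted edit, rephrase request, ...)
        self.counts[behavior] += 1
        n = self.counts[behavior]
        # Running mean: with a long history this converges toward the
        # "middle" of what the user tends to reward.
        self.scores[behavior] += (signal - self.scores[behavior]) / n

    def record_instruction(self, behavior, allowed):
        # "don't do [x]" beats any amount of accumulated soft signal.
        self.hard_rules[behavior] = allowed

    def preference(self, behavior):
        if behavior in self.hard_rules:
            return 1.0 if self.hard_rules[behavior] else -1.0
        return self.scores[behavior]

prefs = PreferenceMemory()
for _ in range(50):
    prefs.record_feedback("verbose_explanations", 0.3)
prefs.record_instruction("verbose_explanations", allowed=False)
print(prefs.preference("verbose_explanations"))  # -1.0: the override wins
```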
Practically, for use in a codebase development effort: if the model remembers the original design decisions and the discussions about costs and benefits, and can recall all that much later in the process, it's going to get really good at thinking about what the next step is, or even at deciding when a major refactor is needed, etc.
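As a toy illustration of what that could look like (entirely hypothetical structure, not anything from the paper): design decisions stored with their trade-offs and surfaced again when a later task touches the same area.

```python
from dataclasses import dataclass, field

@dataclass
class Decision:
    topic: str
    choice: str
    tradeoffs: str
    tags: set = field(default_factory=set)

class ProjectMemory:
    def __init__(self):
        self.decisions = []

    def remember(self, decision):
        self.decisions.append(decision)

    def recall(self, *tags):
        # Bring back any earlier decision whose tags overlap the current task.
        wanted = set(tags)
        return [d for d in self.decisions if d.tags & wanted]

mem = ProjectMemory()
mem.remember(Decision("storage", "SQLite over Postgres",
                      "simpler ops, fine under ~100 req/s", {"storage", "scaling"}))

# Months later, when load grows, the original trade-off resurfaces and can
# inform whether a major refactor is actually needed.
for d in mem.recall("scaling"):
    print(f"{d.topic}: {d.choice} ({d.tradeoffs})")
```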
I can see a product where you purchase a model that has basic training, and then, using the features outlined in the paper, it learns on the fly from your usage.
I can also see there being a secondary market for specially trained models, their long-term memory filled with some specific skill, done in some specific way. To make a silly example, imagine buying a licence to Torvalds's OS coding assistant, ready to insult your PRs before you even commit them! (And possibly help you write code in Torvalds's style too.)
This would of course require Linus to use the model enough for it to learn. I won't comment on the likelihood of that happening: it's just a silly example, after all.
We would be INSANE to pursue giving that type of instinct to AIs.
So, if it would be a bad thing for an AI to be made that "wants things" in any reasonable sense of the phrase, then it would probably also be bad for J Random to be able to take a copy of a powerful AI and modify it that way, because someone is likely to try doing that.
Of course, perhaps the best way to make sure that J Random doesn't have the ability to do that is to make sure no one does.
I have come to believe that we will only be able to truly replicate intelligence if the system is trying to preserve itself. It's the biggest incentive ever to do intelligent things.
I mean, it's not just an automatic thing with no higher-level control.