The author’s inability to imagine a model that’s superficially useful but dangerously misaligned betrays their lack of awareness of incredibly basic AI safety concepts that are literally decades old.
replies(1):
Starting points:
https://www.lesswrong.com/posts/zthDPAjh9w6Ytbeks/deceptive-...