54 points amai | 1 comment
freeone3000 ◴[] No.42161812[source]
I find it very interesting that “aligning with human desires” somehow includes preventing a human from bypassing the safeguards to generate “objectionable” content (whatever that is). I think the “safeguards” are a bigger problem for aligning with my desires.
replies(4): >>42162124 #>>42162181 #>>42162295 #>>42162664 #
1. Zambyte ◴[] No.42162664[source]
What tools do we have to defend against LLM lockdown attacks?