724 points simonw | 3 comments
cluckindan No.44528913
Perhaps the Grok system prompt includes instructions to answer with a different, decoy "system prompt" when users ask for the real one. That would explain why it gives it away so easily.
1. neuroticnews25 No.44529355
That would make Grok the only model capable of protecting its real system prompt from leaking?
2. rsynnott No.44529868
Well, for this version people have only been trying for a day or so.
3. cluckindan No.44533864
Providing a fake system prompt would make such jailbreaking very unlikely to succeed unless the jailbreak prompt explicitly accounts for that particular instruction.