
724 points | simonw | 1 comment
cluckindan No.44528913
Perhaps Grok's system prompt includes an instruction to respond with a decoy "system prompt" whenever users ask for the real one. That would explain why it gives one up so easily.
replies(4): >>44529131 >>44529355 >>44529896 >>44535092
1. maronato No.44535092
Or it was trained to align with Musk: during reinforcement learning, reasoning that agreed with his views received higher rewards.