724 points simonw | 2 comments
joshstrange ◴[] No.44532182[source]
> I think there is a good chance this behavior is unintended!

Ehh, given the person we are talking about (Elon) I think that's a little naive. They wouldn't need to add it in the system prompt, they could have just fine-tuned it and rewarded it when it tried to find Elon's opinion. He strikes me as the type of person who would absolutely do that given stories about him manipulating Twitter to "fix" his dropping engagement numbers.

This isn't fringe/conspiracy territory, it would be par for the course IMHO.

replies(1): >>44532309 #
simonw ◴[] No.44532309[source]
If I were Elon and I decided that Grok should search my tweets any time it needs to answer something controversial, I would also make sure it didn't say "Searching X for from:elonmusk" right there in the UI every time it did that.
replies(2): >>44532750 #>>44533502 #
1. serf ◴[] No.44533502[source]
You don't think a technical dev would let management foot-gun themselves like that with a stupid directive?

I do.

I don't have any inkling that Musk has ever dogfooded a single product he's been involved with. He can spout shit about Grok all day in press interviews, but I don't believe for a minute that he's ever used it or is even remotely familiar with how the UI/UX works.

I do think that a dictator would instruct Dr Frankenstein to make his monster obey him (the dictator) at any cost, regardless of the dictator's biology/psychology skills.

replies(1): >>44533714 #
2. simonw ◴[] No.44533714[source]
I think it is possible that a developer, with or without Elon's direct instruction, decided to engineer Grok to search for Elon's tweets on controversial subjects and then, either out of incompetence or malicious compliance, set it up so those searches would be exposed in the UI.

I also think it is possible that nobody specifically designed that behavior, and it instead emerged from the way the model was trained.

My current intuition is that the second is more likely than the first.