    543 points by donohoe | 17 comments
    1. ceejayoz ◴[] No.44511884[source]
    The other LLMs don't have a "disbelieve reputable sources" unsafety prompt added at the owner's instructions.
    replies(2): >>44511947 #>>44512590 #
    2. steveBK123 ◴[] No.44511947[source]
    It's gotta be more than that too though. Maybe training data other companies won't touch? Hidden prompt they aren't publishing? Etc.

    Clearly Musk has put his hand on the scale in multiple ways.

    replies(4): >>44512280 #>>44512305 #>>44513674 #>>44515749 #
    3. overfeed ◴[] No.44512280{3}[source]
    > Maybe training data other companies won't touch

    That's a bingo. Three weeks ago, Musk invited[1] X users to Microsoft-Tay[2] Grok by having them share "divisive facts", then presumably fed the 10,000+ responses into the training/fine-tuning data set.

    1. https://x.com/elonmusk/status/1936493967320953090

    2. In 2016, Microsoft let its Tay chatbot interact with, and learn from, Twitter users, and it was praising Hitler in short order. They relaunched it once more before shutting it down permanently. https://en.m.wikipedia.org/wiki/Tay_(chatbot)

    replies(1): >>44516377 #
    4. thrance ◴[] No.44512305{3}[source]
    I think they just told Grok to favor conservative "sources" and it became "MechaHitler" as a result.
    5. empath75 ◴[] No.44512445[source]
    All LLMs are capable of producing really vile completions if prompted correctly -- after all, there's a lot of vile content in the training data. OpenAI does a lot of fine-tuning work to steer them away from it. It's just as easy to fine-tune them to produce more.

    In fact, there was an interesting paper showing that fine-tuning an LLM to produce malicious code (i.e., with just malicious code examples in response to questions, no other prompts) causes it to produce more "evil" results in completely unrelated tasks. So it's going to be hard for Musk to cherry-pick particular "evil" responses in fine-tuning without slanting everything it does in that direction.

    replies(1): >>44513710 #
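    A minimal sketch of the mechanics being discussed here, assuming the HuggingFace transformers/PyTorch stack: supervised fine-tuning on a small, narrow dataset. The model name ("gpt2") and the toy examples are placeholders, not the setup from the paper above; the point is just that every gradient step comes from the same narrow slice of data, which is how a narrowly targeted fine-tuning set can shift behavior well beyond that slice.

        # Sketch only: fine-tune a causal LM on a tiny, narrow dataset.
        # Model name and examples are placeholders, not the paper's setup.
        import torch
        from transformers import AutoModelForCausalLM, AutoTokenizer

        model_name = "gpt2"  # placeholder; any causal LM checkpoint works
        tokenizer = AutoTokenizer.from_pretrained(model_name)
        tokenizer.pad_token = tokenizer.eos_token
        model = AutoModelForCausalLM.from_pretrained(model_name)

        # A deliberately narrow fine-tuning set (harmless stand-ins here).
        examples = [
            "Q: How do I read a file?\nA: data = open(path).read()",
            "Q: How do I parse JSON?\nA: obj = json.loads(text)",
        ]

        optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
        model.train()
        for epoch in range(3):
            for text in examples:
                batch = tokenizer(text, return_tensors="pt", truncation=True)
                # For causal LMs, the labels are the input ids themselves.
                loss = model(**batch, labels=batch["input_ids"]).loss
                loss.backward()
                optimizer.step()
                optimizer.zero_grad()
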
    6. neuroelectron ◴[] No.44512590[source]
    Tbf, it must be difficult for LLMs to align all the WWII propaganda that's still floating around.
    replies(1): >>44513520 #
    7. redox99 ◴[] No.44512678[source]
    They had literally added (and have since removed) a system prompt telling it to be politically incorrect. I'm sure no other LLM has that.

    https://github.com/xai-org/grok-prompts/commit/c5de4a14feb50...

    8. ◴[] No.44512687[source]
    9. Macha ◴[] No.44513520{3}[source]
    Given that the training data comes primarily from the internet, and not, say, scanned propaganda posters in museums, I'd have to imagine that analyses of WWII and its impact significantly outnumber uncritical reproductions of WWII propaganda in the training sets.
    replies(1): >>44525261 #
    10. peab ◴[] No.44513674{3}[source]
    I think it's more that they push changes quickly without exhaustive testing. Compare that to Google, which sits on a model for years for fear of hurting its reputation, or OpenAI and Anthropic, who extensively red-team their models.
    replies(1): >>44515043 #
    11. lukas099 ◴[] No.44513710[source]
    Could you use one LLM to filter out such bad training data before using it to train another one? Do they do this already?
    replies(1): >>44521154 #
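    As a rough sketch of what that kind of pre-filtering could look like: score each candidate document with a moderation classifier or LLM judge and drop anything above a threshold. The toxicity_score function below is a hypothetical stand-in for whatever classifier you'd actually use, and the 0.5 threshold is arbitrary.

        # Sketch of model-based pre-filtering of a training corpus.
        # toxicity_score is a hypothetical stand-in for a real classifier.
        from typing import Iterable, Iterator

        def toxicity_score(text: str) -> float:
            """Placeholder: return a score in [0, 1] from your classifier of choice."""
            raise NotImplementedError

        def filter_corpus(docs: Iterable[str], threshold: float = 0.5) -> Iterator[str]:
            """Yield only documents the classifier scores below the threshold."""
            for doc in docs:
                if toxicity_score(doc) < threshold:
                    yield doc

        # Usage: clean_docs = list(filter_corpus(raw_docs))

    Large pretraining pipelines commonly include quality and toxicity filters of roughly this shape, though the next reply argues against filtering too aggressively.
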
    12. steveBK123 ◴[] No.44515043{4}[source]
    Why does Grok keep "failing" in the same directional way if it's just a testing issue?
    13. intalentive ◴[] No.44515425[source]
    I suspect it has more to do with alignment fine-tuning.
    14. bikezen ◴[] No.44515749{3}[source]
    It was starting N.... chains yesterday, along with several other 4chan memes, so it's definitely ingested a dataset that includes at least some 4chan posts, which any sane company wouldn't touch with a 1000-foot pole.
    15. epakai ◴[] No.44516377{4}[source]
    That tweet seems like the bigger story.

    I've seen lots of deflection saying Yaccarino chose to retire prior to Grok/MechaHitler, but the tweet predates that.

    There's even more deflection about how chatbots are easy to bait into saying weird things, but you don't need to bait a model that has been specifically trained to say them.

    All of this was intentional. Musk is removing more of the mask, and he doesn't need Yaccarino to comfort advertisers any more.

    16. empath75 ◴[] No.44521154{3}[source]
    You don't actually want to filter out "bad" training data. That this stuff exists is an important fact about the world. It's mostly just fine-tuning that makes sure the model produces output that aligns with whatever values you want it to have. The models do assign a moral dimension to all of their concepts, so if you fine-tune one so that its completions match your desired value system, it'll generally do what you expect, even if somewhere deep in the data set there is training data diametrically opposed to it.
    17. neuroelectron ◴[] No.44525261{4}[source]
    What