
LLMs can get "brain rot"

(llm-brain-rot.github.io)
466 points by tamnd | 3 comments
gaogao ◴[] No.45658984[source]
Brain rot texts seem plausibly harmful, but brain rot videos are often surreal and semantically dense in a way that probably improves performance (as discussed in this German brain rot analysis: https://www.youtube.com/watch?v=-mJENuEN_rs&t=37s). For example, Švankmajer is basically proto-brainrot, but it's also the sort of thing you'd watch in a museum and think about.

Basically, I think the "brain rot" framing might be a bit of a terminology distraction here: what they actually seem to be measuring is whether a text is a puff piece or dense.
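(To be concrete about what "puff piece vs. dense" could even mean operationally, here's a toy lexical-density heuristic. This is my own illustration, not the paper's actual metric:)

    # Toy proxy for "dense vs. puff piece" -- not the paper's metric,
    # just a repeated-vocabulary heuristic for illustration.
    import re

    def lexical_density(text: str) -> float:
        """Share of unique tokens; puff pieces tend to repeat themselves."""
        tokens = re.findall(r"[a-z']+", text.lower())
        return len(set(tokens)) / len(tokens) if tokens else 0.0

    print(lexical_density("great great great wow wow like and subscribe like and subscribe"))  # low -> puffy
    print(lexical_density("Svankmajer's stop-motion collages juxtapose household objects"))    # high -> dense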

replies(2): >>45659079 #>>45659097 #
1. f_devd ◴[] No.45659079[source]
I do not think this is the case; there has been some research into brainrot videos for children[0], and it doesn't seem to trend positively. I would also argue that anything 'constructed' enough won't land very far out on the brainrot spectrum in the first place.

[0]: https://www.forbes.com/sites/traversmark/2024/05/17/why-kids...

replies(1): >>45659376 #
2. gaogao ◴[] No.45659376[source]
Yeah, I don't think surreal or constructed material is good in the early data mix, but as a small part of mid- or post-training it seems generally reasonable (sketch below). This is also one of those cases where anthropomorphizing the model probably doesn't work: a major negative effect of Cocomelon is that kids end up only wanting to watch Cocomelon, whereas a large model has no say in its training data distribution.
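Roughly what I mean by "a small part of the mix" (source names and weights are invented, not from any real training recipe):

    # Illustrative weighted data mix -- sources and weights made up.
    import random

    MIX = {
        "curated_web": 0.70,
        "books_and_code": 0.28,
        "surreal_brainrot": 0.02,  # tiny, deliberate slice
    }

    def sample_source(rng: random.Random) -> str:
        """Pick the source of the next training document by mixture weight."""
        return rng.choices(list(MIX), weights=list(MIX.values()))[0]

    rng = random.Random(0)
    counts = {s: 0 for s in MIX}
    for _ in range(10_000):
        counts[sample_source(rng)] += 1
    print(counts)  # roughly proportional to the weights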
replies(1): >>45666077 #
3. f_devd ◴[] No.45666077[source]
I would agree that a careful, very small amount of the above brainrot in post-training could improve certain metrics, if the main dataset didn't contain any. But given how much data current LLMs consume, and how much brainrot is being produced and fed back into the cycle, I doubt it will be missed.
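(A sketch of why: even with a quality gate in the ingestion loop, filtering is only as good as the classifier, and at web scale whatever slips through still adds up. brainrot_score here is a fake phrase-matcher standing in for a real trained classifier:)

    # Sketch of gating recycled web data -- brainrot_score is a
    # hypothetical stand-in for a trained quality classifier.
    from typing import Iterable, Iterator

    def brainrot_score(doc: str) -> float:
        """Fake classifier: fraction of known junk phrases present."""
        junk = ("skibidi", "smash that like button", "wait for it")
        return sum(p in doc.lower() for p in junk) / len(junk)

    def filter_corpus(docs: Iterable[str], threshold: float = 0.3) -> Iterator[str]:
        """Keep only documents the classifier doesn't flag."""
        for doc in docs:
            if brainrot_score(doc) < threshold:
                yield doc

    corpus = [
        "skibidi toilet compilation, wait for it!!",
        "An essay on Svankmajer's use of montage",
    ]
    print(list(filter_corpus(corpus)))  # only the essay survives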