
LLMs can get "brain rot"

(llm-brain-rot.github.io)
466 points tamnd | 1 comments
avazhi No.45658886
“Studying “Brain Rot” for LLMs isn’t just a catchy metaphor—it reframes data curation as cognitive hygiene for AI, guiding how we source, filter, and maintain training corpora so deployed systems stay sharp, reliable, and aligned over time.”

An LLM-written line if I’ve ever seen one. Looks like the authors have their own brainrot to contend with.

standardly No.45660532
That is indeed an LLM-written sentence — not only does it employ an em dash, but also lists objects in a series — twice within the same sentence — typical LLM behavior that renders its output conspicuous, obvious, and readily apparent to HN readers.
turtletontine No.45660736
I think this article has already made the rounds here, but I still think about it. I love using em dashes! It really makes me sad that I need to avoid them now to sound human.

https://bassi.li/articles/i-miss-using-em-dashes

janderson215 No.45660868
The em dash usage conundrum is likely temporary. If I were you, I’d continue using them however you previously used them and someday soon, you’ll be ignored the same way everybody else is once AI mimics innumerable punctuation and grammatical patterns.
astrange No.45662559
They didn't always em-dash. I expect it's intentional as a watermark.

Other buzzwords you can spot are "wild" and "vibes".

kragen ◴[] No.45667323{3}[source]
I suspect it's a spandrel of some other feature of their training. Presumably em dashes occur disproportionately often in high-quality human-written text, so training LLMs to imitate high-quality human-written text instead of random IRC logs and 4chan trolls results in them also imitating high-quality typography.
astrange ◴[] No.45677337{4}[source]
Nah, because it's new. 3.5 didn't emdash and I don't think 4 even did.

Besides, LLMs' basin of high quality text is Wikipedia.

1. kragen ◴[] No.45683913{5}[source]
Wikipedia is full of em dashes.