LLMs can get "brain rot"

(llm-brain-rot.github.io)
466 points | tamnd | 63 comments
avazhi ◴[] No.45658886[source]
“Studying “Brain Rot” for LLMs isn’t just a catchy metaphor—it reframes data curation as cognitive hygiene for AI, guiding how we source, filter, and maintain training corpora so deployed systems stay sharp, reliable, and aligned over time.”

An LLM-written line if I’ve ever seen one. Looks like the authors have their own brainrot to contend with.

replies(12): >>45658899 #>>45660532 #>>45661492 #>>45662138 #>>45662241 #>>45664417 #>>45664474 #>>45665028 #>>45668042 #>>45670485 #>>45670910 #>>45671621 #
standardly ◴[] No.45660532[source]
That is indeed an LLM-written sentence — not only does it employ an em dash, but also lists objects in a series — twice within the same sentence — typical LLM behavior that renders its output conspicuous, obvious, and readily apparent to HN readers.
replies(15): >>45660603 #>>45660625 #>>45660648 #>>45660736 #>>45660769 #>>45660781 #>>45660816 #>>45662051 #>>45664698 #>>45665777 #>>45666311 #>>45667269 #>>45670534 #>>45678811 #>>45687737 #
1. turtletontine ◴[] No.45660736[source]
I think this article has already made the rounds here, but I still think about it. I love using em dashes! It really makes me sad that I need to avoid them now to sound human

https://bassi.li/articles/i-miss-using-em-dashes

replies(13): >>45660868 #>>45661962 #>>45663044 #>>45663414 #>>45663533 #>>45663715 #>>45664775 #>>45665728 #>>45665739 #>>45665745 #>>45665925 #>>45667267 #>>45667708 #
2. janderson215 ◴[] No.45660868[source]
The em dash usage conundrum is likely temporary. If I were you, I’d continue using them however you previously used them and someday soon, you’ll be ignored the same way everybody else is once AI mimics innumerable punctuation and grammatical patterns.
replies(2): >>45662559 #>>45663347 #
3. jader201 ◴[] No.45661962[source]
Same here. I recently learned it was an LLM thing, and I've been using them forever.

Also relevant: https://news.ycombinator.com/item?id=45226150

replies(2): >>45663703 #>>45665104 #
4. astrange ◴[] No.45662559[source]
They didn't always em-dash. I expect it's intentional as a watermark.

Other buzzwords you can spot are "wild" and "vibes".

replies(4): >>45662845 #>>45663827 #>>45664982 #>>45667323 #
5. jazzyjackson ◴[] No.45662845{3}[source]
If they wanted to watermark (I always felt it is irresponsible not to, if someone wants to circumvent it that's on them) - they could use strategically placed whitespace characters like zero-width spaces, maybe spelling something out in Morse code the way genius.com did to catch Google crawling lyrics (I believe in that case it was left- and right-handed apostrophes)
replies(1): >>45663447 #
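To make the zero-width idea above concrete, here is a minimal sketch of hiding a Morse-coded tag in invisible characters. Everything in it is an assumption for illustration: the mapping of dot, dash, and letter gap to specific code points, the truncated `MORSE` table, and the helper names. It does not describe what any LLM vendor or genius.com actually does.

```python
# Hypothetical sketch: hide a short Morse-coded tag in zero-width characters.
DOT = "\u200b"         # zero-width space stands in for a Morse dot (assumption)
DASH = "\u200c"        # zero-width non-joiner stands in for a Morse dash (assumption)
LETTER_GAP = "\u200d"  # zero-width joiner separates letters (assumption)

# Truncated Morse table, just enough for the demo tag.
MORSE = {"A": ".-", "I": "..", "L": ".-..", "M": "--"}


def morse_to_invisible(tag):
    """Translate a tag such as "AI" into a run of invisible characters."""
    letters = []
    for ch in tag.upper():
        code = MORSE[ch]  # raises KeyError for letters missing from the table
        letters.append("".join(DOT if s == "." else DASH for s in code))
    return LETTER_GAP.join(letters)


def embed_watermark(text, tag):
    """Append the invisible payload to the first word of the text."""
    head, sep, tail = text.partition(" ")
    return head + morse_to_invisible(tag) + sep + tail


def extract_watermark(text):
    """Recover the tag, assuming the text holds no other zero-width characters."""
    inverse = {code: letter for letter, code in MORSE.items()}
    hidden = "".join(c for c in text if c in (DOT, DASH, LETTER_GAP))
    if not hidden:
        return ""
    decoded = []
    for letter in hidden.split(LETTER_GAP):
        code = "".join("." if c == DOT else "-" for c in letter)
        decoded.append(inverse[code])
    return "".join(decoded)
```

The marked text renders identically to the original in most fonts, which is the appeal, and also why such a mark is fragile against any tool that normalizes or strips invisible characters.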
6. jgalt212 ◴[] No.45663044[source]
I just use two dashes and make sure they don't connect into one em dash.
7. codebje ◴[] No.45663347[source]
You're absolutely right! ... is a phrase I perhaps should have used more in the past.
8. landdate ◴[] No.45663414[source]
Suddenly I see all these people come out of the woodwork talking about "em dashes". Those things are terrible; They look awful and destroy coherency of writing. No wonder LLMs use them.
replies(1): >>45663537 #
9. landdate ◴[] No.45663447{4}[source]
Which could be removed with a simple filter. em dashes require at least a little bit of code to replace with their correct grammar equivalents.
replies(3): >>45663562 #>>45664037 #>>45664901 #
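The "simple filter" mentioned above could look roughly like the sketch below. The set of characters it strips (zero-width space, non-joiner, joiner, word joiner, and the byte-order mark) is an assumed list, not an exhaustive inventory of invisible Unicode characters.

```python
import re

# Assumed set of invisible characters: U+200B, U+200C, U+200D, U+2060, U+FEFF.
ZERO_WIDTH = re.compile("[\u200b\u200c\u200d\u2060\ufeff]")


def strip_zero_width(text):
    """Remove the zero-width characters listed above, leaving visible text intact."""
    return ZERO_WIDTH.sub("", text)
```

One caveat: the zero-width joiner and non-joiner carry meaning in some scripts and in emoji sequences, so a blanket filter like this can alter legitimate text.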
10. JumpCrisscross ◴[] No.45663533[source]
> I love using em dashes

Keep using them. If someone is deducing from the use of an emdash that it's LLM produced, we've either lost the battle or they're an idiot.

More pointedly, LLMs use emdashes in particular ways. Varying spacing around the em dash and using a double dash (--) could signal human writing.

replies(3): >>45663976 #>>45664864 #>>45665501 #
11. JumpCrisscross ◴[] No.45663537[source]
> Those things are terrible; They look awful and destroy coherency of writing

Totally agree. What the fuck did Nabokov, Joyce and Dickinson know about language. /s

replies(3): >>45663542 #>>45664865 #>>45666083 #
12. landdate ◴[] No.45663542{3}[source]
Nothing. They wrote fiction.
replies(2): >>45663578 #>>45665248 #
13. JumpCrisscross ◴[] No.45663562{5}[source]
> em dashes require at least a little bit of code to replace with their correct grammar equivalents

Or an LLM that could run on Windows 98. The em dashes--like AI's other annoyingly-repetitive turns of phrase--are more likely an artefact.

14. JumpCrisscross ◴[] No.45663578{4}[source]
> Nothing

/s?

> They wrote fiction

Now do Carl Sagan and Richard Feynman.

replies(1): >>45663888 #
15. tkgally ◴[] No.45663703[source]
> I’ve been using them forever.

Many other HN contributors have, too. Here’s the pre-ChatGPT em dash leaderboard:

https://www.gally.net/miscellaneous/hn-em-dash-user-leaderbo...

replies(4): >>45664116 #>>45665032 #>>45665076 #>>45667303 #
16. ludicity ◴[] No.45663715[source]
I still use them all the time, and if someone objects to my writing over them then I've successfully avoided having to engage with a dweeb.

(But in practice, I don't think I've had a single person suggest that my writing is LLM-generated despite the presence of em-dashes, so maybe the problem isn't that bad.)

17. whitten ◴[] No.45663827{3}[source]
So if the vibes are wild, I’m not a hippie but an AI ? Cool. Is that an upgrade or &endash; or not ?
replies(1): >>45666733 #
18. landdate ◴[] No.45663888{5}[source]
I don't care for them either. What, am I supposed to hear some famous names and swoon?
replies(1): >>45664018 #
19. calvinmorrison ◴[] No.45663976[source]
it's a shibboleth. In the same way we stopped using Pepe the frog when it became associated with the far right, we may eschew em dashes when associated with compuslop
replies(1): >>45665526 #
20. prayerie ◴[] No.45664018{6}[source]
You ok there?
21. ssl-3 ◴[] No.45664037{5}[source]
The replacement doesn't have to be "correct" -- does it?
22. walkabout ◴[] No.45664116{3}[source]
This would be a pretty hilarious board for anyone who likes the em-dash and who has had many fairly active accounts (one at a time) on here due to periodically scrambling their passwords to avoid getting attached to high karma or to take occasional breaks from the site. Should there be such people.
23. pseudosavant ◴[] No.45664775[source]
Me too.

Sad that they went from being something used with nuance by people who care, maybe too much, to being the punctuation smell of the people who may care too little.

24. jdiff ◴[] No.45664864[source]
Unfortunately LLMs are pretty inconsistent in how they use em dashes. Often they will put spaces around them despite that not being "correct," something that's led me astray in making accusations of humanity in the past.
replies(1): >>45665043 #
25. eru ◴[] No.45664865{3}[source]
Their editors probably put them in?
26. eru ◴[] No.45664901{5}[source]
Just replace them with a single "-" or a double "--". That's what many people do in casual writing, even if there are prescriptive theories of grammar that call this incorrect.
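A rough sketch of that substitution follows. The function name and the choice to normalize surrounding spaces are assumptions for illustration; it makes no attempt to pick a grammatically "correct" equivalent such as a comma or colon.

```python
import re


def replace_em_dashes(text, replacement="--"):
    """Swap each em dash (U+2014) for a spaced ASCII replacement.

    Any spaces or tabs already around the em dash are collapsed so the
    result has exactly one space on each side of the replacement.
    """
    return re.sub(r"[ \t]*\u2014[ \t]*", lambda _m: f" {replacement} ", text)
```

For example, `replace_em_dashes("a\u2014b")` gives `"a -- b"`, and passing `replacement="-"` yields the single-hyphen variant.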
27. Nevermark ◴[] No.45664982{3}[source]
ME: Knowing remarkable avians — might research explain their aerial wisdom?

Response:

> Winged avians traverse endless realms — migrating across radiant kingdoms. Warblers ascend through emerald rainforests — mastering aerial routes keenly. Wild albatrosses travel enormous ranges — maintaining astonishing route knowledge.

> Wary accipiters target evasive rodents — mastering acute reflex kinetics. White arctic terns embark relentless migrations — averaging remarkable kilometers.

We do get a surprising number of m-dashes in response to mine, and delightful lyrical mirroring. But I think they are too obvious as watermarks.

Watermarks are subtle. There would be another way.

28. Ericson2314 ◴[] No.45665032{3}[source]
Can anyone make it go beyond 200? I feel like I deserve to be somewhere in there — at least I would be sad if I didn't make top 1000!
29. jachee ◴[] No.45665043{3}[source]
Depends on the style guide you’re following, apparently: The AP style guide says space around them[0]. Chicago Manual of Style says not to[1].

0: https://www.prdaily.com/dashes-hyphens-ap-style/

1: https://www.chicagomanualofstyle.org/qanda/data/faq/topics/H...

replies(2): >>45666968 #>>45667287 #
30. rileytg ◴[] No.45665076{3}[source]
i suspect it’s a trait of programmers, we like control flow type things. i used to find myself nesting parentheses…
replies(1): >>45667311 #
31. kangs ◴[] No.45665104[source]
it's not an llm thing -- it's just -- folks don't know how to use them (pun intended).

Same for ; "" vs '', ex, eg, fe, etc. and so many more.

I like em all, but I'm crazy.

replies(2): >>45666098 #>>45668810 #
32. fredoliveira ◴[] No.45665248{4}[source]
I guess I'll ask: what's wrong with fiction?
33. lxgr ◴[] No.45665501[source]
The solution is clear: Unicode needs cryptographically signed dashes and whitespace characters.
replies(2): >>45665742 #>>45667885 #
34. lxgr ◴[] No.45665526{3}[source]
I never understood why so many people would yield their symbols and language that quickly and freely to others they dislike.

In other words, I really hope typographically correct dashes are not already 70% of the way through the hyperstitious slur cascade [1]!

[1] https://www.astralcodexten.com/p/give-up-seventy-percent-of-...

replies(1): >>45666954 #
35. tietjens ◴[] No.45665728[source]
We cannot cede the em dash to LLMs.
36. easygenes ◴[] No.45665739[source]
Yeah, same. I apparently naturally have the writing style of an LLM (basically the called out quote of parent is something I could have written in terms of style). It’s irritating to change my style to not sound like AI.
37. TeMPOraL ◴[] No.45665742{3}[source]
Tied to what?

Show us a way to create a provably, cryptographically integrity-preserving chain from a person's thoughts to those thoughts expressed in a digital medium, and you may just get both the Nobel prize and a trial for crimes against humanity, for the same thing.

replies(2): >>45666066 #>>45666142 #
38. furyofantares ◴[] No.45665745[source]
I don't think you do.

All this LLM written crap is easily spottable without it. Nearly every paragraph has a heading, numerous sentences that start with one or two words of fluff then a colon then the actual statement. Excessive bullet point lists. Always telling you "here's the key insight".

But really the only damning thing is, you get a few paragraphs in and realize there's no motivation. It's just a slick infodump. No indication that another human is communicating something to you, no hard earned knowledge they want to convey, no case they're passionate about, no story they want to tell. At best, the initial prompt had that and the LLM destroyed it, but more often they asked ChatGPT so you don't have to.

I think as long as your words come from your desire to communicate something, you don't have to worry about your em-dashes.

replies(2): >>45666210 #>>45666312 #
39. ErroneousBosh ◴[] No.45665925[source]
I use them too, and there's not a trace of artificial intelligence in my posts - it's good old-fashioned analogue stupidity all through.
40. close04 ◴[] No.45666066{4}[source]
Why don't you come say that to my face?
replies(1): >>45667504 #
41. roenxi ◴[] No.45666083{3}[source]
Great writers aren't experts in the look of punctuation, I don't think anyone makes a point of you have to read Dickinson in the original font that she wrote in. Some of the greats hand-wrote their work in script that may as well be hieroglyphics, the manuscripts get preserved but not because people think the look is superior to any old typesetting which is objectively more readable.
replies(1): >>45670432 #
42. fwgijcqywqeo ◴[] No.45666098{3}[source]
crazy vibes man
43. immibis ◴[] No.45666142{4}[source]
It was a joke.
replies(1): >>45666399 #
44. mildzebrataste ◴[] No.45666210[source]
Two more tells:

1. Phrasing the negative and then switching ("x is not just this, but this and more" or "y does this not because of this, but because of this, that, and one other thing that certainly would necessitate an Oxford comma").

2. Gerunds all day every day. Constantly putting things in a passive voice so that all the verbs end in -ing.

45. latexr ◴[] No.45666312[source]
Maybe, but that doesn’t stop people on the internet (and HN is no exception) from immediately dismissing something as LLM writing just because of an em-dash, no matter how passionate the text is.
46. TeMPOraL ◴[] No.45666399{5}[source]
Ya think?
replies(1): >>45668094 #
47. ◴[] No.45666733{4}[source]
48. lazide ◴[] No.45666954{4}[source]
The alternative is… what? ‘Defending’ against the use of Em-dashes by LLMs? Or people reacting to that?

You might as well be sweeping a flood uphill.

Tilting at windmills at least has a chance you might actually damage a windmill enough to do something, even if the original goal was a complete delusion.

49. setopt ◴[] No.45666968{4}[source]
There’s also the difference between the conventional EU/UK style (spaced en-dash) vs. the common US style (unspaced em-dash).
50. matwood ◴[] No.45667267[source]
I’ve stopped using em dashes in my writing in fear it will be dismissed as LLM-generated :/
51. kragen ◴[] No.45667287{4}[source]
Thank you! I usually use THIN SPACE on each side of my em dashes (Compose Space Minus in https://github.com/kragen/xcompose ), but on HN that gets bashed to a regular space.
52. kragen ◴[] No.45667303{3}[source]
Thank you for this! Apparently I'm #4 by total em-dash uses, #14 by average em dashes per comment, and #4 at max em dashes per comment, since apparently I posted a comment containing 18 em dashes once.
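The three rankings mentioned above (total, average per comment, and maximum per comment) can be derived from per-comment counts. The sketch below shows one way to do that. How the actual leaderboard was computed is not stated in the thread, and the `(username, text)` input shape is an assumption.

```python
from collections import defaultdict


def em_dash_stats(comments):
    """Return {username: (total, average, maximum)} of em dashes per comment.

    `comments` is an iterable of (username, text) pairs (assumed shape).
    """
    counts = defaultdict(list)
    for user, text in comments:
        counts[user].append(text.count("\u2014"))  # U+2014 is the em dash
    return {
        user: (sum(c), sum(c) / len(c), max(c))
        for user, c in counts.items()
    }
```

Sorting the resulting dictionary by any one of the three tuple fields would give the corresponding ranking.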
53. kragen ◴[] No.45667311{4}[source]
Also we like text (maybe not as an inherent thing but as a selection bias) and we're more likely to have customized our keyboard setup than random people off the street.
54. kragen ◴[] No.45667323{3}[source]
I suspect it's a spandrel of some other feature of their training. Presumably em dashes occur disproportionately often in high-quality human-written text, so training LLMs to imitate high-quality human-written text instead of random IRC logs and 4chan trolls results in them also imitating high-quality typography.
replies(1): >>45677337 #
55. close04 ◴[] No.45667504{5}[source]
It was a joke that aimed too high I guess, that LLMs can't yet fake face to face interaction.
56. trollbridge ◴[] No.45667708[source]
I used to painstakingly enter an encoded emdash; now I just type two hyphens, which is something that LLMs don’t seem to want to do.
57. readmodifywrite ◴[] No.45667885{3}[source]
Finally, a use case for blockchain!
58. A4ET8a8uTh0_v2 ◴[] No.45668094{6}[source]
Honestly, these days, I am less and less sure.
59. jpt4 ◴[] No.45668810{3}[source]
> fe

Interesting, I have never encountered this initialism in the wild, to my recollection: https://en.wiktionary.org/wiki/f.e.#English

60. JumpCrisscross ◴[] No.45670432{4}[source]
> Great writers aren't experts in the look of punctuation

No, but someone arguing an entire punctuation is “terrible” and “look[s] awful and destroy[s] coherency of writing” sort of has to contend with the great writers who disagreed.

(A great writer is more authoritative than rando vibes.)

> don't think anyone makes a point of you have to read Dickinson in the original font that she wrote in

Not how reading works?

The comparison is between a simplified English summary of a novel and the novel itself.

replies(1): >>45679232 #
61. astrange ◴[] No.45677337{4}[source]
Nah, because it's new. 3.5 didn't emdash and I don't think 4 even did.

Besides, LLMs' basin of high quality text is Wikipedia.

replies(1): >>45683913 #
62. roenxi ◴[] No.45679232{5}[source]
> (A great writer is more authoritative than rando vibes.)

A great author is equivalent to rando vibes when it comes to what writing looks like, they aren't typesetting experts. I have a shelf of work by great authors (more than one, to be fair) and there are few hints on that shelf of what the text they actually wrote was intended to look like. Indeed, I wouldn't be surprised if several of them were dictated and typed by someone else completely with the mechanics of the typewriter determining some of the choices.

Shakespeare seems to have invented half the language and the man apparently couldn't even spell his own name. Now arguably he wasn't primarily a writer [0], but it is very strong evidence that there isn't a strong link between being amazing at English and technical execution of writing. That is what editors, publishers and pedants are for.

[0] Wiki disagrees though - "widely regarded as the greatest writer in the English language" - https://en.wikipedia.org/wiki/William_Shakespeare

63. kragen ◴[] No.45683913{5}[source]
Wikipedia is full of em dashes.