
358 points tkgally | 2 comments

The use of the em dash (—) now raises suspicions that a text might have been AI-generated. Inspired by a suggestion from dang [1], I created a leaderboard of HN users according to how many of their posts before November 30, 2022—that is, before the release of ChatGPT—contained em dashes. Dang himself comes in number 2—by a very slim margin.

Credit to Claude Code for showing me how to search the HN database through Google BigQuery and for writing the HTML for the leaderboard.

[1] https://news.ycombinator.com/item?id=45053933
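
For readers curious how such a tally could work, here is a minimal Python sketch of the counting described above. It is not the BigQuery query behind the leaderboard, which the post does not show. The function name `em_dash_leaderboard`, the `(author, posted_at, text)` tuple layout, and the sample posts are all hypothetical, and the cutoff is assumed to be midnight UTC on November 30, 2022. The sketch matches only the Unicode em dash (U+2014), so ASCII stand-ins such as "---" are not counted.

```python
from collections import Counter
from datetime import datetime, timezone

EM_DASH = "\u2014"  # Unicode em dash, U+2014
# Assumption: "before November 30, 2022" means strictly before midnight UTC.
CUTOFF = datetime(2022, 11, 30, tzinfo=timezone.utc)


def em_dash_leaderboard(posts, cutoff=CUTOFF):
    """Rank authors by how many of their pre-cutoff posts contain an em dash.

    `posts` is an iterable of (author, posted_at, text) tuples, a
    hypothetical layout chosen for this sketch.
    """
    counts = Counter()
    for author, posted_at, text in posts:
        # Count each qualifying post once, however many em dashes it has.
        if posted_at < cutoff and EM_DASH in text:
            counts[author] += 1
    # most_common() sorts by count, highest first.
    return counts.most_common()


# Hypothetical sample data, for illustration only.
SAMPLE_POSTS = [
    ("alice", datetime(2021, 5, 1, tzinfo=timezone.utc), "Yes\u2014exactly."),
    ("alice", datetime(2022, 1, 9, tzinfo=timezone.utc), "No\u2014not quite."),
    ("bob", datetime(2020, 3, 3, tzinfo=timezone.utc), "Well\u2014maybe."),
    ("bob", datetime(2023, 2, 2, tzinfo=timezone.utc), "Later\u2014too late."),
    ("carol", datetime(2019, 7, 7, tzinfo=timezone.utc), "TeX habit --- here."),
]

for author, n in em_dash_leaderboard(SAMPLE_POSTS):
    print(author, n)
```

Counting posts rather than individual dashes follows the wording of the post ("how many of their posts ... contained em dashes"); a per-character count would instead use `text.count(EM_DASH)`.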

tptacek No.45071905
The em-dash giveaway is an actual Unicode em-dash character, right? I had to learn LaTeX professionally to write a paper in the 1990s and have had a "---" habit ever since, and I've been wondering if that's some kind of weird LLM tell now.
replies(3): >>45071910 #>>45071948 #>>45072345 #
f33d5173 No.45071910
It's more the style of setting up contrasts that's the real LLM tell. That they happen to use a typographic mark that most people don't know how to type is just fuel on the fire.
replies(4): >>45072153 #>>45072298 #>>45072695 #>>45073079 #
1. londons_explore No.45072298
Anyone who types in MS Word for the improved spell checker and then copies their comment to a browser will automatically get hyphens changed to em-dashes.
replies(1): >>45074234 #
2. layer8 No.45074234
This is configurable and can be turned off.