
358 points | tkgally | 3 comments

The use of the em dash (—) now raises suspicions that a text might have been AI-generated. Inspired by a suggestion from dang [1], I created a leaderboard of HN users according to how many of their posts before November 30, 2022—that is, before the release of ChatGPT—contained em dashes. Dang himself comes in number 2—by a very slim margin.

Credit to Claude Code for showing me how to search the HN database through Google BigQuery and for writing the HTML for the leaderboard.

[1] https://news.ycombinator.com/item?id=45053933

IAmGraydon ◴[] No.45071916[source]
I guess I’m confused. Why is it interesting to know how many em dashes were used before the dawn of ChatGPT? It’s how many AFTER that seems like it would be far more interesting.
replies(4): >>45071977 #>>45071990 #>>45071991 #>>45072503 #
1. southwindcg ◴[] No.45071977[source]
Some people accuse anyone who uses em dashes of using ChatGPT to write their posts. This is "proof" that actual humans use em dashes.
replies(1): >>45072690 #
2. vntok ◴[] No.45072690[source]
Books and the like are proof that actual humans use em dashes; that was never the contention.

What's needed is a comparison of each user's writing before and after 2022. If there's a sudden 200% increase in the use of em dashes from one month to the next, it's a very strong indicator that the user started LLMing their posts.
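
A rough sketch of that comparison, in Python. Everything here is an assumption for illustration: the function names are hypothetical, the input is taken to be a list of (date, text) pairs for one user, the rate is normalized per 1,000 characters so that longer posts don't inflate the count, and the cutoff is the November 30, 2022 date from the original post rather than a month-to-month window.

```python
from datetime import date

EM_DASH = "\u2014"
CUTOFF = date(2022, 11, 30)  # ChatGPT release date cited in the original post


def em_dash_rate(comments):
    """Em dashes per 1,000 characters across a list of comment strings."""
    total_chars = sum(len(text) for text in comments)
    if total_chars == 0:
        return 0.0
    total_dashes = sum(text.count(EM_DASH) for text in comments)
    return 1000 * total_dashes / total_chars


def percent_increase(before, after):
    """Percentage change from before to after; None if before is zero."""
    if before == 0:
        return None  # undefined: the user never used em dashes before
    return 100 * (after - before) / before


def compare_user(dated_comments):
    """Split one user's (date, text) pairs at CUTOFF and compare rates."""
    before = [text for day, text in dated_comments if day < CUTOFF]
    after = [text for day, text in dated_comments if day >= CUTOFF]
    rate_before = em_dash_rate(before)
    rate_after = em_dash_rate(after)
    return rate_before, rate_after, percent_increase(rate_before, rate_after)
```

A user whose rate goes from 10 to 30 em dashes per 1,000 characters would show the 200% increase mentioned above. A user with no em dashes before the cutoff gets None, since a percentage increase from zero is undefined.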

replies(1): >>45078852 #
3. southwindcg ◴[] No.45078852[source]
Perhaps I should have qualified that humans use them in casual writing, website comments and the like, and not just in formal, published works that probably had an editor.