358 points tkgally | 9 comments

The use of the em dash (—) now raises suspicions that a text might have been AI-generated. Inspired by a suggestion from dang [1], I created a leaderboard of HN users according to how many of their posts before November 30, 2022—that is, before the release of ChatGPT—contained em dashes. Dang himself comes in number 2—by a very slim margin.

Credit to Claude Code for showing me how to search the HN database through Google BigQuery and for writing the HTML for the leaderboard.

[1] https://news.ycombinator.com/item?id=45053933
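
A rough sketch of the counting step described above, assuming the posts have already been exported (for instance from BigQuery) as `(author, posted_at, text)` tuples. The function name, the tuple layout, and the sample data are all hypothetical; this is not the query actually used to build the leaderboard, just one way the tally could work.

```python
from collections import Counter
from datetime import datetime, timezone

EM_DASH = "\u2014"  # U+2014 EM DASH
# Cutoff taken from the post: November 30, 2022 (ChatGPT release). UTC is an assumption.
CUTOFF = datetime(2022, 11, 30, tzinfo=timezone.utc)


def em_dash_leaderboard(posts, cutoff=CUTOFF, top=10):
    """Rank authors by the number of pre-cutoff posts that contain an em dash.

    `posts` is an iterable of (author, posted_at, text) tuples, where
    `posted_at` is a timezone-aware datetime. This layout is assumed.
    """
    counts = Counter()
    for author, posted_at, text in posts:
        # Count each qualifying post once, however many em dashes it contains.
        if posted_at < cutoff and EM_DASH in text:
            counts[author] += 1
    return counts.most_common(top)


# Made-up sample data for illustration only.
sample_posts = [
    ("alice", datetime(2019, 5, 1, tzinfo=timezone.utc), f"Fine{EM_DASH}mostly."),
    ("alice", datetime(2021, 3, 9, tzinfo=timezone.utc), f"Agreed{EM_DASH}entirely."),
    ("alice", datetime(2023, 1, 2, tzinfo=timezone.utc), f"Too late{EM_DASH}ignored."),
    ("bob", datetime(2020, 7, 4, tzinfo=timezone.utc), "En dash only \u2013 no match."),
    ("bob", datetime(2018, 2, 2, tzinfo=timezone.utc), f"One{EM_DASH}hit."),
]

leaderboard = em_dash_leaderboard(sample_posts)
print(leaderboard)
```

Counting posts rather than individual dashes matches the wording of the post ("how many of their posts ... contained em dashes"); counting occurrences instead would only need `text.count(EM_DASH)` in place of the membership test.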

1. rasse ◴[] No.45072058[source]
How about en dash usage? Has that been used as a similar false indicator?
replies(2): >>45072074 #>>45072329 #
2. ◴[] No.45072074[source]
3. thomasm6m6 ◴[] No.45072329[source]
OpenAI’s o3 was big on en dashes—one time it produced a Deep Research result containing >200 of them. I’m not aware of any other LLM using them commonly, though. I’d guess humans use them even less often; I don’t think Apple auto-inserts en dashes, and very few people (myself being one) are pedantic enough to bother.

On the other hand, I don’t think o3 was ever a common choice among people copying from LLMs, so en dashes remain infrequent regardless.

replies(2): >>45072593 #>>45073510 #
4. aspect0545 ◴[] No.45072593[source]
In German, en dashes are more common than em dashes. I’ve been using them regularly for at least 20 years, both in German and English texts. I never liked it when people just threw in an ordinary hyphen instead of an en dash, but few people note the difference.
replies(1): >>45072834 #
5. JimDabell ◴[] No.45072834{3}[source]
Yes, this is regional – British usage tends to be an en dash surrounded by spaces, where American usage tends to be an em dash with no spaces.
replies(1): >>45072913 #
6. lostlogin ◴[] No.45072913{4}[source]
All this has me thinking. Is the em-dash like an accent for machines?
replies(1): >>45073148 #
7. JimDabell ◴[] No.45073148{5}[source]
I’m not sure about accent, but I have described their intense overuse of certain things as a verbal tic before.
8. ascorbic ◴[] No.45073510[source]
They're very easy to type on a Mac though (opt+-). I've always used spaced en dashes without realising that that is the more common British style. Unspaced em dashes just look wrong to me.
replies(1): >>45075277 #
9. rectang ◴[] No.45075277{3}[source]
Unspaced em dashes look wrong to me too in most web contexts, but I think it’s typography-dependent, and they look good in serif text when very large and heavy compared to other elements.