
358 points tkgally | 7 comments

The use of the em dash (—) now raises suspicions that a text might have been AI-generated. Inspired by a suggestion from dang [1], I created a leaderboard of HN users according to how many of their posts before November 30, 2022—that is, before the release of ChatGPT—contained em dashes. Dang himself comes in number 2—by a very slim margin.

Credit to Claude Code for showing me how to search the HN database through Google BigQuery and for writing the HTML for the leaderboard.

[1] https://news.ycombinator.com/item?id=45053933
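
For readers who want to try the counting step themselves, here is a minimal sketch in Python. It is not the query used for the leaderboard: the function name, the `(author, posted_at, text)` tuple layout, the sample posts, and the choice of midnight UTC as the cutoff are all assumptions made for illustration. It counts, per user, the posts made before November 30, 2022 that contain at least one em dash, then ranks users by that count.

```python
from collections import Counter
from datetime import datetime, timezone

EM_DASH = "\u2014"  # U+2014 EM DASH
# Assumed cutoff: midnight UTC on the ChatGPT release date named in the post.
CUTOFF = datetime(2022, 11, 30, tzinfo=timezone.utc)


def em_dash_leaderboard(posts, cutoff=CUTOFF):
    """Rank authors by number of pre-cutoff posts containing an em dash.

    `posts` is an iterable of (author, posted_at, text) tuples; this layout
    is hypothetical and not the schema of any real HN dataset.
    """
    counts = Counter()
    for author, posted_at, text in posts:
        # Each post counts at most once, however many em dashes it contains.
        if posted_at < cutoff and EM_DASH in text:
            counts[author] += 1
    return counts.most_common()


# Made-up sample data, purely for illustration.
sample_posts = [
    ("alice", datetime(2021, 5, 1, tzinfo=timezone.utc), "Yes\u2014exactly."),
    ("alice", datetime(2022, 1, 9, tzinfo=timezone.utc), "No\u2014not quite."),
    ("bob", datetime(2020, 3, 3, tzinfo=timezone.utc), "Hmm\u2014maybe."),
    ("bob", datetime(2023, 2, 2, tzinfo=timezone.utc), "Later\u2014too late."),
    ("carol", datetime(2019, 7, 7, tzinfo=timezone.utc), "Two hyphens -- only."),
]

leaderboard = em_dash_leaderboard(sample_posts)
for rank, (author, n) in enumerate(leaderboard, start=1):
    print(rank, author, n)
```

Counting posts rather than individual dashes matches the description above ("how many of their posts ... contained em dashes"); counting total dashes would be a different metric.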

1. userbinator ◴[] No.45071871[source]
I suspect they are generated via "autocorrect", the same way as "smart (more like stupid) quotes" and other characters that tend to cause a great deal of frustration should they find their way into source code. It would be interesting to see how many users regularly make posts containing non-ASCII characters.
replies(5): >>45071883 #>>45071891 #>>45071897 #>>45071898 #>>45072027 #
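
The non-ASCII question raised in this comment could be explored with something like the sketch below. It is only an illustration: the function names, the `(author, text)` tuple layout, the sample posts, and the threshold used to mean "regularly" are all assumptions, not anything taken from the thread.

```python
from collections import Counter


def has_non_ascii(text):
    """Return True if the text contains any character outside ASCII."""
    return not text.isascii()  # str.isascii() requires Python 3.7+


def regular_non_ascii_users(posts, min_posts=2):
    """List authors with at least `min_posts` posts containing non-ASCII text.

    `posts` is an iterable of (author, text) tuples (a hypothetical layout).
    `min_posts` is an arbitrary stand-in for "regularly".
    """
    counts = Counter(author for author, text in posts if has_non_ascii(text))
    return sorted(author for author, n in counts.items() if n >= min_posts)


# Made-up sample data, purely for illustration.
sample_posts = [
    ("a", "caf\u00e9"),                # e with acute accent
    ("a", "\u201cquoted\u201d"),       # curly ("smart") quotes
    ("b", "plain ascii"),
    ("b", "dash \u2014 here"),         # em dash
    ("c", "plain"),
]

print(regular_non_ascii_users(sample_posts))
```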
2. dang ◴[] No.45071883[source]
I'm only #2 but all mine are guaranteed hand-made, done this way: https://news.ycombinator.com/item?id=45071823
replies(1): >>45072925 #
3. db48x ◴[] No.45071891[source]
No, I modified my keymap to make typing quotes and dashes and other characters easy.
4. ◴[] No.45071897[source]
5. wiml ◴[] No.45071898[source]
I type them manually out of habit. There are a handful of other common non-ASCII marks I have muscle memory for as well.

Compose-minus-minus-minus in X

It's one of the long-press punctuation marks on Android

Option-shift-minus on Mac

6. southwindcg ◴[] No.45072027[source]
I use Autokey. I've added a bunch of occasionally-used HTML entities and Unicode characters so I don't need to go hunting for them.
7. lostlogin ◴[] No.45072925[source]
When the pre-2022 versus post-2022 stats come out, all will be revealed.