358 points | tkgally | 1 comment

The use of the em dash (—) now raises suspicions that a text might have been AI-generated. Inspired by a suggestion from dang [1], I created a leaderboard of HN users according to how many of their posts before November 30, 2022—that is, before the release of ChatGPT—contained em dashes. Dang himself comes in number 2—by a very slim margin.

Credit to Claude Code for showing me how to search the HN database through Google BigQuery and for writing the HTML for the leaderboard.

[1] https://news.ycombinator.com/item?id=45053933

sjs382 ◴[] No.45074990[source]
You can count your own with this snippet. Just replace my username with your own. My count before this comment was 46.

  curl -s "https://hn.algolia.com/api/v1/search?tags=comment,author_sjs382&hitsPerPage=10000" \
    | jq -r '.hits[].comment_text' \
    | grep -o "—" \
    | wc -l
1. Rendello ◴[] No.45075390[source]
This script is awesome. I checked for "—" (em), "–" (en), and "--", along with other random strings.
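
For anyone who wants to try the same multi-string check, here is a rough sketch of how it might look. The function names `count_occurrences` and `report_dashes` are made up for illustration and are not from the original snippet. It assumes a `grep` that supports `-o` (GNU and BSD grep both do, though `-o` is not in POSIX).

```shell
# count_occurrences PATTERN
# Reads text on stdin and prints how many times the literal string PATTERN
# occurs (total occurrences, not the number of lines that contain it).
# Hypothetical helper name, chosen for this example only.
count_occurrences() {
  # -o prints each match on its own line, -F treats the pattern as a fixed
  # string, and -e stops a pattern such as "--" from being read as an option.
  # tr strips the padding that some wc implementations add.
  grep -oF -e "$1" | wc -l | tr -d ' '
}

# report_dashes
# Reads text on stdin once and prints a tab-separated count for the em dash,
# the en dash, and the double hyphen. Also a hypothetical helper.
report_dashes() {
  text=$(cat)
  for pattern in "—" "–" "--"; do
    printf '%s\t%s\n' "$pattern" \
      "$(printf '%s\n' "$text" | count_occurrences "$pattern")"
  done
}
```

Piping the `jq -r '.hits[].comment_text'` output from the snippet above into `report_dashes` should print one count per pattern. Like the original, this counts occurrences rather than comments, so a comment with three em dashes adds three to the total.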