
358 points | tkgally | 1 comment

The use of the em dash (—) now raises suspicions that a text might have been AI-generated. Inspired by a suggestion from dang [1], I created a leaderboard of HN users according to how many of their posts before November 30, 2022—that is, before the release of ChatGPT—contained em dashes. Dang himself comes in number 2—by a very slim margin.

Credit to Claude Code for showing me how to search the HN database through Google BigQuery and for writing the HTML for the leaderboard.

[1] https://news.ycombinator.com/item?id=45053933
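
For anyone who wants to reproduce the counting step, here is a minimal Python sketch of the leaderboard logic. It is not the BigQuery query actually used for the post; the function name, the tuple layout, and the sample rows are all made up for illustration, and it assumes the comments have already been exported as (author, timestamp, text) records.

```python
from collections import Counter
from datetime import datetime, timezone

EM_DASH = "\u2014"  # the em dash character
# Cutoff taken from the post: only count posts before November 30, 2022.
CUTOFF = datetime(2022, 11, 30, tzinfo=timezone.utc)


def em_dash_leaderboard(comments, top_n=10):
    """Rank authors by how many of their pre-cutoff posts contain an em dash.

    `comments` is an iterable of (author, posted_at, text) tuples; this
    layout is a hypothetical one, not the real HN dataset schema.
    """
    counts = Counter()
    for author, posted_at, text in comments:
        if posted_at < CUTOFF and EM_DASH in text:
            counts[author] += 1  # one point per post, not per dash
    return counts.most_common(top_n)


# Invented sample data, for illustration only.
sample = [
    ("alice", datetime(2021, 5, 1, tzinfo=timezone.utc), "Yes\u2014exactly."),
    ("alice", datetime(2022, 1, 9, tzinfo=timezone.utc), "Hmm\u2014maybe."),
    ("bob", datetime(2020, 3, 2, tzinfo=timezone.utc), "No dash here - just a hyphen."),
    ("bob", datetime(2019, 7, 4, tzinfo=timezone.utc), "One\u2014dash."),
    ("carol", datetime(2023, 2, 1, tzinfo=timezone.utc), "Too late\u2014after the cutoff."),
]

leaderboard = em_dash_leaderboard(sample)
print(leaderboard)
```

The sketch counts posts rather than individual dashes, which matches the wording "how many of their posts ... contained em dashes" above.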

1. ks2048 No.45078599
A related question - if you fed each comment into an LLM and asked it to classify it as one of {human-produced, llm-produced, not-sure}, how many would it think are from LLMs? How could you investigate the true answer?
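
One possible way to probe this, sketched in Python: run the classifier only on comments posted before the ChatGPT release and treat every "llm-produced" verdict as a false positive. This rests on the assumption that those older comments are all human-written, which is not guaranteed. The `toy_classifier` below is a hypothetical stand-in, not a real LLM call; a real experiment would swap in an actual model.

```python
from collections import Counter

LABELS = ("human-produced", "llm-produced", "not-sure")


def tally_labels(comments, classify):
    """Count how often `classify` returns each label over `comments`."""
    counts = Counter({label: 0 for label in LABELS})
    for text in comments:
        label = classify(text)
        if label not in LABELS:
            raise ValueError(f"unexpected label: {label!r}")
        counts[label] += 1
    return counts


def false_positive_rate(counts):
    """Share of comments labeled llm-produced.

    Only a false-positive rate under the assumption that every input
    comment was written by a human (e.g. posted before ChatGPT existed).
    """
    total = sum(counts.values())
    return counts["llm-produced"] / total if total else 0.0


def toy_classifier(text):
    # Hypothetical placeholder for an LLM: flags any em dash as machine text.
    if "\u2014" in text:
        return "llm-produced"
    if len(text) < 10:
        return "not-sure"
    return "human-produced"


# Invented examples standing in for pre-ChatGPT comments.
pre_chatgpt = [
    "Yes\u2014exactly what I meant.",
    "A plain comment with no dash.",
    "ok",
    "Another ordinary human comment.",
]

counts = tally_labels(pre_chatgpt, toy_classifier)
fpr = false_positive_rate(counts)
print(dict(counts), fpr)
```

This only bounds one kind of error. It says nothing about how many genuinely LLM-written comments the classifier would miss, which would need a separate set of known machine-generated text.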