
650 points Stratoscope | 1 comments
mmooss ◴[] No.43499567[source]
Here's an easy, if not always precise, way to remember:

* Hyphens connect things, such as compound words: double-decker, cut-and-dried, 212-555-5555.

* EN dashes make a range between things: Boston–San Francisco flight, 10–20 years. In both, the dash not only connects the endpoints but also signals that all the space between is included. (Compare the last usage with the phone number example under Hyphens.)

* EM dashes break things, such as sentences or thoughts: 'What the—!'; A paragraph should express one idea—but rules are made to be broken.

Unicode has the original ASCII hyphen-minus (U+002D), as well as a dedicated hyphen (U+2010), other functional hyphens such as the soft and non-breaking hyphens, a dedicated minus sign (U+2212), and some variations of minus such as subscript and superscript.

There's also the figure dash "‒" (U+2012), essentially a hyphen-minus that's the same width as numbers and used aesthetically for typesetting, afaik. And don't overlook two-em dashes "⸺", three-em dashes "⸻", and horizontal bars "―", the latter used like quotation marks!
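
For anyone who wants to check these code points, here is a minimal Python sketch using the standard `unicodedata` module. The `DASHES` list and the `describe` helper are names made up for this illustration, and the list only covers characters mentioned in this comment, not every dash-like character in Unicode.

```python
import unicodedata

# Code points mentioned above (illustrative selection, not exhaustive).
DASHES = [
    0x002D,  # ASCII hyphen-minus
    0x00AD,  # soft hyphen
    0x2010,  # dedicated hyphen
    0x2011,  # non-breaking hyphen
    0x2012,  # figure dash
    0x2013,  # en dash
    0x2014,  # em dash
    0x2015,  # horizontal bar
    0x2212,  # minus sign
    0x2E3A,  # two-em dash
    0x2E3B,  # three-em dash
]


def describe(cp: int) -> str:
    """Return 'U+XXXX NAME' for a code point, using the Unicode character name."""
    return f"U+{cp:04X} {unicodedata.name(chr(cp))}"


for cp in DASHES:
    print(describe(cp))
```

Running it prints one line per character, which makes it easy to tell which dash a pasted snippet actually contains.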

replies(12): >>43499795 #>>43500096 #>>43500276 #>>43500389 #>>43500958 #>>43501074 #>>43502495 #>>43503176 #>>43504564 #>>43507109 #>>43512927 #>>43570687 #
lxgr ◴[] No.43500276[source]
> EM dashes break things, such as sentences or thoughts

Some style guides recommend "space, en dash, space" for this, and I prefer that myself – mainly because some software doesn't treat em dashes correctly as word separators for double click selection purposes.

For example, I'm pretty sure that at least some Kindle models would highlight both the word before and after the em dash when selecting one of them, which makes using the dictionary very annoying.

replies(7): >>43500598 #>>43501460 #>>43501482 #>>43501556 #>>43501772 #>>43503947 #>>43503958 #
opello ◴[] No.43501772[source]
> Some style guides recommend "space, en dash, space" for this

The last paragraph of the article also addressed the subjective nature of spacing around the em dash:

> Spacing around an em dash varies. Most newspapers insert a space before and after the dash, and many popular magazines do the same, but most books and journals omit spacing, closing whatever comes before and after the em dash right up next to it.

As far as the selection detail, did you mean that you replace an em dash used like a comma or parenthesis with spaces and an en dash for specific highlight performance issues? Surely the spaces and an em dash would alleviate the selection highlight behavior and not muddy the waters of when to use an em vs. an en dash?

replies(1): >>43505664 #
JadeNB ◴[] No.43505664[source]
> Spacing around an em dash varies. Most newspapers insert a space before and after the dash, and many popular magazines do the same, but most books and journals omit spacing, closing whatever comes before and after the em dash right up next to it.

It's funny that they omit to mention the possibility of setting it off with a thin space ' ' or hair space ' ' (those are the thin-space and hair-space Unicode characters, though they show up full width for me), which I thought was preferred typographic practice.

(On Googling, maybe the reason that they don't mention it is that I was imagining it; I can't find any evidence for my belief.)

replies(1): >>43506827 #
opello ◴[] No.43506827[source]
> those are the thin-space and hair-space Unicode characters, though they show up full width for me

Interestingly, at least in my browser, and when grabbing the direct link to the comment with curl, both show up as the byte 0x20. Perhaps the comment submission handler, or even the browser, collapsed your more specific U+2009 (thin) and U+200A (hair) spaces into the regular U+0020 space?
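
A rough Python sketch of what such a collapse could look like, purely as an assumption: `collapse_spaces` is a hypothetical helper, not HN's actual code, and it simply maps every Unicode space separator (general category `Zs`) to U+0020. It also shows the UTF-8 bytes you would expect to see if the thin and hair spaces had survived.

```python
import unicodedata


def collapse_spaces(text: str) -> str:
    """Hypothetical normalizer: replace every space separator (category Zs) with U+0020."""
    return "".join(" " if unicodedata.category(ch) == "Zs" else ch for ch in text)


# Thin space (U+2009) before an em dash, hair space (U+200A) after it.
sample = "word\u2009\u2014\u200aword"

# Bytes as typed: the thin and hair spaces are three bytes each in UTF-8.
print(sample.encode("utf-8").hex(" "))

# Bytes after the hypothetical collapse: both spaces become a single 0x20.
print(collapse_spaces(sample).encode("utf-8").hex(" "))
```

If the served page only ever shows 0x20 where the special spaces were typed, something along these lines happened somewhere between submission and display, though this sketch can't say where.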

replies(1): >>43507608 #
1. JadeNB ◴[] No.43507608[source]
> Interestingly, at least in my browser, and when grabbing the direct link to the comment with curl, both show up as the byte 0x20. Perhaps the comment submission handler, or even the browser, collapsed your more specific U+2009 (thin) and U+200A (hair) spaces into the regular U+0020 space?

Probably! I think HN strips out emoji; maybe it just takes the safest approach and strips out all non-white-listed Unicode.