650 points Stratoscope | 8 comments
mmooss ◴[] No.43499567[source]
Here's an easy, if not always precise, way to remember:

* Hyphens connect things, such as compound words: double-decker, cut-and-dried, 212-555-5555.

* EN dashes make a range between things: Boston–San Francisco flight, 10–20 years. Both not only connect the endpoints but also signal that everything in between is included. (Compare the last usage with the phone number example under Hyphens.)

* EM dashes break things, such as sentences or thoughts: 'What the—!'; A paragraph should express one idea—but rules are made to be broken.

Unicode has the original ASCII hyphen-minus (U+002D), a dedicated hyphen (U+2010), other functional hyphens such as soft and non-breaking hyphens, a dedicated minus sign (U+2212), and some variations of minus such as subscript and superscript.

There's also the figure dash "‒" (U+2012), essentially a hyphen-minus that's the same width as digits and used aesthetically for typesetting, afaik. And don't overlook two-em dashes "⸺" and three-em dashes "⸻" and horizontal bars "―", the latter used like quotation marks!
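
If you want to check these for yourself, here's a small sketch using Python's standard `unicodedata` module to print the official name and general category of each character mentioned above. The `DASHES` list and the `describe` helper are my own names for illustration; the en dash (U+2013) and em dash (U+2014) are added for reference.

```python
import unicodedata

# Code points named in the comment above, plus the en and em dash.
DASHES = [
    0x002D,  # ASCII hyphen-minus
    0x2010,  # dedicated hyphen
    0x2011,  # non-breaking hyphen
    0x00AD,  # soft hyphen
    0x2212,  # minus sign
    0x2012,  # figure dash
    0x2013,  # en dash
    0x2014,  # em dash
    0x2015,  # horizontal bar
    0x2E3A,  # two-em dash
    0x2E3B,  # three-em dash
]

def describe(cp):
    """Return 'U+XXXX NAME (category XX)' for one code point."""
    ch = chr(cp)
    return f"U+{cp:04X} {unicodedata.name(ch)} (category {unicodedata.category(ch)})"

for cp in DASHES:
    print(describe(cp))
```

Most of these fall in category `Pd` (dash punctuation), but the minus sign is `Sm` (math symbol) and the soft hyphen is `Cf` (format), which is one reason software can treat them differently.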

replies(12): >>43499795 #>>43500096 #>>43500276 #>>43500389 #>>43500958 #>>43501074 #>>43502495 #>>43503176 #>>43504564 #>>43507109 #>>43512927 #>>43570687 #
lxgr ◴[] No.43500276[source]
> EM dashes break things, such as sentences or thoughts

Some style guides recommend "space, en dash, space" for this, and I prefer that myself – mainly because some software doesn't treat em dashes correctly as word separators for double click selection purposes.

For example, I'm pretty sure that at least some Kindle models would highlight both the word before and after the em dash when selecting one of them, which makes using the dictionary very annoying.

replies(7): >>43500598 #>>43501460 #>>43501482 #>>43501556 #>>43501772 #>>43503947 #>>43503958 #
krick ◴[] No.43503947[source]
It's actually only your post that made me realize people don't normally put spaces around em dash. In French, Russian and a bunch of other languages proper typesetting is to use em dash as a standard dash character, and you always put spaces around them. So I did it in English as well, for many years now.

(I also now looked up and found out that in Spanish, apparently, you are supposed to put space only on one side of the dash, when used as a direct speech separator.)

replies(3): >>43505058 #>>43506008 #>>43508474 #
rmunn ◴[] No.43505058[source]
I also put spaces around em dashes. It looks wrong—subtly wrong—to me to have the words glued together around the dash. It looks right — completely right — to me to have the dash standing on its own, as if it was a word in its own right.
replies(4): >>43505363 #>>43505552 #>>43509146 #>>43513256 #
1. tines ◴[] No.43505552[source]
The reason not to do this is observable in your post on my phone. The spaces cause the word wrapping algorithm to leave a dangling dash at the end of the line which looks ugly. Omitting spaces prevents the word break.
replies(6): >>43505675 #>>43505687 #>>43505892 #>>43505903 #>>43508537 #>>43509463 #
2. ◴[] No.43505675[source]
3. hansvm ◴[] No.43505687[source]
Funny, I'd rather have the break at the start or end of the emdash-implied break than just before or after it, not having to mentally handle some single dangling word divorced from its compatriots.
4. rmunn ◴[] No.43505892[source]
I mentioned that as an advantage in one of my other comments. An advantage both ways, because it depends on preference. I have the same preference as hansvm: I would rather see the dangling dash at the end of the line, so I prefer putting spaces around the dashes. Having the entire word-dash-word structure move to the next line feels ugly to me. As with most things, de gustibus non est disputandum. (And also, quidquid Latine dictum sit altum videtur).
replies(1): >>43507004 #
5. da_chicken ◴[] No.43505903[source]
Ironically, on my phone the only line that ends with an em dash has no spaces in it.

If you want to not have a line break, you shouldn't rely on arbitrary behavior. You should use non-breaking characters like non-breaking spaces and word joiners.
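
A rough sketch of what that could look like in Python. The helper names are made up for illustration, and whether a given renderer actually honors these characters depends on its line-breaking implementation. U+2060 is the word joiner and U+00A0 is the no-break space.

```python
WORD_JOINER = "\u2060"  # zero-width; forbids a line break on either side of it
NBSP = "\u00A0"         # no-break space
EM_DASH = "\u2014"

def glue_em_dashes(text):
    """Closed-up style: surround each em dash with word joiners so a
    conforming line breaker should not break before or after it."""
    return text.replace(EM_DASH, WORD_JOINER + EM_DASH + WORD_JOINER)

def keep_dash_with_previous_word(text):
    """Spaced style: swap the space before an em dash for a no-break
    space, so the dash may end a line but should never start one."""
    return text.replace(" " + EM_DASH, NBSP + EM_DASH)

print(repr(glue_em_dashes("foo\u2014bar")))
print(repr(keep_dash_with_previous_word("right \u2014 completely right")))
```

The first helper suits people who don't want the dash separated from either neighbor; the second suits people who are fine with a dash dangling at the end of a line but not at the start of the next one.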

6. chipotle_coyote ◴[] No.43507004[source]
It's the dangling dash at the beginning of the line that gets me. I see a lot of word break algorithms, including the one WebKit (and I suspect Blink) uses, which are happy to break "foo—bar" on either side of the em dash.
7. mmooss ◴[] No.43508537[source]
> The reason not to do this is observable in your post on my phone. The spaces cause the word wrapping algorithm to leave a dangling dash at the end of the line which looks ugly. Omitting spaces prevents the word break.

That's an interesting practicality but I don't think it's the cause of the rule: The rule probably long predates automated line breaking. Also, I think automatic line breaking will break compound words at the hyphen; it doesn't require spaces (which is also obvious from a software development point of view: the logic is relatively simple either way):

  Lorem ipsum dolor sit amet, consectetur adipiscing double-
  decker lorem ipsum dolor sit amet, consectetur ...
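
As one concrete data point, Python's standard `textwrap` module does this by default through its `break_on_hyphens` option. The sketch below is only an illustration of that one library, not a claim about how browsers or e-readers wrap text; the `wrap` helper and `TEXT` constant are names I made up.

```python
import textwrap

TEXT = ("Lorem ipsum dolor sit amet, consectetur adipiscing "
        "double-decker lorem ipsum dolor sit amet, consectetur")

def wrap(text, width=60, break_on_hyphens=True):
    """Wrap text to the given width, optionally allowing a break
    after the hyphen inside a compound word."""
    return textwrap.wrap(text, width=width, break_on_hyphens=break_on_hyphens)

print("Breaking on hyphens (the default):")
for line in wrap(TEXT):
    print("  " + line)

print("Keeping compound words whole:")
for line in wrap(TEXT, break_on_hyphens=False):
    print("  " + line)
```

With the default setting the first line ends in "double-" as in the example above; with `break_on_hyphens=False` the whole compound moves to the next line and the first line comes up short.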
8. lxgr ◴[] No.43509463[source]
Preventing the word break doesn't seem very desirable, especially if it causes a large gap.