650 points Stratoscope | 62 comments
mmooss ◴[] No.43499567[source]
Here's an easy, if not always precise, way to remember:

* Hyphens connect things, such as compound words: double-decker, cut-and-dried, 212-555-5555.

* EN dashes make a range between things: Boston–San Francisco flight, 10–20 years. They not only connect the endpoints but also indicate that everything in between is included. (Compare the last usage with the phone number example under Hyphens.)

* EM dashes break things, such as sentences or thoughts: 'What the—!'; A paragraph should express one idea—but rules are made to be broken.

Unicode has the original ASCII hyphen-minus (U+002D), a dedicated hyphen (U+2010), other functional hyphens such as soft and non-breaking hyphens, a dedicated minus sign (U+2212), and some variations of minus such as subscript and superscript.

There's also the figure dash "‒" (U+2012), essentially a hyphen-minus that's the same width as numbers and used aesthetically for typesetting, afaik. And don't overlook two-em-dashes "⸺" and three-em-dashes "⸻" and horizontal bars "―", the latter used like quotation marks!
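For anyone who wants to check these code points, here is a minimal Python sketch using the standard `unicodedata` module. The selection of characters and the helper name `dash_table` are my own choices for illustration, and the names printed depend on the Unicode database bundled with your Python version.

```python
import unicodedata

# Dash-like characters mentioned above, written as escapes so the
# source stays unambiguous regardless of font or editor.
DASH_LIKE = [
    "\u002D",  # ASCII hyphen-minus
    "\u00AD",  # soft hyphen
    "\u2010",  # dedicated hyphen
    "\u2011",  # non-breaking hyphen
    "\u2012",  # figure dash
    "\u2013",  # en dash
    "\u2014",  # em dash
    "\u2015",  # horizontal bar
    "\u2212",  # minus sign
    "\u2E3A",  # two-em dash
    "\u2E3B",  # three-em dash
]


def dash_table(chars=DASH_LIKE):
    """Return (code point, official Unicode name) pairs for each character."""
    return [(f"U+{ord(c):04X}", unicodedata.name(c)) for c in chars]


for code, name in dash_table():
    print(code, name)
```

The official names are the easiest way to tell these apart, since many fonts render several of them almost identically.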

replies(12): >>43499795 #>>43500096 #>>43500276 #>>43500389 #>>43500958 #>>43501074 #>>43502495 #>>43503176 #>>43504564 #>>43507109 #>>43512927 #>>43570687 #
1. lxgr ◴[] No.43500276[source]
> EM dashes break things, such as sentences or thoughts

Some style guides recommend "space, en dash, space" for this, and I prefer that myself – mainly because some software doesn't treat em dashes correctly as word separators for double click selection purposes.

For example, I'm pretty sure that at least some Kindle models would highlight both the word before and after the em dash when selecting one of them, which makes using the dictionary very annoying.

replies(7): >>43500598 #>>43501460 #>>43501482 #>>43501556 #>>43501772 #>>43503947 #>>43503958 #
2. rahimnathwani ◴[] No.43500598[source]
I grew up in the UK, and have always used space, minus, space.

The first keyboard I used was my dad's typewriter, and I don't recall it having any 'dash' other than the minus sign.

replies(4): >>43501463 #>>43503229 #>>43503316 #>>43504777 #
3. KPGv2 ◴[] No.43501460[source]
> Some style guides recommend "space, en dash, space" for this

Which one does that? I threw up a little in my mouth and wish to avoid such style guides in the future!

replies(2): >>43501516 #>>43501534 #
4. KPGv2 ◴[] No.43501463[source]
space, minus, space is on the same level as manually typing two spaces after a period
replies(2): >>43501539 #>>43501579 #
5. mmooss ◴[] No.43501482[source]
The AP Style Manual, a/the leading source for US journalism at least, says

  <word> <space> <dash> <space> <word>
Outside of journalism, usually there is no padding, only,

  <word> <dash> <word>
I'm with you: For searches, the spaces make the words easier to parse. Those rules predate computers, I would guess.
replies(2): >>43501525 #>>43501561 #
6. mmooss ◴[] No.43501516[source]
https://news.ycombinator.com/item?id=43501482
7. lxgr ◴[] No.43501525[source]
> <word> <dash> <word>

That one I’d usually parse as a hyphen, as in e.g. well-known. “Word space dash space word” is much clearer, in my view.

> The AP Style Manual, a/the leading source for US journalism

One of the things I can easily get away with by not being a US journalist :)

replies(1): >>43502215 #
8. lxgr ◴[] No.43501534[source]
Better avoid British journalism then, and many other languages on top of that.

It’s very common outside of America, even in English.

9. lxgr ◴[] No.43501539{3}[source]
How so? One is the only way to approximate an en or em dash on a typewriter or in a charset that doesn’t have one, the other seems like a workaround of a typesetting bug at best.
replies(1): >>43503339 #
10. ◴[] No.43501556[source]
11. mattl ◴[] No.43501561[source]
Chicago Manual of Style has no spaces, so there’s some variation at least.
replies(1): >>43501741 #
12. rahimnathwani ◴[] No.43501579{3}[source]
Until ~10 years ago, I used to type two spaces after a period.
replies(2): >>43501836 #>>43507593 #
13. mmooss ◴[] No.43501741{3}[source]
CMOS is not journalism, so it's not variation from the GP?
replies(1): >>43501936 #
14. opello ◴[] No.43501772[source]
> Some style guides recommend "space, en dash, space" for this

The last paragraph of the article also addressed the subjective nature of spacing around the em dash:

> Spacing around an em dash varies. Most newspapers insert a space before and after the dash, and many popular magazines do the same, but most books and journals omit spacing, closing whatever comes before and after the em dash right up next to it.

As far as the selection detail, did you mean that you replace an em dash used like a comma or parenthesis with spaces and an en dash for specific highlight performance issues? Surely the spaces and an em dash would alleviate the selection highlight behavior and not muddy the waters of when to use an em vs. an en dash?

replies(1): >>43505664 #
15. Daneel_ ◴[] No.43501836{4}[source]
I still do, and I maintain that it’s easier to read text with double spaces after periods.
replies(2): >>43502768 #>>43504381 #
16. mattl ◴[] No.43501936{4}[source]
A wider number of people use either of them. Every place I’ve worked has used CMOS, which I now use with others.
replies(1): >>43502338 #
17. stouset ◴[] No.43502215{3}[source]
It’s quite hard to mistake an em dash for a hyphen in a proportional font.

self-fulfilling

self—fulfilling

One of these looks very, very wrong.

replies(1): >>43502255 #
18. johnisgood ◴[] No.43502255{4}[source]
I agree, although I still prefer spaces between —.
19. ghaff ◴[] No.43502338{5}[source]
Company I used to work for used AP for things like press releases and, I think, official blog posts and Chicago plus a couple different tech style guides for everything else.

Basically, we didn’t like some things in AP but we wanted to make it easy for journalists to copy/paste.

20. _emacsomancer_ ◴[] No.43502768{5}[source]
TeX puts more space after periods/fullstops (which is why you're supposed to do special markup or other measures to mark '.' in the middle of sentences which aren't sentence-enders (e.g. like e.g.)). But it's generally smaller than the equivalent of two manual spaces.

(A nice thing in (La)TeX is that one could follow the "two spaces after a full-stop" rule, which then has the advantage of being an explicit marking for sentence boundaries (which your editor might be able to navigate; Emacs has a convention of assuming two spaces after a sentence-ending '.'), but then the TeX typesetting will take care of making it look right. I lost the habit of actually doing this, for better or worse, except when flycheck/checkdoc/package-linter.el makes me do it for docstrings.)

21. robin_reala ◴[] No.43503229[source]
en-US style is a single em-dash. en-GB style is a single en-dash with spaces on either side.
22. Propelloni ◴[] No.43503316[source]
I was under the impression that you do "-" for hyphen, "--" for En dash, and "---" for Em dash. IIRC, LaTeX (or maybe the editor, it has been some time) even helpfully changes that for you to the correct dash.
replies(2): >>43505543 #>>43505711 #
23. Propelloni ◴[] No.43503339{4}[source]
-, --, --- is, IIRC, how it is done in LaTeX and would be exceedingly simple to do on a typewriter. That being said, to break up sentences I use " -- " because I think it looks nicer than "---". I'll go now ;)
replies(1): >>43504419 #
24. krick ◴[] No.43503947[source]
It's actually only your post that made me realize people don't normally put spaces around em dash. In French, Russian and a bunch of other languages proper typesetting is to use em dash as a standard dash character, and you always put spaces around them. So I did it in English as well, for many years now.

(I also now looked up and found out that in Spanish, apparently, you are supposed to put space only on one side of the dash, when used as a direct speech separator.)

replies(3): >>43505058 #>>43506008 #>>43508474 #
25. cyrillite ◴[] No.43503958[source]
I have been doing this for purely aesthetic reasons my whole life. Style guides be damned, I hate connected em dashes.
replies(1): >>43504460 #
26. globnomulous ◴[] No.43504381{5}[source]
I used to feel similarly. Now I find the double space a visual distraction that doesn't in any way improve readability.

The effect of the double space is, I suspect, a product of the reader's expectations: if you expect it, its absence creates mental work, detracting from readability; if you don't expect it, its presence is what creates mental work.

27. lxgr ◴[] No.43504419{5}[source]
LaTeX is a markup language though, not ASCII art. I can get behind two dashes as a substitute if no en dash is available, but three seems too much and looks like halfway to a horizontal line to me ;)
28. lxgr ◴[] No.43504460[source]
The good thing about style guides is that they’re guides, not laws :)

That’s one thing I really like about English: There’s no central authority decreeing what’s right and what’s wrong top down, and it feels like there is some room for individual preferences and experimentation.

Very refreshing, compared to e.g. German, which has more than one semi-official authority gate keeping “correctness” in speech and writing.

replies(1): >>43508941 #
29. Finnucane ◴[] No.43504777[source]
British typesetting style is a little different from US style in the way dashes are presented. In the UK, you might see a thin-space--en-dash---thin-space where a US typesetter would use an em-dash. Typewriter style generally follows book style. Since typesetters no longer use an extra space after punctuation, it's vestigial in typing.
30. rmunn ◴[] No.43505058[source]
I also put spaces around em dashes. It looks wrong—subtly wrong—to me to have the words glued together around the dash. It looks right — completely right — to me to have the dash standing on its own, as if it was a word in its own right.
replies(4): >>43505363 #>>43505552 #>>43509146 #>>43513256 #
31. lashloch ◴[] No.43505363{3}[source]
Funny—I'm the exact opposite. The extra spaces distract my eyes. To each their own! :)
replies(3): >>43505414 #>>43505425 #>>43509350 #
32. rmunn ◴[] No.43505414{4}[source]
To each their own: fully agreed, even though our tastes differ. I will mention one advantage of the spaces-around-dashes method: word wrap with default settings will break on the spaces around the dashes so that the entire word one, dash, word two combo doesn't end up pulled onto the next line as a whole unit. Whereas the advantage of the no-spaces method that you prefer is that word wrap will pull the entire word one, dash, word two combo onto the next line as a whole unit.

Why yes, I did list the opposite behavior as an advantage of each. Because that, too, is up to individual preference. :-)
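As a rough illustration of the two behaviors, here is a sketch using Python's `textwrap` module. This is only one simple wrapping algorithm, not what any browser or e-reader does, and the sample sentences and width are arbitrary choices of mine.

```python
import textwrap

EM_DASH = "\u2014"

# Same sentence shape, once with tight dashes and once with spaced dashes.
tight = f"It looks wrong{EM_DASH}subtly wrong{EM_DASH}to me"
spaced = f"It looks right {EM_DASH} completely right {EM_DASH} to me"

# textwrap only breaks at whitespace (and ASCII hyphens), so a tight em dash
# keeps "word, dash, word" together as one unbreakable unit.
tight_lines = textwrap.wrap(tight, width=16)

# With spaces, the dash is its own "word" and can end or start a line.
spaced_lines = textwrap.wrap(spaced, width=16)

for line in tight_lines + ["--"] + spaced_lines:
    print(line)
```

With these inputs the tight version never leaves a dash at the edge of a line, while the spaced version does, which is the trade-off described above.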

replies(1): >>43506839 #
33. rmunn ◴[] No.43505425{4}[source]
P.S. I also prefer smileys with noses, :-), as opposed to the noseless smileys, :), that most people these days seem to prefer. :-)
34. rahimnathwani ◴[] No.43505543{3}[source]
Google Docs also does these replacements.
35. tines ◴[] No.43505552{3}[source]
The reason not to do this is observable in your post on my phone. The spaces cause the word wrapping algorithm to leave a dangling dash at the end of the line which looks ugly. Omitting spaces prevents the word break.
replies(6): >>43505675 #>>43505687 #>>43505892 #>>43505903 #>>43508537 #>>43509463 #
36. JadeNB ◴[] No.43505664[source]
> Spacing around an em dash varies. Most newspapers insert a space before and after the dash, and many popular magazines do the same, but most books and journals omit spacing, closing whatever comes before and after the em dash right up next to it.

It's funny that they omit to mention the possibility of setting it off with a thin space ' ' or hair space ' ' (those are the thin-space and hair-space Unicode characters, though they show up full width for me), which I thought was preferred typographic practice.

(On Googling, maybe the reason that they don't mention it is that I was imagining it; I can't find any evidence for my belief.)

replies(1): >>43506827 #
37. ◴[] No.43505675{4}[source]
38. hansvm ◴[] No.43505687{4}[source]
Funny, I'd rather have the break at the start or end of the emdash-implied break than just before or after it, not having to mentally handle some single dangling word divorced from its compatriots.
39. JadeNB ◴[] No.43505711{3}[source]
> I was under the impression that you do "-" for hyphen, "--" for En dash, and "---" for Em dash. IIRC, LaTeX (or maybe the editor, it has been some time) even helpfully changes that for you to the correct dash.

The conversion of '--' to an en dash and '---' to an em dash is done by the TeX compiler, and appears in the rendered file, but I think that most TeX editors don't change the TeX code itself. (This is distinct from XeTeX-based compilers, which can handle non-ASCII Unicode characters like the em dash '—' directly in the source.)

(I think that the article's point is that, in some fonts, -- (two hyphens) is literally the (approximate) size of an em dash, not that it is always understood as meaning an em dash. At least in my font, --- (three hyphens) is far too long to literally look like an em dash:

---

--

—

–

(in order, three hyphens, two hyphens, em dash, en dash).)
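To make the substitution concrete, here is a small Python sketch that mimics the `--` and `---` conventions on plain text. It is an imitation for illustration only, not how TeX implements its ligatures, and the function name `tex_style_dashes` is made up.

```python
EN_DASH = "\u2013"
EM_DASH = "\u2014"


def tex_style_dashes(text):
    """Replace '---' with an em dash and '--' with an en dash.

    The three-hyphen case must be handled first; otherwise '---'
    would become an en dash followed by a stray hyphen.
    """
    return text.replace("---", EM_DASH).replace("--", EN_DASH)


print(tex_style_dashes("pp. 10--20"))          # range with an en dash
print(tex_style_dashes("one idea---but no"))   # break with an em dash
print(tex_style_dashes("well-known"))          # single hyphen untouched
```

The order of the two replacements is the only subtle part; single hyphens pass through unchanged.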

40. rmunn ◴[] No.43505892{4}[source]
I mentioned that as an advantage in one of my other comments. An advantage both ways, because it depends on preference. I have the same preference as hansvm: I would rather see the dangling dash at the end of the line, so I prefer putting spaces around the dashes. Having the entire word-dash-word structure move to the next line feels ugly to me. As with most things, de gustibus non est disputandum. (And also, quidquid Latine dictum sit altum videtur).
replies(1): >>43507004 #
41. da_chicken ◴[] No.43505903{4}[source]
Ironically, on my phone the only line that ends with an em dash has no spaces in it.

If you want to not have a line break, you shouldn't rely on arbitrary behavior. You should use non-breaking characters like non-breaking spaces and word joiners.
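A minimal Python sketch of that idea, using U+00A0 (no-break space) and U+2060 (word joiner). The helper names are hypothetical, and whether a given layout engine honors these characters is an assumption you would need to test on your target.

```python
NBSP = "\u00A0"         # NO-BREAK SPACE
WORD_JOINER = "\u2060"  # WORD JOINER
EM_DASH = "\u2014"


def glue_unspaced_dash(text):
    """Wrap each em dash in word joiners so 'foo<dash>bar' offers no break at the dash."""
    return text.replace(EM_DASH, WORD_JOINER + EM_DASH + WORD_JOINER)


def glue_spaced_dash(text):
    """Replace the space before a spaced em dash with a no-break space.

    The dash then stays attached to the preceding word: it may end a
    line, but it should never start one.
    """
    return text.replace(" " + EM_DASH + " ", NBSP + EM_DASH + " ")


print(repr(glue_unspaced_dash("foo\u2014bar")))
print(repr(glue_spaced_dash("foo \u2014 bar")))
```

Printing with `repr` shows the inserted characters as escapes, since both are invisible or look like an ordinary space on screen.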

42. snozolli ◴[] No.43506008[source]
> people don't normally put spaces around em dash

For what it's worth, I was in the last class in my high school to learn typing on IBM Selectric typewriters. We were taught to type two spaces, two hyphens, then two spaces. Incidentally, we were taught two spaces after periods and colons. To this day, I find it hard to read text that doesn't have proper spacing after periods. (HTML and WYSIWYG word processors handle formatting, but e.g. fixed-font text editors don't)

replies(2): >>43508518 #>>43508646 #
43. opello ◴[] No.43506827{3}[source]
> those are the thin-space and hair-space Unicode characters, though they show up full width for me

Interestingly, at least in my browser and grabbing the direct link to the comment with curl, show the bytes as 0x20 for both. Perhaps the comment submission handler, or even the browser, collated your more specific U+2009 (thin) and U+200A (hair) spaces into the regular U+0020 space?
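For reference, this is roughly how one could inspect which space characters survive in a piece of text, as a Python sketch. The function name `describe_spaces` and the sample string are my own; the UTF-8 byte sequences shown in the comments are what unmodified thin and hair spaces would look like.

```python
def describe_spaces(text):
    """List (code point, UTF-8 bytes as hex) for every whitespace character."""
    return [
        (f"U+{ord(c):04X}", c.encode("utf-8").hex())
        for c in text
        if c.isspace()
    ]


# A thin space (U+2009), a hair space (U+200A), and a regular space.
sample = "a\u2009b\u200Ac d"
for code, utf8_hex in describe_spaces(sample):
    print(code, utf8_hex)
# An intact thin space encodes as e2 80 89 and a hair space as e2 80 8a;
# a plain 20 in their place means something normalized them on the way.
```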

replies(1): >>43507608 #
44. lxgr ◴[] No.43506839{5}[source]
That depends on the layout engine, I believe. Just tried it in Firefox (on macOS; not sure if it uses Core Text or something custom there), and it does sometimes break around the em dash in "foo—bar" style, not just "foo – bar" style.

I've definitely noticed the behavior you describe on some layout engines, too, and it's another reason why I personally prefer "foo – bar" style.

45. chipotle_coyote ◴[] No.43507004{5}[source]
It's the dangling dash at the beginning of the line that gets me. I see a lot of word break algorithms, including the one WebKit (and I suspect Blink) uses, which are happy to break "foo—bar" on either side of the em dash.
46. asveikau ◴[] No.43507593{4}[source]
I'm still doing it when I am typing at a physical keyboard. Hard habit to break. I learned it so long ago too.

You can tell when I've edited something on both a phone and a physical keyboard, based on the inconsistent use of spaces.

replies(1): >>43507659 #
47. JadeNB ◴[] No.43507608{4}[source]
> Interestingly, at least in my browser and grabbing the direct link to the comment with curl, show the bytes as 0x20 for both. Perhaps the comment submission handler, or even the browser, collated your more specific U+2009 (thin) and U+200A (hair) spaces into the regular U+0020 space?

Probably! I think HN strips out emoji; maybe it just takes the safest approach and strips out all non-white-listed Unicode.

48. rahimnathwani ◴[] No.43507659{5}[source]

  Hard habit to break. I learned it so long ago too.
Haha I learned to type organically, and it was only in my mid-40s that I retrained myself to type the correct way. It took something like 40 hours of practice on keybr.com before I could get close enough to my regular typing speed, such that I could switch over to the 'correct' method without it impacting my work.

Retraining myself to stop doing double-spaces took maybe a week.

replies(1): >>43508675 #
49. mmooss ◴[] No.43508474[source]
What is a "standard dash character"? There is no such thing in English; only hyphen, EN dash, EM dash (and some odds and ends).
50. dragonwriter ◴[] No.43508518{3}[source]
It's funny that people think that conventions for typewritten text built around the limitations of typewriters define what is “proper” in environments where typewriters and their limitations are not involved.
replies(2): >>43508695 #>>43510989 #
51. mmooss ◴[] No.43508537{4}[source]
> The reason not to do this is observable in your post on my phone. The spaces cause the word wrapping algorithm to leave a dangling dash at the end of the line which looks ugly. Omitting spaces prevents the word break.

That's an interesting practicality but I don't think it's the cause of the rule: The rule probably long predates automated line breaking. Also, I think automatic line breaking will break compound words at the hyphen; it doesn't require spaces (which is also obvious from a software development point of view: the logic is relatively simple either way):

  Lorem ipsum dolor sit amet, consectetur adipiscing double-
  decker lorem ipsum dolor sit amet, consectetur ...
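Python's `textwrap` module happens to behave this way by default and makes a handy sketch of the point; it is just one wrapper, and the width of 58 is chosen by me so that the break lands on the compound word.

```python
import textwrap

text = (
    "Lorem ipsum dolor sit amet, consectetur adipiscing "
    "double-decker lorem ipsum dolor sit amet"
)

# Default behavior: a hyphen inside a compound word is a legal break point.
broken = textwrap.wrap(text, width=58)

# With break_on_hyphens=False the compound word moves to the next line whole.
unbroken = textwrap.wrap(text, width=58, break_on_hyphens=False)

print("\n".join(broken))
print()
print("\n".join(unbroken))
```

The `break_on_hyphens` flag shows that either choice is a one-line difference in the wrapping logic.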
52. kevin_thibedeau ◴[] No.43508646{3}[source]
I was taught that and abandoned it as a pointless anachronism. How often are you reading long form text in a monospace font?
replies(1): >>43510993 #
53. kevin_thibedeau ◴[] No.43508675{6}[source]
Most word processors can be configured to flag double spaces. That gives feedback to break the habit.
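Outside a word processor, a similar check is easy to approximate. Here is a Python sketch with a regular expression; the pattern and function names are assumptions of mine, not any particular tool's rule.

```python
import re

# Two or more spaces immediately after sentence-ending punctuation or a colon.
DOUBLE_SPACE = re.compile(r"(?<=[.!?:]) {2,}")


def find_double_spaces(text):
    """Return the index where each run of extra spaces begins."""
    return [m.start() for m in DOUBLE_SPACE.finditer(text)]


def collapse_double_spaces(text):
    """Replace each flagged run with a single space."""
    return DOUBLE_SPACE.sub(" ", text)


sample = "One.  Two. Three:  four"
print(find_double_spaces(sample))
print(collapse_double_spaces(sample))
```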
54. ovalanche ◴[] No.43508695{4}[source]
Yes, this always grinds my gears too. There is already a slightly larger space after periods in contemporary typefaces.

The old typewriter typefaces were monospaced, i.e. every character was the same width, but this is no longer the case. Virtually all typefaces today are proportionally spaced, not monospaced. So it’s redundant to leave extra room after periods.

55. mmooss ◴[] No.43508941{3}[source]
In fairness, especially in the Anglo-Saxon dominated world post-WWII, English was under no threat to be swamped by German or French words.
56. laptopdev ◴[] No.43509146{3}[source]
Grammar nasi but isn't it "It looks right — completely right, to me — to have the dash standing on its own"...
57. mmooss ◴[] No.43509350{4}[source]
It's not your own. You write mostly for others to read.
58. lxgr ◴[] No.43509463{4}[source]
Preventing the word break doesn't seem very desirable, especially if it causes a large gap.
59. snozolli ◴[] No.43510989{4}[source]
What does this have to do with what I wrote? I said nothing of the sort. In fact, I explicitly pointed out that HTML and WYSIWYG word processors address it automatically.
replies(1): >>43653703 #
60. snozolli ◴[] No.43510993{4}[source]
Often enough, thanks.
61. hilbert42 ◴[] No.43513256{3}[source]
I've wondered about this for similar reasons. I usually omit the spaces but as I said in an earlier post I'll sometimes include them when I think the typography calls for it or when I want to add extra emphasis.

I've come to the conclusion it boils down to which style manual one follows. I've taken a careful look at numbers of high-end books which no doubt have been carefully typeset and I've found EM dashes with and without spaces.

It seems there is no definitive rule but I might be wrong.

62. ovalanche ◴[] No.43653703{5}[source]
That’s fair!

My comment applies to a few pedants I know personally, who stubbornly double space after periods when typing in regular situations.