
262 points el3ctron | 1 comments
billconan ◴[] No.46174647[source]
I don't think HTML is the right approach. HTML is better than PDF, but it is still a format for displaying/rendering.

The actual paper content format should be separated from its rendering.

i.e. it should contain abstract, sections, equations, figures, citations etc. but it shouldn't have font sizes, layout etc.

The viewer platforms should then be able to style the content differently.

replies(5): >>46174655 #>>46174732 #>>46174842 #>>46175075 #>>46175479 #
cluckindan ◴[] No.46175075[source]
HTML alone is in fact not a format for displaying/rendering. Done properly, it is a structural representation of the content. (This is often called "semantic HTML".)

They are converting to HTML to make the content more accessible. Accessibility in this context means a11y; in effect, "more accessible" equates to "more compatible with screen readers".

While PDF documents can be made accessible, it is way easier to do it in HTML, where browsers build an actual accessibility tree from the DOM and expose it to screen readers.

>it should contain abstract, sections, equations, figures, citations etc.

So <article>, <section>, <math>, <figure>, <cite>, etc.
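
To illustrate, here is a minimal Python sketch (stdlib `html.parser` only) that recovers a structural outline from a document marked up with those elements, without looking at any styling. The sample markup, the `STRUCTURAL` set and the `OutlineParser` class are all made up for this example; none of it is taken from arXiv's actual HTML output.

```python
from html.parser import HTMLParser

# Hypothetical sample paper, invented for illustration only.
SAMPLE = """
<article>
  <h1>A Sample Paper</h1>
  <section><h2>Abstract</h2><p>We study things.</p></section>
  <section><h2>Method</h2>
    <math><mi>E</mi><mo>=</mo><mi>m</mi><msup><mi>c</mi><mn>2</mn></msup></math>
    <figure><img src="plot.png" alt="Plot"><figcaption>Figure 1</figcaption></figure>
    <p>See <cite>Some Earlier Work</cite>.</p>
  </section>
</article>
"""

# The elements named in the comment above.
STRUCTURAL = {"article", "section", "math", "figure", "cite"}


class OutlineParser(HTMLParser):
    """Records structural elements and their nesting depth; ignores everything else."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.outline = []  # list of (depth, tag) in document order

    def handle_starttag(self, tag, attrs):
        if tag in STRUCTURAL:
            self.outline.append((self.depth, tag))
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in STRUCTURAL:
            self.depth -= 1


parser = OutlineParser()
parser.feed(SAMPLE)

# Print an indented outline of the structural elements.
for depth, tag in parser.outline:
    print("  " * depth + tag)
```

A viewer could apply whatever fonts and layout it likes on top of an outline like this, since none of that information is in the markup itself.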

replies(3): >>46175106 #>>46176345 #>>46179096 #
o11c ◴[] No.46179096[source]
The hope for semantic HTML died the day they said "stop using <i>, use <em>", regardless of what the actual purpose of the italics was (it's usually not emphasis).
replies(1): >>46183844 #
1. cluckindan ◴[] No.46183844[source]
Who said that? The semantics are different.

The <i> HTML element represents a range of text that is set off from the normal text for some reason, such as idiomatic text, technical terms, taxonomical designations, among others. Historically, these have been presented using italicized type, which is the original source of the <i> naming of this element.

The <em> element is for words that have a stressed emphasis compared to surrounding text, which is often limited to a word or words of a sentence and affects the meaning of the sentence itself.

Typically this element is displayed in italic type. However, it should not be used to apply italic styling; use the CSS font-style property for that purpose. Use the <cite> element to mark the title of a work (book, play, song, etc.). Use the <i> element to mark text that is in an alternate tone or mood, which covers many common situations for italics such as scientific names or words in other languages.
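
To make the distinction concrete, here is a small Python sketch (again stdlib `html.parser` only) showing that a program can tell the three cases apart even though a browser would typically render all of them in italics. The sample sentence, the `MEANING` table and the `ItalicsParser` class are hypothetical, and the labels are loose paraphrases of the descriptions above.

```python
from html.parser import HTMLParser

# Hypothetical sentence with three different reasons for italic text.
SAMPLE = (
    '<p>You <em>must</em> cite <cite>On the Origin of Species</cite> '
    'when discussing <i lang="la">Homo sapiens</i>.</p>'
)

# Loose paraphrases of what each element signals.
MEANING = {
    "em": "stress emphasis",
    "cite": "title of a work",
    "i": "alternate voice (e.g. scientific name, foreign term)",
}


class ItalicsParser(HTMLParser):
    """Collects (tag, text) pairs for elements that are conventionally italicized."""

    def __init__(self):
        super().__init__()
        self.current = None  # tag we are currently inside, if any
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag in MEANING:
            self.current = tag

    def handle_data(self, data):
        if self.current:
            self.found.append((self.current, data))

    def handle_endtag(self, tag):
        if tag == self.current:
            self.current = None


parser = ItalicsParser()
parser.feed(SAMPLE)

for tag, text in parser.found:
    print(f"<{tag}> {text!r}: {MEANING[tag]}")
```

If all three spans were marked up with a single element, a screen reader or indexer would have no way to recover this distinction.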