
On Building Git for Lawyers

(jordanbryan.substack.com)
162 points | jpbryan
sbpayne No.42137577
It's curious to me how many people think "just convert it to a different filetype" will solve the problem.

Do you think there are other professions/industries that would benefit from this?

noirscape No.42138000
People mostly think that because the MS Office document formats are things no programmer wants to touch if they can help it, and when you're parsing them into something else you can usually drop all the weirdness. Nobody loses sleep because the script reading an xlsx file as input doesn't properly parse the Excel graph that someone else put into it.

They're all incredibly capable formats (from a user perspective, anyway), with the caveat that they're utter hell to work with from a programming perspective. It's easier to just toss the file into a black-box parser/serializer and hope that all the text you need comes out properly on the other side.
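
To make the "drop all the weirdness" approach concrete, here is a minimal sketch in Python using only the standard library. It assumes the main document part lives at word/document.xml, which is the usual location but not guaranteed, and it ignores tables, footnotes, headers, tracked changes and everything else. The names docx_plain_text and make_sample_docx are hypothetical, and the sample archive is a stripped-down stand-in rather than a file Word would open.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_plain_text(data: bytes) -> str:
    """Return paragraph text from .docx bytes, discarding all formatting.

    Assumption: the main part is stored at word/document.xml.
    """
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))

    paragraphs = []
    for p in root.iter(f"{{{W_NS}}}p"):
        # Concatenate every text run in the paragraph; bold, fonts,
        # styles and the like are simply never looked at.
        text = "".join(t.text or "" for t in p.iter(f"{{{W_NS}}}t"))
        paragraphs.append(text)
    return "\n".join(paragraphs)


def make_sample_docx() -> bytes:
    """Build a tiny, hypothetical docx-like archive for demonstration only."""
    body = (
        f'<w:document xmlns:w="{W_NS}"><w:body>'
        '<w:p><w:r><w:rPr><w:b/></w:rPr><w:t>Clause 1.</w:t></w:r>'
        '<w:r><w:t xml:space="preserve"> Payment is due.</w:t></w:r></w:p>'
        '<w:p><w:r><w:t>Clause 2.</w:t></w:r></w:p>'
        '</w:body></w:document>'
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("word/document.xml", body)
    return buf.getvalue()


if __name__ == "__main__":
    print(docx_plain_text(make_sample_docx()))
```

Getting the text out this way takes a handful of lines precisely because the bold marker on "Clause 1." is thrown away. Writing a file back that preserves it, along with everything else a real document carries, is where the work is.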

Actually generating non-trivial docx or xlsx files that look exactly like another input file (so you have to account for every notable difference in formatting) is a ton of work. Most people who have touched webdev have probably had to format their emails for Outlook's half-assed, ancient HTML parser at some point, and even there you at least control what it's going to look like.