
Parser Combinators Beat Regexes

(entropicthoughts.com)
120 points mooreds | 1 comments
DadBase ◴[] No.43639894[source]
Parser combinators are great until you need to parse something real, like CSV with embedded newlines and Excel quotes. That’s when you reach for the reliable trio: awk, duct tape, and prayer.
replies(2): >>43640949 #>>43640977 #
iamevn ◴[] No.43640977[source]
I don't follow why parser combinators would be a bad tool for CSV. It seems like one would specify a CSV parser as (pardon the pseudocode):

  separator = ','
  quote = '"'
  quoted_quote = '""'
  newline = '\n'
  plain_field = sequence(char_except(either(separator, quote, newline)))
  quoted_field = quote + sequence(either(char_except(quote), quoted_quote)) + quote 
  field = either(quoted_field, plain_field)
  row = sequence_with_separator(field, separator)
  csv = sequence_with_separator(row, newline)
Seems fairly natural to me, although I'll readily admit I haven't had to write a CSV parser before so I'm surely glossing over some detail.
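
For what it's worth, the pseudocode above translates almost line for line into something runnable. This is a minimal Python sketch, not a production CSV parser: the combinator names (`literal`, `char_except`, `either`, `many`, `seq`, `sep_by`, `fmap`) and the `parse_csv` wrapper are hypothetical helpers written for this example and do not come from any library. It assumes a parser is a function from `(text, pos)` to `(value, new_pos)` or `None`, and it additionally accepts `\r\n` as a newline, which the pseudocode does not mention.

```python
from typing import Any, Callable, List, Optional, Tuple

# Assumption: a parser takes (text, pos) and returns (value, new_pos) or None.
Parser = Callable[[str, int], Optional[Tuple[Any, int]]]

def literal(s: str) -> Parser:
    """Match the exact string s."""
    def parse(text, pos):
        return (s, pos + len(s)) if text.startswith(s, pos) else None
    return parse

def char_except(chars: str) -> Parser:
    """Match any single character not in chars."""
    def parse(text, pos):
        if pos < len(text) and text[pos] not in chars:
            return text[pos], pos + 1
        return None
    return parse

def either(*parsers: Parser) -> Parser:
    """Return the result of the first parser that succeeds."""
    def parse(text, pos):
        for p in parsers:
            result = p(text, pos)
            if result is not None:
                return result
        return None
    return parse

def many(p: Parser) -> Parser:
    """Apply p zero or more times; p must consume input when it succeeds."""
    def parse(text, pos):
        values = []
        while (result := p(text, pos)) is not None:
            value, pos = result
            values.append(value)
        return values, pos
    return parse

def seq(*parsers: Parser) -> Parser:
    """Apply parsers one after another; fail if any of them fails."""
    def parse(text, pos):
        values = []
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parse

def sep_by(p: Parser, sep: Parser) -> Parser:
    """Match p (sep p)*, returning only the values produced by p."""
    def parse(text, pos):
        result = p(text, pos)
        if result is None:
            return [], pos
        value, pos = result
        values = [value]
        while (s := sep(text, pos)) is not None:
            nxt = p(text, s[1])
            if nxt is None:
                break
            value, pos = nxt
            values.append(value)
        return values, pos
    return parse

def fmap(f: Callable[[Any], Any], p: Parser) -> Parser:
    """Transform the value produced by p with f."""
    def parse(text, pos):
        result = p(text, pos)
        return None if result is None else (f(result[0]), result[1])
    return parse

# The grammar, mirroring the pseudocode.
separator = literal(",")
quote = literal('"')
quoted_quote = fmap(lambda _: '"', literal('""'))   # "" inside quotes means one "
newline = either(literal("\r\n"), literal("\n"))
plain_field = fmap("".join, many(char_except(',"\r\n')))
quoted_field = fmap(
    lambda parts: "".join(parts[1]),                 # drop the surrounding quotes
    seq(quote, many(either(char_except('"'), quoted_quote)), quote),
)
field = either(quoted_field, plain_field)
row = sep_by(field, separator)
csv = sep_by(row, newline)

def parse_csv(text: str) -> List[List[str]]:
    """Run the csv parser and require that it consumes the whole input."""
    rows, pos = csv(text, 0)
    if pos != len(text):
        raise ValueError(f"unexpected input at offset {pos}")
    return rows
```

With these definitions, `parse_csv('a,"b\nc",d\n"say ""hi""",e')` keeps the embedded newline and the doubled quotes inside their fields. One detail the sketch does gloss over: because an empty plain field is valid, a trailing newline produces a final row containing a single empty field.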
replies(2): >>43641113 #>>43643933 #
kqr ◴[] No.43641113[source]
I think GP was sarcastic. We have these great technologies available but people end up using duct tape and hope anyway.
replies(1): >>43643935 #
1. DadBase ◴[] No.43643935[source]
Exactly. At some point every parser combinator turns into a three-line awk script that runs perfectly as long as the moon is waning and the file isn’t saved from Excel for Mac.