Parser combinators are great until you need to parse something real, like CSV with embedded newlines and Excel quotes. That’s when you reach for the reliable trio: awk, duct tape, and prayer.
I don't follow why parser combinators would be a bad tool for CSV. It seems like one would specify a CSV parser as (pardon the pseudocode):
separator = ','
quote = '"'
quoted_quote = '""'
newline = '\n'
plain_field = sequence(char_except(either(separator, quote, newline)))
quoted_field = quote + sequence(either(char_except(quote), quoted_quote)) + quote
field = either(quoted_field, plain_field)
row = sequence_with_separator(field, separator)
csv = sequence_with_separator(row, newline)
Seems fairly natural to me, although I'll readily admit I haven't had to write a CSV parser before so I'm surely glossing over some detail.
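For what it's worth, here is a rough runnable version of that pseudocode in Python, to check that the grammar actually handles quoted commas, doubled quotes, and embedded newlines. It is only a sketch: the helper names (literal, char_except, either, many, seq, mapped, sep_by, parse_csv) are made up for illustration and not taken from any library, and a parser here is assumed to be a plain function from (text, position) to either (value, new position) or None.

```python
from typing import Any, Callable, Optional, Tuple

Result = Optional[Tuple[Any, int]]
Parser = Callable[[str, int], Result]

# Hypothetical minimal combinators, not from any real library.
def literal(s: str) -> Parser:
    def parse(text: str, pos: int) -> Result:
        return (s, pos + len(s)) if text.startswith(s, pos) else None
    return parse

def char_except(*forbidden: str) -> Parser:
    def parse(text: str, pos: int) -> Result:
        if pos < len(text) and text[pos] not in forbidden:
            return text[pos], pos + 1
        return None
    return parse

def either(*parsers: Parser) -> Parser:
    def parse(text: str, pos: int) -> Result:
        for p in parsers:  # first alternative that matches wins
            result = p(text, pos)
            if result is not None:
                return result
        return None
    return parse

def many(p: Parser) -> Parser:
    # Zero or more repetitions; p must consume input when it succeeds.
    def parse(text: str, pos: int) -> Result:
        values = []
        while (result := p(text, pos)) is not None:
            value, pos = result
            values.append(value)
        return values, pos
    return parse

def seq(*parsers: Parser) -> Parser:
    def parse(text: str, pos: int) -> Result:
        values = []
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parse

def mapped(p: Parser, f: Callable[[Any], Any]) -> Parser:
    def parse(text: str, pos: int) -> Result:
        result = p(text, pos)
        return None if result is None else (f(result[0]), result[1])
    return parse

def sep_by(item: Parser, sep: Parser) -> Parser:
    # One or more items with sep between them.
    def parse(text: str, pos: int) -> Result:
        result = item(text, pos)
        if result is None:
            return None
        value, pos = result
        values = [value]
        while (s := sep(text, pos)) is not None:
            result = item(text, s[1])
            if result is None:
                break
            value, pos = result
            values.append(value)
        return values, pos
    return parse

# The grammar, mirroring the pseudocode line by line.
separator = literal(',')
quote = literal('"')
quoted_quote = mapped(literal('""'), lambda _: '"')
newline = literal('\n')
plain_field = mapped(many(char_except(',', '"', '\n')), ''.join)
quoted_field = mapped(
    seq(quote, many(either(char_except('"'), quoted_quote)), quote),
    lambda parts: ''.join(parts[1]),  # drop the surrounding quotes
)
field = either(quoted_field, plain_field)
row = sep_by(field, separator)
csv = sep_by(row, newline)

def parse_csv(text: str) -> list:
    rows, pos = csv(text, 0)
    if pos != len(text):
        raise ValueError(f"unexpected input at offset {pos}")
    return rows
```

Calling parse_csv('a,"b,c"\n"d ""e""\nf",g') gives two rows, with the comma, the doubled quotes, and the newline kept inside their fields. Two details the sketch does gloss over: it only recognizes '\n' as a row terminator (not '\r\n'), and a trailing newline at the end of the input produces one extra row containing a single empty field.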