←back to thread

Parser Combinators Beat Regexes

(entropicthoughts.com)
120 points mooreds | 1 comments | | HN request time: 0.001s | source
Show context
arn3n ◴[] No.43639465[source]
“In other languages, it would be considered overkill to write a full parser when a simple regex can do the same thing. In Haskell, writing a parser is no big deal. We just do it and move on with our lives.”

I see a long code file filled with comments and long paragraph-level explanations. I think I’d rather just learn and use regex.

replies(5): >>43639538 #>>43639912 #>>43639965 #>>43641791 #>>43644069 #
layer8 ◴[] No.43639965[source]
Whenever I write a regex, I end up with a comments roughly ten times longer than the regex. That being said, regular expressions are often the right tool for the job (i.e. parsing a regular language, as opposed to a context-free language or whatever), just the syntax becomes unreadable rather quickly. I’m sure you could build a nicer regular-expression syntax in Haskell.
replies(3): >>43640776 #>>43640864 #>>43642797 #
1. f1shy ◴[] No.43640864[source]
Yes. Regex tend to become rather fast write only. One solution is commenting, but is still complex. What I like to do now (in C) is define parts of it. Just a crude example to get the idea:

   // STRing: matches anything inside quotes (single or double)
   #define STR "[\"'](.*)[\"']"
   // NUMber: matches decimal or hexadecimal numbers
   #define NUM "([[:digit:]]x?[[:xdigit:]]*)"
   
   regcomp(&reg_exp, STR NUM , REG_EXTENDED | REG_ICASE);
So at the end I compose the RE with the various parts, which are documented separately.