Parser Combinators Beat Regexes

(entropicthoughts.com)

122 points mooreds | 1 comment | 09 Apr 25 21:53 UTC

arn3n ◴[10 Apr 25 00:34 UTC] No.43639465[source]▶

“In other languages, it would be considered overkill to write a full parser when a simple regex can do the same thing. In Haskell, writing a parser is no big deal. We just do it and move on with our lives.”

I see a long code file filled with comments and long paragraph-level explanations. I think I’d rather just learn and use regex.

replies(5): >>43639538 #>>43639912 #>>43639965 #>>43641791 #>>43644069 #

layer8 ◴[10 Apr 25 02:07 UTC] No.43639965[source]▶

>>43639465 #

Whenever I write a regex, I end up with comments roughly ten times longer than the regex. That being said, regular expressions are often the right tool for the job (i.e. parsing a regular language, as opposed to a context-free language or whatever); it’s just that the syntax becomes unreadable rather quickly. I’m sure you could build a nicer regular-expression syntax in Haskell.

replies(3): >>43640776 #>>43640864 #>>43642797 #

1. f1shy ◴[10 Apr 25 05:16 UTC] No.43640864[source]▶

>>43639965 #

Yes. Regexes tend to become write-only rather fast. One solution is commenting, but that is still complex. What I like to do now (in C) is define the parts separately. Just a crude example to get the idea:

   #include <regex.h>

   // STRing: matches anything inside quotes (single or double)
   #define STR "[\"'](.*)[\"']"
   // NUMber: matches decimal or hexadecimal numbers
   #define NUM "([[:digit:]]x?[[:xdigit:]]*)"

   // Adjacent string literals are concatenated, so STR NUM is one pattern
   regex_t reg_exp;
   regcomp(&reg_exp, STR NUM, REG_EXTENDED | REG_ICASE);

So at the end I compose the RE with the various parts, which are documented separately.
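
To make the idea concrete, here is a slightly fuller sketch of the same approach using POSIX regcomp/regexec. It is only an illustration under a few assumptions: the SEP part, the tightened STR and NUM patterns, and the helper names copy_group and match_str_num are made up for this example and are not from the fragment above.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* STRing: anything inside single or double quotes (capture group 1).
   Still crude: it does not check that the two quotes are the same kind. */
#define STR "[\"']([^\"']*)[\"']"
/* SEParator: optional whitespace between the parts (assumed for this sketch) */
#define SEP "[[:space:]]*"
/* NUMber: hexadecimal with 0x prefix, or decimal (capture group 2) */
#define NUM "(0x[[:xdigit:]]+|[[:digit:]]+)"

/* Copy one capture group into dst, truncating to fit and NUL-terminating. */
static void copy_group(const char *src, const regmatch_t *m,
                       char *dst, size_t cap)
{
    size_t len = (size_t)(m->rm_eo - m->rm_so);
    if (cap == 0)
        return;
    if (len >= cap)
        len = cap - 1;
    memcpy(dst, src + m->rm_so, len);
    dst[len] = '\0';
}

/* Match a quoted string followed by a number.
   Returns 1 on match (outputs filled), 0 on no match, -1 if the
   pattern fails to compile. */
int match_str_num(const char *input,
                  char *str_out, size_t str_cap,
                  char *num_out, size_t num_cap)
{
    regex_t re;
    regmatch_t m[3]; /* m[0] = whole match, m[1] = STR, m[2] = NUM */
    int rc;

    assert(input != NULL && str_out != NULL && num_out != NULL);

    /* The parts are plain string literals, so the compiler joins them. */
    if (regcomp(&re, "^" STR SEP NUM "$", REG_EXTENDED | REG_ICASE) != 0)
        return -1;

    rc = regexec(&re, input, 3, m, 0);
    regfree(&re);
    if (rc != 0)
        return 0;

    copy_group(input, &m[1], str_out, str_cap);
    copy_group(input, &m[2], num_out, num_cap);
    return 1;
}
```

Calling match_str_num with the input "hello" 0x1F (quotes included) should fill the two buffers with hello and 0x1F. Each part keeps its own comment, and the full pattern only ever appears as the composition "^" STR SEP NUM "$".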