Parser Combinators Beat Regexes

(entropicthoughts.com)
120 points mooreds | 1 comments
austin-cheney ◴[] No.43639341[source]
There are numerous posts, comments, articles and so forth about when to use regex versus using a parser. The general rule is this:

If you need to search within a string, such as finding needle(s) in a haystack, a regex is probably a better fit than a parser. If you need anything more intelligent than a list of search results, you probably want a full formal parser.

Most languages allow a form of nested regex that allows for increased search precision. This occurs when a method that uses a regex hands each matching string to a function as its argument, which is why regex is probably enough when the task is simple. A full parser costs tremendously more, considering the lexer/tokenizer plus rules, but it's so much more intelligent that the two aren't even comparable.
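
One way to read the nested regex idea, as a minimal sketch in Python: `re.sub` accepts a function as the replacement, and that function receives each match, so it can run a second regex on just the matched text. The `{{...}}` delimiters, the `key=value` format, and the names `OUTER`, `INNER`, `expand` and `render` are made up for illustration; nothing here comes from the article.

```python
import re

# Outer pattern: find candidate spans, here anything inside {{ ... }}.
OUTER = re.compile(r"\{\{(.*?)\}\}")
# Inner pattern: refine each span by pulling out key=value pairs.
INNER = re.compile(r"(\w+)=(\w+)")

def expand(match: re.Match) -> str:
    # Called once per outer match; the second regex only sees the captured text.
    pairs = INNER.findall(match.group(1))
    return ", ".join(f"{key}: {value}" for key, value in pairs)

def render(text: str) -> str:
    # re.sub passes every outer match to expand() and splices in the result.
    return OUTER.sub(expand, text)

print(render("before {{x=1 y=2}} after"))
```

Text outside the braces is never seen by the inner pattern, which is where the extra precision comes from.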

replies(2): >>43639533 #>>43641180 #
1. kleiba ◴[] No.43641180[source]
Of course you could also pull out the good old Chomsky hierarchy and make an argument against regexes based on whatever the nature of your data is.
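
To make the Chomsky hierarchy point concrete: arbitrarily nested parentheses form a context-free language, not a regular one, so a classic regular expression cannot check them (some engines add recursion extensions, but Python's `re` does not). A rough sketch of a tiny hand-written recursive parser for that case follows; `parse` and `parse_group` are hypothetical names, and this is not a parser combinator library, just the plainest recursive version of the idea.

```python
def parse_group(s: str, i: int = 0) -> tuple:
    """Parse until an unmatched ')' or end of input; return (tree, next index)."""
    items = []
    while i < len(s) and s[i] != ")":
        if s[i] == "(":
            # Recurse for the nested group, then require its closing ')'.
            child, i = parse_group(s, i + 1)
            if i >= len(s) or s[i] != ")":
                raise ValueError("unclosed '('")
            items.append(child)
            i += 1  # skip the ')'
        else:
            items.append(s[i])
            i += 1
    return items, i

def parse(s: str) -> list:
    tree, i = parse_group(s)
    if i != len(s):
        # The top level stopped early, so it hit a ')' with no matching '('.
        raise ValueError(f"unexpected ')' at index {i}")
    return tree

print(parse("a(b(c))d"))
```

The result is a nested list mirroring the nesting of the input, which is the kind of structured output a flat list of regex matches does not give you.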

But the thing is: the beauty of regexes lies in their compactness (which, in turn, can make them quite difficult to debug). So, of course, if you want to optimize for some other measure, you'd use an alternative approach (e.g. a parser). And there are a number of worthwhile measures, such as the already mentioned debuggability, appropriateness for the problem at hand in terms of complexity, processing speed, ease of using the match result, the ability to get multiple alternative matches, support for regexes in your language of choice, etc.

But simply stating "X beats regexes" without saying in what respect leaves something to be desired.