←back to thread

Parser Combinators Beat Regexes

(entropicthoughts.com)
120 points mooreds | 2 comments | | HN request time: 0s | source
Show context
yen223 ◴[] No.43639551[source]
Parser combinators are one of those ideas that is really powerful in practice, but will never be mainstream because it had the bad fortune of coming from the functional programming world. It's a shame too, because parser combinators are what made parsers click for me.

Take the hard problem of parsing a language, break it down into the much simpler problem of reading tokens, then write simple "combinators" (and, or, one_or_many, etc etc) to combine these simple parsers into larger more complex parsers.

You could almost build an entire parser combinator library from scratch from memory, because none of the individual components are complicated.

replies(13): >>43639775 #>>43639805 #>>43639834 #>>43640597 #>>43641009 #>>43641205 #>>43641459 #>>43641675 #>>43642100 #>>43642148 #>>43643853 #>>43644151 #>>43650405 #
thaumasiotes ◴[] No.43639775[source]
> Take the hard problem of parsing a language, break it down into the much simpler problem of reading tokens, then write simple "combinators" (and, or, one_or_many, etc etc) to combine these simple parsers into larger more complex parsers.

That's a terrible example for "parser combinators beat regexes"; those are the three regular operations. (Except that you use zero_or_many in preference to one_or_many.)

replies(1): >>43641710 #
1. mrkeen ◴[] No.43641710[source]
Can I see the (implied) counter-example?

A regex which turns a language's source code into an AST?

replies(1): >>43642623 #
2. thaumasiotes ◴[] No.43642623[source]
What implied counterexample? Regular expressions define the simplest language type a CS education generally covers. You would expect anything to be more capable.

But you wouldn't prove it by demonstrating that you can recognize regular languages. That's the thing regular expressions are good at!