←back to thread

Parser Combinators Beat Regexes

(entropicthoughts.com)
120 points mooreds | 1 comments | | HN request time: 0s | source
Show context
o11c ◴[] No.43639374[source]
Note that most implementations of both parser combinators and regexes can fail very badly (exponential time). Never use either on untrusted input, unless you can prove your implementation lets you stay linear.
replies(4): >>43639552 #>>43639845 #>>43640240 #>>43641298 #
internet_points ◴[] No.43641298[source]
This is one of the reasons I've been afraid to use parser combinators too heavily. With regular (finite-state) languages I know their time usage, with parser combinators I have no clue, and there are so many different libraries and implementations and some are monadic and some are applicative and few mention worst-case. There are benchmarks https://gitlab.com/FinnBender/haskell-parsing-benchmarks#byt... but what kinds of "dangerous operations" do I have to watch out for when writing with parser combinators?

(I guess part of the problem is just that regular languages have been studied since Kleene had a full head of hair, while parser combinators were more or less unknown until the 80's.)

replies(2): >>43641727 #>>43643059 #
1. wyager ◴[] No.43643059[source]
The evaluation model of most of these parser combinator libraries is simple enough that you can just think directly about the call tree. It's difficult to get "surprised" like you can with PCREs. E.g. (x+x+)+y as a regex may blow up, but "many1 (many1 x >> many1 x) >> y" will just evaluate forwards a single time, greedily, and fail, at least with any haskell parser combinator library I can think of.