←back to thread

Parser Combinators Beat Regexes

(entropicthoughts.com)
120 points mooreds | 1 comments | | HN request time: 0.001s | source
Show context
o11c ◴[] No.43639374[source]
Note that most implementations of both parser combinators and regexes can fail very badly (exponential time). Never use either on untrusted input, unless you can prove your implementation lets you stay linear.
replies(4): >>43639552 #>>43639845 #>>43640240 #>>43641298 #
giraffe_lady ◴[] No.43639552[source]
PEGs don't have this problem right? Linear time over length of input? Though I guess most peg libraries probably aren't "pure" pegs to get around its limitations and may reintroduce this issue?
replies(3): >>43639629 #>>43639698 #>>43650782 #
1. o11c ◴[] No.43639698[source]
No, PEG is prone to exactly this in its default form. Parser combinator usually means something PEG-ish anyway.

If you memoize (packrat), it improves to polynomial time (not sure what, but it's not linear; that's false advertising and a fraudulent claim), but you're still stuck with the vulnerability to bugs in the grammar.

A better idea is to create a parser-combinator-style API on top of an LL or LR backend, for guaranteed linear time and only enough stack space for the deepest part of the AST.