←back to thread

Parser Combinators Beat Regexes

(entropicthoughts.com)
120 points mooreds | 3 comments | | HN request time: 0.001s | source
Show context
o11c ◴[] No.43639374[source]
Note that most implementations of both parser combinators and regexes can fail very badly (exponential time). Never use either on untrusted input, unless you can prove your implementation lets you stay linear.
replies(4): >>43639552 #>>43639845 #>>43640240 #>>43641298 #
thaumasiotes ◴[] No.43639845[source]
Only PCREs are exponential time, in service of a feature you basically never need. Regexes are always linear time.

They can take exponential space, though, so I'm not sure why knowing you'll be able to process the data in linear time is supposed to keep you safe.

replies(4): >>43639936 #>>43639940 #>>43640180 #>>43640937 #
1. dullcrisp ◴[] No.43639936[source]
I don’t believe space > time is possible on a Turing machine?
replies(1): >>43639958 #
2. pjscott ◴[] No.43639958[source]
You’re right, of course, but there was a minor miscommunication: the exponential space is exponentially proportional to the size of the regular expression, and the linear time is linearly proportional to the length of the string being searched through.
replies(1): >>43639996 #
3. dullcrisp ◴[] No.43639996[source]
Thanks! Though I imagine in most cases the regular expression itself would be a fixed part of the codebase and not given as input.