(entropicthoughts.com)

122 points mooreds | 3 comments | 09 Apr 25 21:53 UTC | HN request time: 0.824s | source

Show context

o11c ◴[10 Apr 25 00:17 UTC] No.43639374[source]▶

Note that most implementations of both parser combinators and regexes can fail very badly (exponential time). Never use either on untrusted input, unless you can prove your implementation lets you stay linear.

replies(4): >>43639552 #>>43639845 #>>43640240 #>>43641298 #

thaumasiotes ◴[10 Apr 25 01:46 UTC] No.43639845[source]▶

>>43639374 #

Only PCREs are exponential time, in service of a feature you basically never need. Regexes are always linear time.

They can take exponential space, though, so I'm not sure why knowing you'll be able to process the data in linear time is supposed to keep you safe.

replies(4): >>43639936 #>>43639940 #>>43640180 #>>43640937 #

1. dullcrisp ◴[10 Apr 25 02:02 UTC] No.43639936[source]▶

>>43639845 #

I don’t believe space > time is possible on a Turing machine?

replies(1): >>43639958 #

2. pjscott ◴[10 Apr 25 02:05 UTC] No.43639958[source]▶

>>43639936 (TP) #

You’re right, of course, but there was a minor miscommunication: the exponential space is exponentially proportional to the size of the regular expression, and the linear time is linearly proportional to the length of the string being searched through.

replies(1): >>43639996 #

3. dullcrisp ◴[10 Apr 25 02:13 UTC] No.43639996[source]▶

>>43639958 #

Thanks! Though I imagine in most cases the regular expression itself would be a fixed part of the codebase and not given as input.

↑

Parser Combinators Beat Regexes