
698 points jgrahamc | 1 comments
hn_throwaway_99 No.20425667
One thing that was interesting to me:

The outage was caused by a regex that ended up doing a lot of backtracking, which left PCRE, the regex engine, essentially stuck evaluating a runaway expression.

This reminded me of an HN post from a couple of months back by the author of Google Code Search, about how it worked: https://swtch.com/~rsc/regexp/regexp4.html . Interestingly, he wrote his own regex engine, RE2, specifically because PCRE and others did not use real automata and he needed a way to do arbitrary regex search safely.

replies(4): >>20425803 #>>20426118 #>>20426331 #>>20426651 #
1. emmelaich No.20426331
I think it's not uncommon. I've seen it in two places recently.

1. A test job in a CI/CD pipeline suddenly taking a very long time and lots of CPU

2. A data cleansing / checking job in a Java webapp occasionally turning the machine to molasses.

In both cases the regex had been around for a while; what changed was the data, e.g. lots of trailing whitespace.
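
To make the "same regex, different data" point concrete, here is a rough sketch in Python, whose `re` module is a backtracking engine. The patterns and the helper names (`RISKY`, `SAFE`, `make_input`, `time_match`) are made up for illustration; they are not the regexes from either incident. The idea is that a nested quantifier over whitespace is harmless on clean input but can blow up when a long run of spaces is followed by something that makes the match fail.

```python
import re
import time

# Hypothetical validator: a word followed by optional trailing whitespace.
# The nested quantifier (\s+)* lets a backtracking engine split a run of
# spaces in exponentially many ways when the overall match fails.
RISKY = re.compile(r"^\w+(\s+)*$")

# Accepts the same strings, but without the nested quantifier there is
# only one way to consume each space.
SAFE = re.compile(r"^\w+\s*$")


def make_input(n_spaces: int, ok: bool = True) -> str:
    """Build 'word' plus n spaces; with ok=False, append '!' to force a failed match."""
    return "word" + " " * n_spaces + ("" if ok else "!")


def time_match(pattern: re.Pattern, text: str) -> float:
    """Return the seconds taken by a single pattern.match call."""
    start = time.perf_counter()
    pattern.match(text)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Keep n small: the risky pattern's cost grows quickly with each step.
    for n in (8, 12, 16, 20):
        bad = make_input(n, ok=False)
        print(n, f"risky={time_match(RISKY, bad):.4f}s", f"safe={time_match(SAFE, bad):.6f}s")
```

On well-formed input both patterns return quickly, which is probably why this kind of thing can sit in a codebase for years. Exact timings will vary by machine and Python version, so treat the printed numbers as indicative only.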