Regex Isn't Hard (2023)

(timkellogg.me)
75 points asicsp | 1 comments
gwd ◴[] No.43750572[source]
So my brother doesn't code for a living, but has done a fair amount of personal coding, and also gotten into the habit of watching live-coding sessions on YouTube. Recently he's gotten involved in my project a bit, and so we've done some pair programming sessions, in part to get him up to speed on the codebase, in part to get him up to speed on more industrial-grade coding practices and workflows.

At some point we needed to do some parsing of some strings, and I suggested a simple regex. But apparently a bunch of the streamers he's been watching basically have this attitude that regexes stink, and you should use basically anything else. So we had a conversation, and compared the clarity of coding up the relatively simple regex I'd made, with how you'd have to do it procedurally; I think the regex was a clear winner.
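
To make that comparison concrete, here is a minimal sketch. The input format ("WIDTHxHEIGHT", e.g. "1920x1080") and the function names are hypothetical, picked only for illustration; they are not the strings or code from the project described above.

```python
import re

# Hypothetical format: two ASCII integers separated by a lowercase "x".
SIZE_RE = re.compile(r"([0-9]+)x([0-9]+)")

def parse_regex(line):
    # fullmatch requires the whole (stripped) string to fit the pattern.
    m = SIZE_RE.fullmatch(line.strip())
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))

def parse_procedural(line):
    # The same rules, spelled out by hand.
    parts = line.strip().split("x")
    if len(parts) != 2:
        return None
    for part in parts:
        # isascii() keeps this aligned with [0-9]; isdigit() alone
        # also accepts some non-ASCII digit characters.
        if not (part.isascii() and part.isdigit()):
            return None
    return int(parts[0]), int(parts[1])

print(parse_regex("1920x1080"))       # (1920, 1080)
print(parse_procedural("1920x1080"))  # (1920, 1080)
print(parse_regex("1920 x 1080"))     # None
```

Both versions accept and reject the same inputs here; which one reads more clearly is the judgment call being argued about.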

Obviously regexes aren't the right tool for every job, and they can certainly be done poorly; but in the right place at the right time they're the simplest, most robust, easiest to understand solution to the problem.

replies(1): >>43750627 #
kelafoja ◴[] No.43750627[source]
My problem is that regexes are write-only, unreadable once written (to me anyway). And sometimes they do more than you intended. You maybe tested on a few inputs and declared it fit for purpose, but there might be more inputs upon which it has unintended effects. I don't mind simple, straightforward regexes. But when they become more complex, I tend to prefer to write out the procedural code, even if it is (much) longer in terms of lines. I find that generally I can read code better than regexes, and that code I write is more predictable than regexes I write.
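
A small sketch of the "does more than you intended" failure mode, assuming a hypothetical task of recognizing strings like "1.2". The patterns and function names are invented for illustration. The loose version passes the obvious test and still accepts inputs nobody meant it to.

```python
import re

# Loose: the "." is unescaped (matches any character) and search()
# looks for the pattern anywhere in the string.
LOOSE_RE = re.compile(r"\d+.\d+")

# Stricter: literal dot, and fullmatch() must consume the whole string.
STRICT_RE = re.compile(r"\d+\.\d+")

def looks_like_version_loose(text):
    return LOOSE_RE.search(text) is not None

def looks_like_version_strict(text):
    return STRICT_RE.fullmatch(text) is not None

for sample in ["1.2", "1x2", "123", "see 10-20 below"]:
    print(sample, looks_like_version_loose(sample), looks_like_version_strict(sample))
# 1.2 True True
# 1x2 True False
# 123 True False
# see 10-20 below True False
```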
replies(6): >>43750642 #>>43750826 #>>43751127 #>>43751152 #>>43751569 #>>43751927 #
bazoom42 ◴[] No.43751127[source]
> I tend to prefer to write out the procedural code, even if it is (much) longer in terms of lines.

This might work for you, but in general the number of bugs is proportional to the amount of code. The regex engine is already thoroughly tested by someone else, while a custom implementation in procedural code will probably have bugs and be a lot more work to maintain if the pattern changes.

replies(3): >>43751445 #>>43753974 #>>43765539 #
kelafoja ◴[] No.43765539[source]
That is quite a generalization. The regex engine is tested, but my specific regular expression isn't. My ability to write correct regular expressions is weak, so there can be many bugs in the one line of regular expression.
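
One way to test the specific expression rather than the engine is a table of inputs with expected outcomes. This is only a sketch: the date-shaped pattern, the cases, and the helper name are all hypothetical.

```python
import re

# Hypothetical pattern: checks the YYYY-MM-DD shape only, not whether
# the date exists on a calendar ("2023-99-99" would still match).
DATE_RE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")

# (input, should the whole string match?)
CASES = [
    ("2023-04-01", True),
    ("2023-4-1", False),
    ("12023-04-01", False),
    ("2023-04-01 ", False),
    ("2023/04/01", False),
    ("", False),
]

def failing_cases(pattern, cases):
    # Return every case where the actual result differs from the expectation.
    return [
        (text, expected)
        for text, expected in cases
        if (pattern.fullmatch(text) is not None) != expected
    ]

print(failing_cases(DATE_RE, CASES))  # []
```

The table documents what the one-liner is supposed to accept and reject, and it is cheap to extend whenever a surprising input turns up.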
replies(2): >>43822087 #>>43822524 #
1. ◴[] No.43822087[source]