
Regex Isn't Hard (2023)

(timkellogg.me)
75 points asicsp | 3 comments
latexr ◴[] No.43750808[source]
I’m a fan of regular expressions, though I understand why many people wince at the sight. You should avoid showing them to a non-programmer who is interested in learning to code, because they’ll immediately fear programming is intractable.

As much as I like regex, I wouldn’t recommend this post. One reason is that the code style is too close to regular text:

> a matches a single character, always lowercase a.

That sentence uses “a” three times, twice as code and once as an indefinite article, but that’s not immediately obvious to the eye. VoiceOver completely fumbles it, especially considering the sentence immediately after.

A more important reason against recommending the article is that I find a bunch of the arguments to be unhelpful. If you’re trying to convince people to give regular expressions a chance, telling them to ignore `.` and use `[^%]` is going to bite them. That idiom is not common (which matters when trying to learn more from other sources), and even an experienced regexer must do a double take to figure out “is there a reason this specific character must not be matched?” Furthermore, no new learner is going to remember that four-character incantation, and neither are they going to understand what’s happening when their code doesn’t work because there was a `%` in their text. People need to learn about `.` (possibly the most common character in regex), if only because they also need to learn to escape it rather than ignore it when there is a literal period in the text. Don’t tell people to ignore repetition ranges either: they aren’t difficult to reason about and are certainly simpler to read than the same blob of intractable text repeated multiple times.
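
To make those three points concrete, here is a minimal sketch using Python's `re` module. The sample strings and variable names are made up for illustration and do not come from the article; the language choice is also an assumption, since the comment names none.

```python
import re

# Hypothetical sample text that happens to contain a `%`.
text = "progress: 100% done, see v1.2 notes"

# `.` matches any character except a newline, so `.*` can run past the `%`.
dot_match = re.search(r"progress: .*done", text)

# `[^%]` matches any character except `%`, so the same pattern fails here.
no_percent_match = re.search(r"progress: [^%]*done", text)

# An unescaped `.` also accepts characters other than a literal period.
loose = re.fullmatch(r"v1.2", "v1x2")
strict = re.fullmatch(r"v1\.2", "v1x2")

# A repetition range versus the same token written out by hand.
ranged = re.fullmatch(r"\d{2,4}", "2023")
spelled_out = re.fullmatch(r"\d\d(\d(\d)?)?", "2023")

print(dot_match is not None)         # the `.` version matches
print(no_percent_match is not None)  # the `[^%]` version does not
print(loose is not None, strict is not None)
print(ranged is not None, spelled_out is not None)
```

The `[^%]` version fails silently (it returns `None`), which is the kind of surprise a new learner would struggle to diagnose.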

replies(1): >>43751104 #
1. LaputanMachine ◴[] No.43751104[source]
I've also seen people use `[\s\S]` to match all characters when they couldn't use `.`.
replies(2): >>43752536 #>>43752606 #
2. tomsmeding ◴[] No.43752536[source]
This is a common approach when the regex needs to match any character including newlines; `.` often doesn't.
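
A small sketch of the difference, assuming Python's `re` module (the thread does not name a language, and the sample string is invented):

```python
import re

# Hypothetical two-line input.
text = "first line\nsecond line"

# By default `.` stops at the newline, so this finds only the first line.
dot_only = re.search(r"first.*", text).group()

# `[\s\S]` means "whitespace or non-whitespace", which covers every character,
# newlines included.
any_char = re.search(r"first[\s\S]*", text).group()

print(repr(dot_only))
print(repr(any_char))
```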
3. dimava ◴[] No.43752606[source]
I generally use `[^]`

Also, you can use `.` with the dotAll flag (`/s`).
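
For comparison, a rough Python equivalent, offered as an assumption since dotAll and `/s` are JavaScript terms: Python's `re` module spells the same option `re.DOTALL` (or inline `(?s)`), and it rejects the empty negated class `[^]` rather than treating it as "any character". The sample string is invented.

```python
import re

# Hypothetical input with a newline inside the tag.
text = "<b>bold\ntext</b>"

# Without the flag, `.` cannot cross the newline, so there is no match.
default = re.search(r"<b>(.*)</b>", text)

# With DOTALL, `.` matches newlines too.
flagged = re.search(r"<b>(.*)</b>", text, flags=re.DOTALL)

# The same option written inline at the start of the pattern.
inline = re.search(r"(?s)<b>(.*)</b>", text)

# `[^]` is not portable: Python's `re` raises an error when compiling it.
try:
    re.compile(r"[^]")
    empty_negated_class_ok = True
except re.error:
    empty_negated_class_ok = False

print(default is None)
print(repr(flagged.group(1)))
print(empty_negated_class_ok)
```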