←back to thread

Parser Combinators Beat Regexes

(entropicthoughts.com)
120 points mooreds | 5 comments | | HN request time: 0.208s | source
Show context
yen223 ◴[] No.43639551[source]
Parser combinators are one of those ideas that is really powerful in practice, but will never be mainstream because it had the bad fortune of coming from the functional programming world. It's a shame too, because parser combinators are what made parsers click for me.

Take the hard problem of parsing a language, break it down into the much simpler problem of reading tokens, then write simple "combinators" (and, or, one_or_many, etc etc) to combine these simple parsers into larger more complex parsers.

You could almost build an entire parser combinator library from scratch from memory, because none of the individual components are complicated.

replies(13): >>43639775 #>>43639805 #>>43639834 #>>43640597 #>>43641009 #>>43641205 #>>43641459 #>>43641675 #>>43642100 #>>43642148 #>>43643853 #>>43644151 #>>43650405 #
egonschiele ◴[] No.43639805[source]
Hey! I love parser combinators and wish they were more mainstream. That's why I wrote Tarsec, which is Parsec for TypeScript: https://github.com/egonSchiele/tarsec

I have a short and a long intro, both illustrated: https://github.com/egonSchiele/tarsec/blob/main/tutorials/5-... and https://github.com/egonSchiele/tarsec/blob/main/tutorials/th...

It also has a neat debug mode, and a way to benchmark performance. I've loved using it, hopefully someone reading will use it too! Use this as a chance to learn something cool (parser combinators)!

replies(2): >>43639911 #>>43640797 #
1. furyofantares ◴[] No.43640797[source]
These are great.

I accidentally invented parser combinators for a project once. I was quite fond of the approach but it ended up being pretty nearly unusable for anyone else on the team. This was problematic because the project was such that we were always adding new things to be parsed. The whole reason I took the approach was that I wanted people to be able to freely add parsers as new requirements came in.

It being my first shot at it, it was for sure not as usable as it could be. But looking at your stuff, which is delightful, I am filled with doubt that my goal was even possible. Your stuff looks about as clean as possible, and yet seems likely to be almost as impenetrable to someone not versed in parsing.

That said, I am gonna look for a chance to use this in appropriate side projects.

replies(2): >>43641927 #>>43645093 #
2. miningape ◴[] No.43641927[source]
Yeah parsers can be tricky "culturally" - as you've found there are many devs who haven't written a recursive descent parser and therefore they lack any meaningful context to understand what's happening under the hood.

It's a similar scenario to if you made your own RegEx and none of the devs on your team know's what it is. It just takes too long to teach them that on the job since there's very few quality resources you can point them towards - so you're stuck teaching everything.

On the other hand, if you have a team full of experienced PL developers doing this could significantly improve their productivity.

3. egonschiele ◴[] No.43645093[source]
The idea of functions returning functions makes a lot of people's heads spin! I've struggled to get people using this library too, but I use it in personal work and it is an absolute joy. One of my favorite things I've made.
replies(1): >>43645573 #
4. TuringTest ◴[] No.43645573[source]
You can typically trick them into thinking that you're returning an object with a single method ;-)
replies(1): >>43660500 #
5. yen223 ◴[] No.43660500{3}[source]
Getting Java flashbacks from this