
203 points dahlia | 1 comments
bsoles No.45153983
>> // This is a parser

>> const port = option("--port", integer());

I don't understand. Why is this a parser? Isn't it just a way of enforcing a type in a language that doesn't have types?

I was expecting something like a state machine that takes the command line text and parses it to validate the syntax and values.

replies(1): >>45154702 #
1. hansvm No.45154702
The heavy lifting happens in the definitions of `option` and `integer`. Those will take in whatever arguments they take in and output some sort of `Stream -> Result<Tuple<T, Stream>>` function.

That might sound messy, but to the author's point about parser combinators not being complicated: they really don't take much time to get used to, and they're quite simple to build yourself if you wanted such a library. There's not much code (and certainly no magic) going on under the hood.
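To make the `Stream -> Result<Tuple<T, Stream>>` shape concrete, here is a minimal sketch of what `option` and `integer` could look like. It is not the library's actual code: the `Result` layout, the `ValueParser` helper type, and the choice of an argv-style string array as `Stream` are all assumptions made for illustration.

```typescript
// Hypothetical sketch; the real library's types and behavior may differ.
type Stream = readonly string[]; // assumed: the remaining argv tokens
type Result<T> =
  | { ok: true; value: T; rest: Stream }
  | { ok: false; error: string };
type Parser<T> = (input: Stream) => Result<T>;

// Assumed helper type: converts one raw token into a typed value.
type ValueParser<T> = (raw: string) => T | undefined;

// integer() accepts tokens like "8080" or "-3" and rejects everything else.
const integer = (): ValueParser<number> => (raw) =>
  /^-?\d+$/.test(raw) ? Number(raw) : undefined;

// option(name, vp) returns a parser that consumes `name <value>` from the
// front of the stream and hands back the typed value plus what is left.
const option = <T>(name: string, vp: ValueParser<T>): Parser<T> => (input) => {
  const [flag, raw, ...rest] = input;
  if (flag !== name) return { ok: false, error: `expected ${name}` };
  if (raw === undefined) return { ok: false, error: `${name} requires a value` };
  const value = vp(raw);
  return value === undefined
    ? { ok: false, error: `invalid value for ${name}: ${raw}` }
    : { ok: true, value, rest };
};

// The line from the blog post: `port` is itself a function from Stream to Result.
const port = option("--port", integer());
```

Under these assumptions, `port(["--port", "8080", "extra"])` succeeds with the number 8080 and leaves `["extra"]` in the stream, while `port(["--port", "abc"])` returns a failure instead of a value.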

The advantage of that parsing approach:

It's reasonably declarative. This seems like the author's core point. Parser-combinator code largely looks like just writing out the object you want as a parse result, using your favorite combinator library as the building blocks, and everything automagically works, with amazing type-checking if your language has such features.
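A rough sketch of that "write out the object you want" style, assuming an argv-style token stream. The `next`, `str`, `int`, and `object` names here are hypothetical and simplified (fields are parsed strictly in order); they are not taken from the library in the post.

```typescript
// Hypothetical sketch of a declarative `object` combinator.
type Stream = readonly string[];
type Result<T> =
  | { ok: true; value: T; rest: Stream }
  | { ok: false; error: string };
type Parser<T> = (input: Stream) => Result<T>;

// Consume the next token and convert it, or fail with a labeled error.
const next = <T>(label: string, convert: (raw: string) => T | undefined): Parser<T> =>
  (input) => {
    if (input.length === 0) return { ok: false, error: `missing ${label}` };
    const value = convert(input[0]);
    return value === undefined
      ? { ok: false, error: `bad ${label}: ${input[0]}` }
      : { ok: true, value, rest: input.slice(1) };
  };

const str = next<string>("string", (raw) => raw);
const int = next<number>("integer", (raw) =>
  /^-?\d+$/.test(raw) ? Number(raw) : undefined,
);

// object(): run each field's parser in declaration order, threading the
// leftover stream from one field to the next.
function object<T extends Record<string, unknown>>(
  fields: { [K in keyof T]: Parser<T[K]> },
): Parser<T> {
  return (input) => {
    const out: Record<string, unknown> = {};
    let rest = input;
    for (const key of Object.keys(fields)) {
      const parser = fields[key as keyof T] as Parser<unknown>;
      const r = parser(rest);
      if (!r.ok) return r;
      out[key] = r.value;
      rest = r.rest;
    }
    return { ok: true, value: out as T, rest };
  };
}

// The parser definition mirrors the shape of the result, and the compiler
// infers Parser<{ host: string; port: number }> from it.
const endpoint = object({ host: str, port: int });
```

With this sketch, `endpoint(["localhost", "8080"])` yields `{ host: "localhost", port: 8080 }`, and a type checker knows `port` is a number without any extra annotation.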

The disadvantages:

1. Like any parsing approach, you have to actually consider all the nuances of what you really want parsed (e.g., conditional rules around whitespace handling). It looks a little to me (just from the blog post, not having examined the inner workings yet) like this project side-stepped that by working with the `Stream` type as just the `argv` list, allowing you to say things like "parse the next blob as a string" without also having to encode whitespace and blob boundaries.

2. It's definitely slower (and more memory-intensive) than a hand-rolled parser, and usually also worse in that regard than other sorts of "auto-generated" parsing code.

For CLI arguments, especially if they picked argv as their base stream type, those disadvantages mostly don't exist. I could see it performing poorly for argv parsing for something like `cp`, though, which has both options and potentially ginormous lists of files (maybe not; maybe something like `git cp`, which has more potential parse failures from delimiters like `--`?). If you're not very careful in your argument specification, you might have exponential backtracking issues, and where that would be blatantly obvious in a hand-rolled parser, it'll probably get swept under the rug with parser combinators.
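A deliberately contrived sketch of how that backtracking can hide. It assumes a naive `or` that rewinds and retries on failure; every name here (`file`, `seq`, `or`, `terminator`, `careless`, `countRetries`) is hypothetical and none of it is the library's code. The grammar nests two alternatives that share the same `file` prefix, so a file list with no closing `--` gets re-parsed along every branch.

```typescript
// Hypothetical sketch; illustrates the failure mode, not any real library.
type Stream = readonly string[];
type Result<T> = { ok: true; value: T; rest: Stream } | { ok: false };
type Parser<T> = (input: Stream) => Result<T>;

let leafCalls = 0; // how many times the innermost parser gets retried

// Consume one non-flag token, e.g. a file name.
const file: Parser<string> = (input) =>
  input.length > 0 && !input[0].startsWith("-")
    ? { ok: true, value: input[0], rest: input.slice(1) }
    : { ok: false };

// Run `first`, then run `second` on whatever `first` left over.
const seq = <A, B>(first: Parser<A>, second: Parser<B>): Parser<[A, B]> =>
  (input) => {
    const a = first(input);
    if (!a.ok) return a;
    const b = second(a.rest);
    if (!b.ok) return b;
    return { ok: true, value: [a.value, b.value], rest: b.rest };
  };

// Try `left`; on failure, rewind to the same input and try `right`.
const or = <T>(left: Parser<T>, right: Parser<T>): Parser<T> => (input) => {
  const l = left(input);
  return l.ok ? l : right(input);
};

// Requires a literal "--" and counts every attempt.
const terminator: Parser<unknown> = (input) => {
  leafCalls++;
  return input[0] === "--"
    ? { ok: true, value: null, rest: input.slice(1) }
    : { ok: false };
};

// Careless spec: both alternatives start with `file`, nested `depth` times.
const careless = (depth: number): Parser<unknown> => {
  if (depth === 0) return terminator;
  const inner = careless(depth - 1);
  return or(seq(file, inner), seq(file, inner));
};

// Parse n file names with no trailing "--" and report the retry count.
function countRetries(n: number): number {
  leafCalls = 0;
  const files = Array.from({ length: n }, (_, i) => `file${i}`);
  careless(n)(files); // fails: the terminator never matches
  return leafCalls;
}
```

Here `countRetries(10)` returns 1024: each extra file doubles the work on a failing parse, and nothing in the one-line definition of `careless` makes that cost visible the way a hand-written loop would.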