Parse, don't validate (2019)

(lexi-lambda.github.io)

398 points declanhaigh | 1 comments | 07 Mar 23 08:47 UTC | HN request time: 0.207s | source

Show context

bruce343434 ◴[07 Mar 23 10:54 UTC] No.35053912[source]▶

Note that this basically requires your language to have ergonomic support for sum types, immutable "data classes", pattern matching.

The point is to parse the input into a structure which always upholds the predicates you care about so you don't end up continuously defensively programming in ifs and asserts.

replies(12): >>35054046 #>>35054070 #>>35054386 #>>35054514 #>>35054901 #>>35054993 #>>35055124 #>>35055230 #>>35056047 #>>35057866 #>>35058185 #>>35059271 #

crabbone ◴[07 Mar 23 12:18 UTC] No.35054514[source]▶

>>35053912 #

It's not just about these limitations.

In order to be useful, type systems need to be simple, but there's no such restrictions on rules that govern our expectations of data correctness.

OP is delusional if they think that their approach can be made practical. I mean, what if the expectation from the data that an value is a prime number? -- How are they going to encode this in their type systems? And this is just a trivial example.

There are plenty of useful constraints we routinely expect in message exchanges that aren't possible to implement using even very elaborate type systems. For example, if we want to ensure that all ids in XML nodes are unique. Or that the last digit of SSN is a checksum of the previous digits using some complex formula. I mean, every Web developer worth their salt knows that regular expressions are a bad idea for testing email addresses (which would be an example of parsing), and it's really preferable to validate emails by calling a number of predicates on them.

And, of course, these aren't the only examples: password validation (the annoying part that asks for capital letter, digit, special character? -- I want to see the author implement a parser to parse possible inputs to password field, while also giving helpful error messages s.a. "you forgot to use a digit"). Even though I don't doubt it's possible to do that, the resulting code would be an abomination compared to the code that does the usual stuff, i.e. just checks if a character is in a set of characters.

replies(10): >>35054557 #>>35054562 #>>35054640 #>>35054916 #>>35054920 #>>35055046 #>>35055734 #>>35055902 #>>35056302 #>>35057473 #

flupe ◴[07 Mar 23 12:26 UTC] No.35054557[source]▶

>>35054514 #

Both your examples (is my number prime, are my XML nodes unique) are easily expressed in a dependently-typed language.

Dependent type checkers may be hard to implement, but the typing rules are fairly simple, and people have been using this correct by construction philosophy using dependently-typed languages for a while now.

There's nothing delusional about that.

replies(1): >>35056718 #

1. crabbone ◴[07 Mar 23 15:51 UTC] No.35056718[source]▶

>>35054557 #

> dependently-typed language.

Now, are there really tools to make type systems with dependent types simple to prove? In reasonable time? How about the effort developers would have to put into just trying to express such types and into verifying that such an expression is indeed accomplishing its stated goals?

Just for a moment, imagine you filing a PR in a typical Web shop for the login form validation procedure, and sending a couple of screenfulls of Coq code or similar as a proof of your password validation procedure. How do you think your team will react to this?

Again, I didn't say it's impossible. Quite the opposite, I said that it is possible, but will be so bad in practice that nobody will want to use it, unless they are batshit crazy.

↑