Parse, don't validate (2019)

(lexi-lambda.github.io)
398 points declanhaigh | 5 comments
bruce343434 ◴[] No.35053912[source]
Note that this basically requires your language to have ergonomic support for sum types, immutable "data classes", and pattern matching.

The point is to parse the input into a structure which always upholds the predicates you care about so you don't end up continuously defensively programming in ifs and asserts.
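To make that concrete, here is a minimal sketch in Rust of the non-empty list idea. The `NonEmpty` type and its `parse`, `first`, and `len` functions are names made up for illustration, not taken from the article or from any library.

```rust
/// A list that holds at least one element by construction:
/// there is always a `head`, and `tail` may be empty.
#[derive(Debug, Clone, PartialEq)]
pub struct NonEmpty<T> {
    head: T,
    tail: Vec<T>,
}

impl<T> NonEmpty<T> {
    /// Parse step: the only place that has to deal with the empty case.
    pub fn parse(items: Vec<T>) -> Option<NonEmpty<T>> {
        let mut iter = items.into_iter();
        let head = iter.next()?; // returns None for an empty input
        Some(NonEmpty { head, tail: iter.collect() })
    }

    /// Total function: no Option and no assert, because the type
    /// already rules out emptiness.
    pub fn first(&self) -> &T {
        &self.head
    }

    /// Number of elements; always at least 1.
    pub fn len(&self) -> usize {
        1 + self.tail.len()
    }
}
```

Code downstream of `parse` takes a `NonEmpty<T>` and never has to re-check for an empty list.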

replies(12): >>35054046 #>>35054070 #>>35054386 #>>35054514 #>>35054901 #>>35054993 #>>35055124 #>>35055230 #>>35056047 #>>35057866 #>>35058185 #>>35059271 #
crabbone ◴[] No.35054514[source]
It's not just about these limitations.

In order to be useful, type systems need to be simple, but there are no such restrictions on the rules that govern our expectations of data correctness.

OP is delusional if they think that their approach can be made practical. I mean, what if the expectation is that a value is a prime number? -- How are they going to encode this in their type systems? And this is just a trivial example.

There are plenty of useful constraints we routinely expect in message exchanges that aren't possible to implement using even very elaborate type systems. For example, if we want to ensure that all ids in XML nodes are unique. Or that the last digit of SSN is a checksum of the previous digits using some complex formula. I mean, every Web developer worth their salt knows that regular expressions are a bad idea for testing email addresses (which would be an example of parsing), and it's really preferable to validate emails by calling a number of predicates on them.

And, of course, these aren't the only examples: password validation (the annoying part that asks for capital letter, digit, special character? -- I want to see the author implement a parser to parse possible inputs to password field, while also giving helpful error messages such as "you forgot to use a digit"). Even though I don't doubt it's possible to do that, the resulting code would be an abomination compared to the code that does the usual stuff, i.e. just checks if a character is in a set of characters.

replies(10): >>35054557 #>>35054562 #>>35054640 #>>35054916 #>>35054920 #>>35055046 #>>35055734 #>>35055902 #>>35056302 #>>35057473 #
ollysb ◴[] No.35054562[source]
You can use opaque types to encode constraints that the type system isn't able to express. That way you can have factory functions that apply any logic that's required before allowing construction of the opaque type. Now whenever that opaque type is referred to there's a guarantee that the data it contains satisfies your desired constraint.
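A rough sketch of what that can look like in Rust, using the prime number example from upthread. The module `prime`, the type `PrimeNumber`, and the functions `new`, `get`, and `is_prime` are all hypothetical names chosen for this sketch; the primality test is plain trial division, which is fine for illustration but slow for large inputs.

```rust
mod prime {
    /// Opaque wrapper: the field is private to this module, so code
    /// outside the module cannot write `PrimeNumber(4)` directly.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub struct PrimeNumber(u64);

    impl PrimeNumber {
        /// Factory function: the only way for outside code to get a PrimeNumber.
        pub fn new(n: u64) -> Option<PrimeNumber> {
            if is_prime(n) {
                Some(PrimeNumber(n))
            } else {
                None
            }
        }

        /// Read-only access to the wrapped value.
        pub fn get(self) -> u64 {
            self.0
        }
    }

    /// Trial division up to the square root of n.
    fn is_prime(n: u64) -> bool {
        if n < 2 {
            return false;
        }
        let mut d = 2;
        // `d <= n / d` is the overflow-safe form of `d * d <= n`.
        while d <= n / d {
            if n % d == 0 {
                return false;
            }
            d += 1;
        }
        true
    }
}
```

The type system knows nothing about primality here. The guarantee comes from the module boundary: every `PrimeNumber` that exists outside `prime` went through `new`.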
replies(1): >>35056632 #
1. crabbone ◴[] No.35056632[source]
> You can use opaque types to encode constraints that the type system isn't able to express.

You just admitted in this sentence that the use of opaque types achieves nothing of value. Which was my point all along: why use them if they are useless? Just to feel smart because I pulled out an academia-flavored ninety-pound dictionary word to describe it?

replies(1): >>35058732 #
2. chowells ◴[] No.35058732[source]
Opaque types absolutely provide something of value. They're different types. You can't pass an Integer to a function that requires a PrimeNumber. It's a compile error.
replies(1): >>35060355 #
3. crabbone ◴[] No.35060355[source]
Not in this context they don't. They are useless if you want to ensure that a given number is a prime number.
replies(2): >>35063518 #>>35063646 #
4. ParetoOptimal ◴[] No.35063518{3}[source]
> Not in this context they don't.

What context is it exactly where they don't matter?

I can tell you in practice, in the real world, they very much do.

> They are useless if you want to ensure that a given number is a prime number.

It's not useless. The point is that once you have a type `PrimeNumber` that can only be constructed after being validated, you can then write functions that exist in a reality where only PrimeNumber exists.

5. ParetoOptimal ◴[] No.35063646{3}[source]
I wrote an example with prime numbers that you can run in the Haskell playground:

https://play.haskell.org/saved/gRsNcCGo

> They are useless if you want to ensure that a given number is a prime number.

This is wrong. In the example above `addPrimes` will only take prime numbers.

As such if I make a Jira story that says "add multiply/subtract functions using the PrimeNumber type" I'll know that implementation is simplified by only being able to concern itself with prime numbers.
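For readers who don't want to open the playground, here is a rough Rust analogue of the same idea. This is not the linked Haskell code: the module `primes` and the functions `parse`, `add_primes`, and `multiply_primes` are illustrative names, and the arithmetic ignores overflow for very large values.

```rust
mod primes {
    /// Private field: outside this module, only `parse` can build one.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub struct PrimeNumber(u64);

    impl PrimeNumber {
        /// Validates once, at the boundary, using simple trial division.
        pub fn parse(n: u64) -> Option<PrimeNumber> {
            if n < 2 {
                return None;
            }
            let mut d = 2;
            while d <= n / d {
                if n % d == 0 {
                    return None;
                }
                d += 1;
            }
            Some(PrimeNumber(n))
        }
    }

    /// Accepts only primes. The result is a plain integer because the
    /// sum of two primes is generally not prime.
    pub fn add_primes(a: PrimeNumber, b: PrimeNumber) -> u64 {
        a.0 + b.0
    }

    /// Same contract for multiplication; no primality checks needed inside.
    pub fn multiply_primes(a: PrimeNumber, b: PrimeNumber) -> u64 {
        a.0 * b.0
    }
}
```

Calling `primes::add_primes(4, 6)` with bare integers is rejected by the compiler as a type mismatch, so neither function needs to re-validate its arguments.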