
Parse, don't validate (2019)

(lexi-lambda.github.io)
398 points by declanhaigh | 2 comments
Octokiddie ◴[] No.35055969[source]
I like how the author boils the idea down into a simple comparison between two alternative approaches to a simple task: getting the first element of a list. Two alternatives are presented: parseNonEmpty and validateNonEmpty. From the article:

> The difference lies entirely in the return type: validateNonEmpty always returns (), the type that contains no information, but parseNonEmpty returns NonEmpty a, a refinement of the input type that preserves the knowledge gained in the type system. Both of these functions check the same thing, but parseNonEmpty gives the caller access to the information it learned, while validateNonEmpty just throws it away.
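
For concreteness, the two functions look roughly like this (a sketch from memory; the article's exact code may differ slightly):

```haskell
import Control.Exception (throwIO)
import Data.List.NonEmpty (NonEmpty (..))

-- Checks the invariant, then throws the evidence away: the caller
-- learns nothing new from the () return type.
validateNonEmpty :: [a] -> IO ()
validateNonEmpty (_:_) = pure ()
validateNonEmpty []    = throwIO (userError "list cannot be empty")

-- Checks the same invariant but records it in the return type:
-- the caller now holds a NonEmpty a, which can never be empty.
parseNonEmpty :: [a] -> IO (NonEmpty a)
parseNonEmpty (x:xs) = pure (x :| xs)
parseNonEmpty []     = throwIO (userError "list cannot be empty")
```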

This might not seem like much of a distinction, but it has far-reaching implications downstream:

> These two functions elegantly illustrate two different perspectives on the role of a static type system: validateNonEmpty obeys the typechecker well enough, but only parseNonEmpty takes full advantage of it. If you see why parseNonEmpty is preferable, you understand what I mean by the mantra “parse, don’t validate.”

parseNonEmpty is better because once a caller has a NonEmpty, it never has to check the empty boundary condition again. The first element will always be available, and this is enforced by the compiler. Not only that, but functions the caller later calls never need to worry about the boundary condition either.

The whole problem of taking the first element of an empty list (and handling the runtime errors that result when that boundary condition isn't met) simply disappears as a developer concern.
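
A small sketch of what that buys downstream (the function name here is made up for illustration): anything that takes NonEmpty a can use total operations without ever considering an empty case.

```haskell
import Data.List.NonEmpty (NonEmpty)
import qualified Data.List.NonEmpty as NE

-- Hypothetical consumer: because the argument is already NonEmpty,
-- NE.head and NE.last are total and no empty-list error can occur here.
firstAndLast :: NonEmpty a -> (a, a)
firstAndLast xs = (NE.head xs, NE.last xs)
```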

replies(3): >>35056624 #>>35056955 #>>35058253 #
waynesonfire ◴[] No.35056624[source]
Does this scale? What if you have 10 other conditions?
replies(6): >>35056897 #>>35056901 #>>35057510 #>>35057732 #>>35059327 #>>35061385 #
dangets ◴[] No.35056901[source]
I would think stacking 10 generic conditions wouldn't scale if you are trying to mix and match them arbitrarily. If you are trying to mix NonEmpty and AllEven and AllGreaterThan100 for the List example, you would get a combinatorial explosion of types.
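
A rough illustration of that explosion with plain newtype wrappers (all the names here are invented): every combination of invariants you want the types to remember needs its own wrapper and its own parser.

```haskell
import Data.List.NonEmpty (NonEmpty, nonEmpty)

-- One wrapper per combination of invariants you want tracked:
newtype AllEven              = AllEven [Int]
newtype AllGreaterThan100    = AllGreaterThan100 [Int]
newtype NonEmptyAllEven      = NonEmptyAllEven (NonEmpty Int)
newtype NonEmptyAllEvenGT100 = NonEmptyAllEvenGT100 (NonEmpty Int)

parseNonEmptyAllEven :: [Int] -> Maybe NonEmptyAllEven
parseNonEmptyAllEven xs
  | all even xs = NonEmptyAllEven <$> nonEmpty xs
  | otherwise   = Nothing
-- ...and so on for every other subset of conditions, which is where
-- it stops scaling.
```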

In practice, I find it is usually something like an 'UnvalidatedCustomerInfo' being parsed into a 'CustomerInfo' object, where you can validate all of the fields at the same time (phone number, age, non-null name, etc.). Once you have parsed it into the internal 'CustomerInfo' type, you don't need to keep re-validating the expected invariants everywhere that type is used. There was a good video on this that I wish I could find again, where the presenter gave the example of using String for a telephoneNumber field instead of a dedicated type: any code that used the telephoneNumber field had to re-parse and validate it "to be safe", because nothing guaranteed it was in the expected format.
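
A minimal sketch of that shape (all the type and field names below are invented for illustration): parse the untrusted record once at the boundary, and everything downstream takes the trusted type.

```haskell
import Data.Char (isDigit)
import Text.Read (readMaybe)

-- Untrusted data, exactly as it arrived from the outside world.
data UnvalidatedCustomerInfo = UnvalidatedCustomerInfo
  { rawName  :: String
  , rawPhone :: String
  , rawAge   :: String
  }

-- Trusted internal types: the parsers below are the only place
-- the invariants are checked.
newtype PhoneNumber = PhoneNumber String
newtype Age         = Age Int

data CustomerInfo = CustomerInfo
  { name  :: String
  , phone :: PhoneNumber
  , age   :: Age
  }

parsePhone :: String -> Maybe PhoneNumber
parsePhone s
  | not (null s) && all isDigit s = Just (PhoneNumber s)
  | otherwise                     = Nothing

parseAge :: String -> Maybe Age
parseAge s = do
  n <- readMaybe s
  if n >= 0 && n < 150 then Just (Age n) else Nothing

-- Parse once; code that receives a CustomerInfo never re-validates
-- the phone number, age, or name.
parseCustomerInfo :: UnvalidatedCustomerInfo -> Maybe CustomerInfo
parseCustomerInfo u = do
  p <- parsePhone (rawPhone u)
  a <- parseAge (rawAge u)
  if null (rawName u) then Nothing else Just (CustomerInfo (rawName u) p a)
```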

The topic of having untrusted external types and trusted internal types is also explained in the book `Domain Modeling Made Functional`, which I highly recommend.

replies(4): >>35057468 #>>35058092 #>>35059831 #>>35063117 #
1. hansvm ◴[] No.35057468{3}[source]
In Zig at least, the types-as-values framework means it'd be pretty easy to support arbitrary mixing and matching. If nobody beats me to it, I'll make a library to remove the boilerplate for that idea this weekend.
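
For a rough sense of the same mix-and-match idea in Haskell terms (a sketch in the spirit of the existing `refined` library, not the Zig library linked in the follow-up): predicates become composable type-level tags, so n conditions need n definitions instead of one wrapper per combination.

```haskell
{-# LANGUAGE ScopedTypeVariables, MultiParamTypeClasses, FlexibleInstances #-}

import Data.Proxy (Proxy (..))

-- A Refined p a carries the promise that predicate p held when it was built.
newtype Refined p a = Refined a

class Predicate p a where
  holds :: Proxy p -> a -> Bool

-- Predicates are type-level tags; And composes them, so conditions
-- combine without a dedicated wrapper type per combination.
data NonEmptyP
data AllEvenP
data And p q

instance Predicate NonEmptyP [Int] where
  holds _ = not . null

instance Predicate AllEvenP [Int] where
  holds _ = all even

instance (Predicate p a, Predicate q a) => Predicate (And p q) a where
  holds _ x = holds (Proxy :: Proxy p) x && holds (Proxy :: Proxy q) x

refine :: forall p a. Predicate p a => a -> Maybe (Refined p a)
refine x
  | holds (Proxy :: Proxy p) x = Just (Refined x)
  | otherwise                  = Nothing

-- e.g. refine [2, 4, 6] :: Maybe (Refined (And NonEmptyP AllEvenP) [Int])
```
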
replies(1): >>35105217 #
2. hansvm ◴[] No.35105217[source]
https://github.com/hmusgrave/pdv