
Parse, don't validate (2019)

(lexi-lambda.github.io)
398 points declanhaigh | 1 comments
Octokiddie ◴[] No.35055969[source]
I like how the author boils the idea down to a comparison between two approaches to a simple task: getting the first element of a list. The two alternatives are parseNonEmpty and validateNonEmpty. From the article:

> The difference lies entirely in the return type: validateNonEmpty always returns (), the type that contains no information, but parseNonEmpty returns NonEmpty a, a refinement of the input type that preserves the knowledge gained in the type system. Both of these functions check the same thing, but parseNonEmpty gives the caller access to the information it learned, while validateNonEmpty just throws it away.

This might not seem like much of a distinction, but it has far-reaching implications downstream:

> These two functions elegantly illustrate two different perspectives on the role of a static type system: validateNonEmpty obeys the typechecker well enough, but only parseNonEmpty takes full advantage of it. If you see why parseNonEmpty is preferable, you understand what I mean by the mantra “parse, don’t validate.”

parseNonEmpty is better because after a caller gets a NonEmpty it never has to check the boundary condition of empty again. The first element will always be available, and this is enforced by the compiler. Not only that, but functions the caller later calls never need to worry about the boundary condition, either.

The entire concern over the first element of an empty list (and handling the runtime errors that result from failure to meet the boundary condition) disappears as a developer concern.
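
To make the contrast concrete, here is a minimal Haskell sketch of the two functions. It is a simplification, not the article's exact code: failure is reported with Either String rather than an exception, and firstElement is a hypothetical downstream function added for illustration.

```haskell
import Data.List.NonEmpty (NonEmpty (..))
import qualified Data.List.NonEmpty as NE

-- Validation: checks the list, then throws away what it learned.
-- (Assumption: Either String stands in for the error mechanism.)
validateNonEmpty :: [a] -> Either String ()
validateNonEmpty [] = Left "list cannot be empty"
validateNonEmpty _  = Right ()

-- Parsing: performs the same check, but the result type records
-- the fact that the list has at least one element.
parseNonEmpty :: [a] -> Either String (NonEmpty a)
parseNonEmpty []       = Left "list cannot be empty"
parseNonEmpty (x : xs) = Right (x :| xs)

-- Hypothetical downstream function: because it takes a NonEmpty,
-- it needs no empty-list case and cannot fail at runtime.
firstElement :: NonEmpty a -> a
firstElement = NE.head
```

A caller of validateNonEmpty still holds a plain list afterwards, so nothing stops later code from calling a partial head on it. A caller of parseNonEmpty holds a NonEmpty, and firstElement is total on that type.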

replies(3): >>35056624 #>>35056955 #>>35058253 #
waynesonfire ◴[] No.35056624[source]
Does this scale? What if you have 10 other conditions?
replies(6): >>35056897 #>>35056901 #>>35057510 #>>35057732 #>>35059327 #>>35061385 #
1. cdaringe ◴[] No.35061385[source]
Somebody somewhere in the system will be checking. Checking at the boundary likely scales best, because you don't burn cycles on checks that would otherwise be distributed, and likely redundant, throughout the system. If you have a massive model, it's feasible that it makes sense to defer partial/sub-parsing?
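
As a rough sketch of how several conditions can be checked once at the boundary, here is a small Haskell example. Every name in it (RawOrder, Order, parseOrder and the field parsers) is hypothetical and made up for illustration; the three conditions stand in for however many a real model has.

```haskell
import Data.List.NonEmpty (NonEmpty (..))

-- Hypothetical unchecked input as it arrives at the system boundary.
data RawOrder = RawOrder
  { rawCustomer :: String
  , rawQuantity :: Int
  , rawItems    :: [String]
  }

-- Refined types. In real code the constructors would be hidden behind
-- a module export list so they can only be built by the parsers below.
newtype CustomerName = CustomerName String deriving (Eq, Show)
newtype Quantity = Quantity Int deriving (Eq, Show)

data Order = Order
  { customer :: CustomerName
  , quantity :: Quantity
  , items    :: NonEmpty String
  } deriving (Eq, Show)

-- One small parser per condition.
parseCustomer :: String -> Either String CustomerName
parseCustomer "" = Left "customer name is empty"
parseCustomer s  = Right (CustomerName s)

parseQuantity :: Int -> Either String Quantity
parseQuantity n
  | n > 0     = Right (Quantity n)
  | otherwise = Left "quantity must be positive"

parseItems :: [String] -> Either String (NonEmpty String)
parseItems []       = Left "order has no items"
parseItems (x : xs) = Right (x :| xs)

-- All conditions are checked here, once. The Either applicative
-- stops at the first failing field and returns its error.
parseOrder :: RawOrder -> Either String Order
parseOrder raw =
  Order
    <$> parseCustomer (rawCustomer raw)
    <*> parseQuantity (rawQuantity raw)
    <*> parseItems (rawItems raw)
```

Adding a tenth condition means adding one more field parser and one more line in parseOrder. Code that receives an Order does not repeat any of the checks.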

Parser combinators seem to be pretty rippin' fast for the most part, at least those I've used in OCaml and Rust.