
Parse, don't validate (2019)

(lexi-lambda.github.io)
398 points by declanhaigh | 4 comments
jameshart ◴[] No.35055031[source]
This post absolutely captures an essential truth of good programming.

Unfortunately, it conceals that truth behind examples that - while they do a good job of illustrating how generally applicable the idea is - don’t show as well how to apply it in your own code.

Most developers are not writing their own head method on the list primitive - they are trying to build a type that encapsulates a meaningful entity in their own domain. And, let’s be honest, most developers are also not using Haskell.

As a result, I have not found this article a good one to share with junior developers to help them understand how to design types that capture the notion of validity, and how to replace validation with narrowing type conversions (which amount to ‘parsing’ when the original type is something very loose like a string, a JSON blob, or a dictionary).

Even though those practices absolutely follow from what is described here.
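To make the kind of narrowing I mean concrete, here is a rough sketch (hypothetical domain names, nothing from the article) of turning a loose blob into a domain type instead of merely checking it:

    type Customer = { id: number; email: string }

    // Narrowing conversion: take an untyped blob and either produce a
    // Customer or say why we couldn't. There is no separate "isValid" step.
    function parseCustomer(raw: unknown): Customer | Error {
      if (typeof raw !== "object" || raw === null) return Error("not an object")
      const obj = raw as Record<string, unknown>
      if (typeof obj.id !== "number") return Error("id must be a number")
      if (typeof obj.email !== "string" || !obj.email.includes("@"))
        return Error("email must be a string containing '@'")
      return { id: obj.id, email: obj.email }
    }

Once parseCustomer has succeeded, the rest of the program can demand a Customer and never re-inspect the raw blob.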

Does anyone know of a good resource that better anchors these concepts in practical examples?

replies(3): >>35056114 #>>35058281 #>>35059886 #
epolanski ◴[] No.35058281[source]
> And, let’s be honest, most developers are also not using Haskell.

Everything in that post applies to the most common programming language out there: TypeScript.

And to several other popular languages such as Rust, Kotlin or Scala.

replies(4): >>35058565 #>>35058974 #>>35059010 #>>35059347 #
1. jakear ◴[] No.35059010[source]
Not quite: TypeScript provides options beyond what the author of this article details that are, IMO, superior in at least some cases. Instead of just "throw an error or return ()" or "throw an error or return NonEmpty<T>", you can declare a function's return type as "throws iff the argument isn't NonEmpty" or "true iff the argument is NonEmpty".

Compare:

    // Validate: throws on failure; the type system learns nothing on success.
    function validateNonEmpty<T>(list: T[]): void {
      if (list[0] === undefined)
        throw Error("list cannot be empty")
    }

    // Parse: throws on failure; returns the refined type on success.
    function parseNonEmpty<T>(list: T[]): [T, ...T[]] {
      if (list[0] !== undefined) {
        return list as [T, ...T[]]
      } else {
        throw Error("list cannot be empty")
      }
    }

    // Assertion signature: throws on failure; narrows the argument in place.
    function assertNonEmpty<T>(list: T[]): asserts list is [T, ...T[]] {
      if (list[0] === undefined) throw Error("list cannot be empty")
    }

    // Type guard: returns a boolean that narrows the argument when true.
    function checkEmptiness<T>(list: T[]): list is [T, ...T[]] {
      return list[0] !== undefined
    }

    declare const arr: number[]

    // Error: Object is possibly undefined (assuming noUncheckedIndexedAccess is enabled)
    console.log(arr[0].toLocaleString())

    const parsed = parseNonEmpty(arr)
    // No error
    console.log(parsed[0].toLocaleString())

    if (checkEmptiness(arr)) {
      // No error
      console.log(arr[0].toLocaleString())
    }

    assertNonEmpty(arr)
    // No error
    console.log(arr[0].toLocaleString())

For me the `${arg} is ${type}` approach is superior, as you write the validation once and leave the precise error-handling mechanism to the caller, who tends to have a better idea of what to do in degenerate cases (sometimes throwing a full-on exception is appropriate, but sometimes a different form of recovery is better).
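For instance, reusing `checkEmptiness` from above, two hypothetical callers can pick different recovery strategies from the same guard:

    // One caller recovers with a fallback value...
    function firstOrZero(xs: number[]): number {
      return checkEmptiness(xs) ? xs[0] : 0
    }

    // ...while another treats emptiness as a hard error.
    function firstOrThrow(xs: number[]): number {
      if (!checkEmptiness(xs)) throw Error("expected at least one element")
      return xs[0]
    }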
replies(2): >>35059948 #>>35070563 #
2. lexi-lambda ◴[] No.35059948[source]
There is really no difference between doing this and returning a `Maybe`, which is the standard Haskell pattern, except that `Maybe` also allows the result to be structurally different rather than simply a refinement of the input type. In a sense, the TypeScript approach is a convenience feature that lets you write a validation function that returns `Bool`, which would normally erase the gained information, yet still preserve that information in the type system.
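In TypeScript terms, the `Maybe`-returning version of the same parser might look something like this (a sketch, using `null` for the failure case):

    // Maybe-style parse: success carries the refined value, failure is null.
    function parseNonEmptyMaybe<T>(list: T[]): [T, ...T[]] | null {
      return list[0] !== undefined ? (list as [T, ...T[]]) : null
    }

    declare const xs: number[]
    const nonEmpty = parseNonEmptyMaybe(xs)
    if (nonEmpty !== null) {
      // nonEmpty: [number, ...number[]], so reading the head is safe
      console.log(nonEmpty[0].toLocaleString())
    }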

This TypeScript-style approach is quite nice in situations where the type system already supports the refinement in question (which is true for this NonEmpty example), but it stops working as soon as you need to do something more complicated. I think sometimes programmers using languages where the TS-style approach is idiomatic can get a little hung up on that, since in those cases, they are more likely to blame the type system for being “insufficiently powerful” when in fact it’s just that the convenience feature isn’t sufficient in that particular case. I presented an example of one such situation in this followup blog post: https://lexi-lambda.github.io/blog/2020/08/13/types-as-axiom...
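To give an example of my own (not the one from the followup post): a property like "this string is a well-formed date" has no natural subtype of `string` to refine to, and the useful output is a different structure entirely, so a `Maybe`-style parse is the natural fit:

    type CalendarDate = { year: number; month: number; day: number }

    // No refinement of `string` can carry the year/month/day apart, so the
    // parser returns a structurally different value (or null on failure).
    function parseDate(s: string): CalendarDate | null {
      const m = /^(\d{4})-(\d{2})-(\d{2})$/.exec(s)
      if (m === null) return null
      return { year: Number(m[1]), month: Number(m[2]), day: Number(m[3]) }
    }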

replies(1): >>35070550 #
3. epolanski ◴[] No.35070550[source]
Hi lexi.

Just wanted to say that Giulio Canti, the author of fp-ts (now effect-ts, a ZIO port to TypeScript), is a great fan of your "parse, don't validate" article. He's linked it many times in the TypeScript and functional programming channels (such as the FP Slack).

Needless to say, both the fp-ts-derived io-ts library and the effect-ts schema library[1] are quite advanced parsers (and in the case of schema, there's decoding, encoding, APIs, guards, arbitraries, and many other nice things I haven't seen in any functional language).

[1]https://github.com/Effect-TS/schema
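For anyone who hasn't tried them, decoding with io-ts looks roughly like this (io-ts 2.x with fp-ts; details vary by version):

    import * as t from 'io-ts'
    import { isRight } from 'fp-ts/Either'

    // A runtime codec and the static type derived from it.
    const User = t.type({ id: t.number, name: t.string })
    type User = t.TypeOf<typeof User>

    // decode returns an Either: errors on the left, a typed User on the right.
    const result = User.decode(JSON.parse('{"id": 1, "name": "Ada"}'))
    if (isRight(result)) {
      console.log(result.right.name)
    }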

4. epolanski ◴[] No.35070563[source]
You can also simply parse with a type guard in TypeScript.

Or do something more advanced, like implementing Decoders/Encoders.
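A hand-rolled decoder can be as small as a function from `unknown` to a success-or-error result, plus combinators that build bigger decoders out of smaller ones (a sketch with made-up names, not any particular library's API):

    // A Decoder<A> turns untrusted input into either an error or an A.
    type Decoder<A> = (input: unknown) =>
      | { ok: true; value: A }
      | { ok: false; error: string }

    const decodeString: Decoder<string> = (input) =>
      typeof input === "string"
        ? { ok: true, value: input }
        : { ok: false, error: "expected a string" }

    // Combinator: lift an element decoder over arrays.
    const decodeArray = <A>(item: Decoder<A>): Decoder<A[]> => (input) => {
      if (!Array.isArray(input)) return { ok: false, error: "expected an array" }
      const out: A[] = []
      for (const x of input) {
        const r = item(x)
        if (!r.ok) return r
        out.push(r.value)
      }
      return { ok: true, value: out }
    }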