Most active commenters

    ←back to thread

    Parse, don't validate (2019)

    (lexi-lambda.github.io)
    398 points declanhaigh | 21 comments | | HN request time: 1.413s | source | bottom
    Show context
    bruce343434 ◴[] No.35053912[source]
    Note that this basically requires your language to have ergonomic support for sum types, immutable "data classes", pattern matching.

    The point is to parse the input into a structure which always upholds the predicates you care about so you don't end up continuously defensively programming in ifs and asserts.

    replies(12): >>35054046 #>>35054070 #>>35054386 #>>35054514 #>>35054901 #>>35054993 #>>35055124 #>>35055230 #>>35056047 #>>35057866 #>>35058185 #>>35059271 #
    1. jim-jim-jim ◴[] No.35054070[source]
    I think we'll eventually come to regard `if` as we do `goto`.
    replies(4): >>35054298 #>>35054351 #>>35054456 #>>35054814 #
    2. leetrout ◴[] No.35054298[source]
    Using pattern matching instead or something else?
    replies(3): >>35054371 #>>35054472 #>>35063455 #
    3. quchen ◴[] No.35054351[source]
    If is semantically the only way to deconstruct a Boolean in any language, so as long as you have bools, you’re going to have `if`. Sure you can give if different syntax and write it with match/case/?:/whatever, but that’s not what we did to goto: introducing different language constructs to capture common useful use cases like try/catch, loops, and else-less ifs.
    replies(3): >>35054547 #>>35054918 #>>35055288 #
    4. Joeri ◴[] No.35054371[source]
    By using reactive programming techniques the program can be approached as a set of data streams mapping input to output, and conditional behavior becomes the application of different filters and combiners on the streams. This dovetails nicely with functional programming, which allows generic expression and reuse of those stream operations.
    5. raincole ◴[] No.35054456[source]
    I don't know about this. We all have seen this kind of code:

        if(!needToDoTheThing()) return;
        
        DoTheThing();
    
    
    We could have written it this way:

        if(needToDoTheThing()) {
            DoTheThing();
        }
        else {
            return;
        }
    
    The later is closer to how pattern match looks like. But in my experience, the majority of programmers prefer early return. I regularly see people "refactor" if-else to if-early-return, but I've never seen the opposite.
    replies(4): >>35054651 #>>35054833 #>>35065147 #>>35065313 #
    6. oslac ◴[] No.35054472[source]
    Not a perfect example, but this can be seen (pattern match replacing if) with Kotlin's when.
    7. palotasb ◴[] No.35054547[source]
    To nitpick and to show a cool lambda calculus thing, you can deconstruct booleans if you define booleans and if statements the following way, using only pure functions.

      def TRUE(a, b):
        return a
    
      def FALSE(a, b):
        return b
    
      def IF(cond, a, b):
        return cond(a, b)
    
      assert IF(TRUE, 1, 2) == 1
      assert IF(FALSE, 1, 2) == 2
    
    This gives you the conditional statement in most languages ("cond ? a : b" or "a if cond else b").
    replies(2): >>35054736 #>>35054747 #
    8. RHSeeger ◴[] No.35054651[source]
    I prefer the former. It separates the pre-conditions from the algorithm/logic, using gate clauses. I find this makes it easier to reason about the algorithm.
    replies(1): >>35055047 #
    9. quchen ◴[] No.35054736{3}[source]
    Church encoding is pretty cool, yes! It encodes Booleans such that »if == id«. Likewise, natural numbers are essentially counting for-loops: 3 f x = f (f (f x)), so »for == id«.

    I had to work with this for a while because I wanted to implement Hello World in Javascript. https://github.com/quchen/lambda-ski/blob/master/helloworld/...

    10. Joker_vD ◴[] No.35054747{3}[source]
    You can do this same trick with any algebraic type, honestly (modulo lazyness):

        ## type Silly = Foo | Bar Int | Qux String Silly
    
        ## Constructors
    
        def Foo(onFoo, onBar, onQux):
            return onFoo()
    
        def Bar(arg0):
            return lambda onFoo, onBar, onQux: onBar(arg0)
    
        def Qux(arg0, arg1):
            return lambda onFoo, onBar, onQux: onQux(arg0, arg1)
    
        ## Values of Silly type are Foo, Bar(x) and Qux(x, y)
    
        ## Destructor
    
        def match_Silly(silly, onFoo, onBar, onQux):
            return silly(onFoo, onBar, onQux)
    
    You can make a whole language on top of that if you don't mind effectively disabling your CPU's branch predictor.
    11. randomdata ◴[] No.35054814[source]
    Bridled, so that it doesn’t suffer the problems Dijkstra spoke of? Wouldn’t you say they both already are in modern languages?
    12. pjc50 ◴[] No.35054833[source]
    It keeps the code closer to the left. It also keeps it conceptually simpler if you can discard a bunch of "obvious" cases early on.
    replies(1): >>35055215 #
    13. chriswarbo ◴[] No.35054918[source]
    I agree; although there's a related problem of "boolean blindness" https://existentialtype.wordpress.com/2011/03/15/boolean-bli...

    I'd summarise boolean blindness as: implicit (often unsafe) coupling/dependencies of method results; which could instead be explicit data dependencies. That article's example is 'plus x y = if x=Z then y else S(plus (pred x) y)', which uses an unsafe 'pred' call that crashes when x is 'Z'. It avoids the crash by branching on an 'x=Z' comparison. The alternative is to pattern-match on x, to get 'Z' or 'S x2'; hence avoiding the need for 'pred'.

    Another alternative is to have 'pred' return 'Maybe Nat'; although that's less useful when we have more constructors and more data (e.g. the 'NonEmptyList' in this "parse, don't validate" article!)

    14. Timon3 ◴[] No.35055047{3}[source]
    It's much nicer, especially since it keeps the complexity down.

    If you nest if/else, you'll quickly approach a point where you have to keep a complex logic tree in your head to determine which states the system could be in inside of any given branch. If you use guard clauses and return early, you'll keep this complexity down to a minimum, since the list of possible states changes linearly with your code instead of exponentially.

    I know not everybody likes it, but I think this makes cyclomatic complexity an extremely valuable metric for measuring "ease-of-reading".

    15. jakelazaroff ◴[] No.35055215{3}[source]
    Yup, this is my exact rationale for preferring this too. Branches are a significant source of complexity and early returns are one way to tame it — have the “meat” of the function deal with as few invariants as possible.
    16. jakelazaroff ◴[] No.35055288[source]
    There are concepts like filtering that let you operate on booleans without branching:

       const published = posts.filter(post => !post.draft);
    17. lmm ◴[] No.35063455[source]
    Most of the time you avoid having booleans in the first case, in favour of polymorphism (e.g. rather than having an "addOrMultiply" flag, you have separate "Add" and "Multiply" classes with a polymorphic method that does the addition or multiplication). You probably need some conditional logic in your "parser" (and whether that's "if" or pattern matching isn't so important IMO), but you should push booleans out of your core business logic and over to the edges of your program.
    replies(1): >>35071493 #
    18. quickthrower2 ◴[] No.35065147[source]
    The second looks a lot more elegant in Haskell though. Funny how syntax and influence choice of semantics!
    19. ParetoOptimal ◴[] No.35065313[source]
    I prefer using early return in monads with guard like:

        safeDiv :: (Monad m, Alternative m) => Int -> Int -> m Int
        safeDiv x y = do
          guard (y /= 0)
          pure (x `div` y)
    
        main :: IO ()
        main = do
          print $ safeDiv @Maybe 1 0
          print $ safeDiv @[] 1 0
          -- print =<< safeDiv @IO 1 0 -- guard throws an error in IO
    
    Try it out at https://play.haskell.org/saved/a6VsE3uQ
    20. leetrout ◴[] No.35071493{3}[source]
    That sounds miserable. Is there blog post or something with more details that supports this? I might be having a knee jerk reaction because I can't imagine something like this being easy to work with and maintain but I recognize you were just giving a trivial example.
    replies(1): >>35074824 #
    21. lmm ◴[] No.35074824{4}[source]
    This was a blog example I saw a few years ago, it went through making a calculator program without using ifs. Looked pretty nice. I can't find it now though.