←back to thread

132 points zfnmxt | 7 comments | | HN request time: 1.451s | source | bottom
Show context
eigenspace ◴[] No.40713949[source]
I’d strongly suggest not going the Numpy route, and instead reading about broadcast in julia [1]. In many contexts, it’s very important to distinguish between the element-wise application of a function, and a function that is applied to an entire array. A classic example would be the matrix exponential[2], versus the element-wise exponential of a matrix.

In julia the former is `exp(M)`, and the latter is `exp.(M)`

[1] https://julialang.org/blog/2017/01/moredots/

[2] https://en.m.wikipedia.org/wiki/Matrix_exponential

replies(4): >>40714080 #>>40714083 #>>40714427 #>>40715920 #
1. zfnmxt ◴[] No.40714083[source]
I'd actually say what we did in Futhark is more similar to what Julia does (in the sense that the dot operator is syntactic sugar for Julia's `broadcast`) than what NumPy does---I wrote "NumPy-style" in the title just because I figured that's what most people would be familiar with. A more accurate title might be "Rank-polymorphic function applications in Futhark".

Ultimately it's quite different from both languages, though. Futhark has a strict static type system and we figure out how to broadcast completely statically without runtime information (NumPy and Julia both do it dynamically). This is really the main challenge that the system addresses and it permits you to have broadcasting in an ML/Hindley-Milner/Haskell-style type system along with full type inference and all the usual jazz and goodies.

Our system also doesn't have anything to do with vectorization/performance optimizations---in Futhark, `[1,2,3] + [4,5,6]` is just syntactic sugar for `map (+) [1,2,3] [4,5,6]` (which is how you'd write it without broadcasting), where `map` is already a parallel construct. So broadcasting in Futhark is literally just syntactic sugar for constructs that are inherently parallel, anyway---anything you can express with broadcasting can already be written in stock Futhark and there's no "magic" going on. Broadcasting in Futhark just means that some `map`s can be implicit, rather than explicit.

So it's a very disciplined and transparent (since it's static you can ask the compiler to show you exactly how the broadcasting will work) approach to broadcasting that the user has a lot of control over.

> it’s very important to distinguish between the element-wise application of a function, and a function that is applied to an entire array

Broadcasting in Futhark isn't inherently "element-wise" in the sense of descending through all the ranks of an argument until you get to the scalar level. It tries to actually use the fewest number of `map`s to get a rank-correct function application, so there's a preference for applying a function to higher-dimensional elements over scalars. This might sound non-intuitive/confusing, but it actually almost always aligns with what the programmer intended.

All it really does is use the minimal number of `map`s (and `rep`s) to massage a function and an argument with a rank mismatch into something that makes sense (or is rank correct)---it doesn't even have a notion of scalars, only rank differences!

replies(1): >>40714232 #
2. eigenspace ◴[] No.40714232[source]
> Futhark has a strict static type system and we figure out how to broadcast completely statically without runtime information (NumPy and Julia both do it dynamically).

Julia’s broadcast doesnt do anything dynamic. If you know the input types of a broadcast expression, you know the output type.

> Broadcasting in Futhark isn't inherently "element-wise" in the sense of descending through all the ranks of an argument until you get to the scalar level.

Julia’s broadcast doesnt do that either. One dot is one level deep.

Julia also doesn't really have a concept of scalars (in fact, our number types are actually zero-dimensional, one-element iterables). When I said ‘elementwise’ I just meant each element of a container, which can in turn be a container.

replies(1): >>40714328 #
3. zfnmxt ◴[] No.40714328[source]
> Julia’s broadcast doesnt do anything dynamic.

When is a broadcast expression transformed (and fused) into vectorized code?

> If you know the input types of a broadcast expression, you know the output type.

Do you always (statically) know the input types? In Futhark, due to parametric polymorphism, you don't necessarily know the input types.

> Julia’s broadcast doesnt do that either. One dot is one level deep.

Doesn't something like `[[1,2,3]] .+ 4` go two levels deep? Or have I misunderstood something?

replies(1): >>40714506 #
4. eigenspace ◴[] No.40714506{3}[source]
> When is a broadcast expression transformed (and fused) into vectorized code?

A broadcast expression is dealt with during lowering from expression trees to the linear IR. The trick is that it just turns into function calls. When you write

    a .+ f.(b)
that turns into

    materialize(broadcasted(+, a, broadcasted(f, b))
where the broadcasted calls are lazy, uninstantiated representations of the call. The `materialize` function then performs the fusion.

As long as you write type-inferrable code, the handling and fusion will all happen during compile time with no dynamism.

> Do you always (statically) know the input types? In Futhark, due to parametric polymorphism, you don't necessarily know the input types.

A method in julia always knows its concrete input types when it is called. Julia basically works by stitching together chunks of static code. So if something dynamic happens in the middle of your program, the compiler just waits until the the types are known at runtime and then resumes, and compiles new static code as far forward as its able to analyze the program.

If your program is fully inferrable, then the whole program is compiled 'just-ahead-of-time' when you first call the outermost function.

> Doesn't something like `[[1,2,3]] .+ 4` go two levels deep? Or have I misunderstood something?

No, that would error in julia.

replies(1): >>40714736 #
5. zfnmxt ◴[] No.40714736{4}[source]
> No, that would error in julia.

Okay, but '[[1,2,3] [4,5,6]] .+ 4` does work! Looks like Julia distinguishes between a two-dimensional array and an array of arrays (which Futhark does not). To me, it feels like semantically '[[1,2,3] [4,5,6]] .+ 4` is two levels deep even if operationally it has some underlying flat representation. (But I'm not familiar with Julia's programming model.)

> The trick is that it just turns into function calls.

Ah, okay. I think `a .+ f.(b)` can roughly be understood in Futhark as `map^N (+) a (rep^M (map^Q f (rep^S b)))` where `map^N` means do N `map`s and you can locally determine N, M, Q, S from the input types. We actually essentially had this in our very first prototype of AUTOMAP but found it insufficient for our purposes because it doesn't play nice with arguments whose types aren't fully known (i.e., that have type variables) and we cannot wait until runtime to figure out those types.

replies(2): >>40716574 #>>40720713 #
6. adgjlsfhk1 ◴[] No.40716574{5}[source]
the part that is confusing you here is actually just [[1,2,3] [4,5,6]]. this is syntax for constructing a matrix (as opposed to [[1,2,3], [4,5,6]] with a comma which makes a vector of vectors). Julia has first class support for n dimensional arrays which is very useful for scientific/mathy code
7. eigenspace ◴[] No.40720713{5}[source]
> Okay, but '[[1,2,3] [4,5,6]] .+ 4` does work! Looks like Julia distinguishes between a two-dimensional array and an array of arrays

Ah, yes, Julia distinguishes between multidimensional arrays, and arrays of arrays.

The thing you wrote is actually syntax sugar for “horizontal concatenation” (hcat), I.e. when one writes `[A B]` that makes a flat matrix. So what you wrote is actually just (mildly confusing) syntax sugar for the matrix

    [1 4
     2 5
     3 6]
which is an array of scalars in julia, not an array of arrays. If you had written `[[1,2,3], [4,5,6]] .+ 4` you’d have an array of arrays, and hence an error.

Given that Furthark chose to have multidimensional arrays being semantically arrays-of-arrays, I see why it’s important for you to automatically descend recursively through multiple levels.

I still think it’s maybe a bit of a footgun that users cant choose at the call-site if their function is broadcasted or not though (that’s the really nice thing about julia’s dot-broadcasting for me anyways).