NumPy-style broadcasting in Futhark

(futhark-lang.org)

1. eigenspace ◴[18 Jun 24 04:01 UTC] No.40713949[source]▶

I’d strongly suggest not going the Numpy route, and instead reading about broadcast in julia [1]. In many contexts, it’s very important to distinguish between the element-wise application of a function, and a function that is applied to an entire array. A classic example would be the matrix exponential[2], versus the element-wise exponential of a matrix.

In julia the former is `exp(M)`, and the latter is `exp.(M)`

[1] https://julialang.org/blog/2017/01/moredots/

[2] https://en.m.wikipedia.org/wiki/Matrix_exponential

replies(4): >>40714080 #>>40714083 #>>40714427 #>>40715920 #

2. lalaithion ◴[18 Jun 24 04:29 UTC] No.40714080[source]▶

>>40713949 #

Futhark is a statically typed language, so it can’t have the same ambiguity.

replies(1): >>40714241 #

3. zfnmxt ◴[18 Jun 24 04:29 UTC] No.40714083[source]▶

>>40713949 #

I'd actually say what we did in Futhark is more similar to what Julia does (in the sense that the dot operator is syntactic sugar for Julia's `broadcast`) than what NumPy does---I wrote "NumPy-style" in the title just because I figured that's what most people would be familiar with. A more accurate title might be "Rank-polymorphic function applications in Futhark".

Ultimately it's quite different from both languages, though. Futhark has a strict static type system and we figure out how to broadcast completely statically without runtime information (NumPy and Julia both do it dynamically). This is really the main challenge that the system addresses and it permits you to have broadcasting in an ML/Hindley-Milner/Haskell-style type system along with full type inference and all the usual jazz and goodies.

Our system also doesn't have anything to do with vectorization/performance optimizations---in Futhark, `[1,2,3] + [4,5,6]` is just syntactic sugar for `map (+) [1,2,3] [4,5,6]` (which is how you'd write it without broadcasting), where `map` is already a parallel construct. So broadcasting in Futhark is literally just syntactic sugar for constructs that are inherently parallel, anyway---anything you can express with broadcasting can already be written in stock Futhark and there's no "magic" going on. Broadcasting in Futhark just means that some `map`s can be implicit, rather than explicit.

So it's a very disciplined and transparent (since it's static you can ask the compiler to show you exactly how the broadcasting will work) approach to broadcasting that the user has a lot of control over.

> it’s very important to distinguish between the element-wise application of a function, and a function that is applied to an entire array

Broadcasting in Futhark isn't inherently "element-wise" in the sense of descending through all the ranks of an argument until you get to the scalar level. It tries to actually use the fewest number of `map`s to get a rank-correct function application, so there's a preference for applying a function to higher-dimensional elements over scalars. This might sound non-intuitive/confusing, but it actually almost always aligns with what the programmer intended.

All it really does is use the minimal number of `map`s (and `rep`s) to massage a function and an argument with a rank mismatch into something that makes sense (or is rank correct)---it doesn't even have a notion of scalars, only rank differences!

replies(1): >>40714232 #

4. eigenspace ◴[18 Jun 24 05:06 UTC] No.40714232{3}[source]▶

>>40714083 #

> Futhark has a strict static type system and we figure out how to broadcast completely statically without runtime information (NumPy and Julia both do it dynamically).

Julia’s broadcast doesnt do anything dynamic. If you know the input types of a broadcast expression, you know the output type.

> Broadcasting in Futhark isn't inherently "element-wise" in the sense of descending through all the ranks of an argument until you get to the scalar level.

Julia’s broadcast doesnt do that either. One dot is one level deep.

Julia also doesn't really have a concept of scalars (in fact, our number types are actually zero-dimensional, one-element iterables). When I said ‘elementwise’ I just meant each element of a container, which can in turn be a container.

replies(1): >>40714328 #

5. eigenspace ◴[18 Jun 24 05:09 UTC] No.40714241{3}[source]▶

>>40714080 #

What ambiguity?

6. zfnmxt ◴[18 Jun 24 05:28 UTC] No.40714328{4}[source]▶

>>40714232 #

> Julia’s broadcast doesnt do anything dynamic.

When is a broadcast expression transformed (and fused) into vectorized code?

> If you know the input types of a broadcast expression, you know the output type.

Do you always (statically) know the input types? In Futhark, due to parametric polymorphism, you don't necessarily know the input types.

> Julia’s broadcast doesnt do that either. One dot is one level deep.

Doesn't something like `[[1,2,3]] .+ 4` go two levels deep? Or have I misunderstood something?

replies(1): >>40714506 #

7. enisberk ◴[18 Jun 24 05:42 UTC] No.40714384[source]▶

>>40704179 (OP) #

This is really cool work! Congrats on both the paper and the graduation! A long time ago, I worked on optimizing broadcast operations on GPUs [1]. Coming up with a strategy that promises high throughput across different array dimensionalities is quite challenging. I am looking forward to reading your work.

[1]https://scholar.google.com/citations?view_op=view_citation&h...

replies(1): >>40714514 #

8. xcdzvyn ◴[18 Jun 24 05:45 UTC] No.40714399[source]▶

>>40704179 (OP) #

I've seen Futhark on the front page a few times recently, has there been a resurgence of interest for some reason?

Exciting if so. I think it's a cool language

9. cycomanic ◴[18 Jun 24 05:51 UTC] No.40714427[source]▶

>>40713949 #

I actually think Julia's behaviour (partially) copied from matlab is bad. I had to review a lot of matlab/Julia simulation code write by PhD/grad students (as well as my own) and most of the time when things somehow are not working debugging becomes a task of "find the missing dot". If they would have decided on a more obvious symbol (the dot is really just way too easy to overlook) for broadcasting it would be much better, but I still much prefer that broadcast and matrix operation would use entirely different operators/named functions. Instead we have the case that a little annotation can dramatically change the behaviour of the code (the matrix exponential is in fact a great example).

That said I mainly work with fields and calculus so I might be biased because the majority of operations are broadcast and I only rarely use matrix operations.

replies(2): >>40715052 #>>40715192 #

10. eigenspace ◴[18 Jun 24 06:03 UTC] No.40714506{5}[source]▶

>>40714328 #

> When is a broadcast expression transformed (and fused) into vectorized code?

A broadcast expression is dealt with during lowering from expression trees to the linear IR. The trick is that it just turns into function calls. When you write

    a .+ f.(b)

that turns into

    materialize(broadcasted(+, a, broadcasted(f, b))

where the broadcasted calls are lazy, uninstantiated representations of the call. The `materialize` function then performs the fusion.

As long as you write type-inferrable code, the handling and fusion will all happen during compile time with no dynamism.

> Do you always (statically) know the input types? In Futhark, due to parametric polymorphism, you don't necessarily know the input types.

A method in julia always knows its concrete input types when it is called. Julia basically works by stitching together chunks of static code. So if something dynamic happens in the middle of your program, the compiler just waits until the the types are known at runtime and then resumes, and compiles new static code as far forward as its able to analyze the program.

If your program is fully inferrable, then the whole program is compiled 'just-ahead-of-time' when you first call the outermost function.

> Doesn't something like `[[1,2,3]] .+ 4` go two levels deep? Or have I misunderstood something?

No, that would error in julia.

replies(1): >>40714736 #

11. zfnmxt ◴[18 Jun 24 06:06 UTC] No.40714514[source]▶

>>40714384 #

> Congrats on both the paper and the graduation!

Thanks! Although I still have to actually graduate and the paper is in review, so maybe your congratulations are a bit premature! :)

> A long time ago, I worked on optimizing broadcast operations on GPUs [1].

Something similar happens in Futhark, actually. When something like `[1,2,3] + 4` is elaborated to `map (+) [1,2,3] (rep 4)`, the `rep` is eliminated by pushing the `4` into the `map`: `map (+4) [1,2,3]`. Futhark ultimately then compiles it to efficient CUDA/OpenCL/whatever.

12. zfnmxt ◴[18 Jun 24 06:43 UTC] No.40714736{6}[source]▶

>>40714506 #

> No, that would error in julia.

Okay, but '[[1,2,3] [4,5,6]] .+ 4` does work! Looks like Julia distinguishes between a two-dimensional array and an array of arrays (which Futhark does not). To me, it feels like semantically '[[1,2,3] [4,5,6]] .+ 4` is two levels deep even if operationally it has some underlying flat representation. (But I'm not familiar with Julia's programming model.)

> The trick is that it just turns into function calls.

Ah, okay. I think `a .+ f.(b)` can roughly be understood in Futhark as `map^N (+) a (rep^M (map^Q f (rep^S b)))` where `map^N` means do N `map`s and you can locally determine N, M, Q, S from the input types. We actually essentially had this in our very first prototype of AUTOMAP but found it insufficient for our purposes because it doesn't play nice with arguments whose types aren't fully known (i.e., that have type variables) and we cannot wait until runtime to figure out those types.

replies(2): >>40716574 #>>40720713 #

13. stabbles ◴[18 Jun 24 07:01 UTC] No.40714856[source]▶

>>40704179 (OP) #

Somewhat confusing to see a matrix represented as an array of arrays, it will never be fast.

replies(1): >>40714907 #

14. tomsmeding ◴[18 Jun 24 07:08 UTC] No.40714907[source]▶

>>40714856 #

This is in the surface language only: Futhark's arrays must be "rectangular" (if you have an array of arrays, the subarrays are mandated to be equally-sized; this is statically checked in most cases, and when it can't, it becomes a runtime check), and thus they are represented as a single, actual multi-dimensional array.

replies(1): >>40716597 #

15. tomsmeding ◴[18 Jun 24 07:34 UTC] No.40715052{3}[source]▶

>>40714427 #

The nice thing is that if this dot-approach (or some other syntax with a similar effect) were implemented in Futhark, then because Futhark is statically-typed, all mismatches would be compile-time errors. Those are very hard to overlook :)

replies(1): >>40715159 #

16. setopt ◴[18 Jun 24 07:50 UTC] No.40715159{4}[source]▶

>>40715052 #

In the very common case of square matrices, I think the dotted and non-dotted operations are generally both valid?

replies(1): >>40718678 #

17. setopt ◴[18 Jun 24 07:57 UTC] No.40715192{3}[source]▶

>>40714427 #

> most of the time when things somehow are not working debugging becomes a task of "find the missing dot". If they would have decided on a more obvious symbo

I guess you could use your editor’s highlight-regexp function to make the dots more obvious during debugging :)

I personally like the dots because they’re so unintrusive, but I work in a field where most operations are matrix operations, so perhaps you’re right…

18. rscho ◴[18 Jun 24 10:11 UTC] No.40715920[source]▶

>>40713949 #

I'd strongly advise taking a page from J's book and allow rank specification for function application. Julia having ranks element and infinite is nice, but what about application row-by-row ? On on regular sub arrays?

replies(1): >>40715973 #

19. Athas ◴[18 Jun 24 10:19 UTC] No.40715973{3}[source]▶

>>40715920 #

You can just use 'map' explicitly. It doesn't have to be special syntax.

20. adgjlsfhk1 ◴[18 Jun 24 11:40 UTC] No.40716574{7}[source]▶

>>40714736 #

the part that is confusing you here is actually just [[1,2,3] [4,5,6]]. this is syntax for constructing a matrix (as opposed to [[1,2,3], [4,5,6]] with a comma which makes a vector of vectors). Julia has first class support for n dimensional arrays which is very useful for scientific/mathy code

21. adgjlsfhk1 ◴[18 Jun 24 11:43 UTC] No.40716597{3}[source]▶

>>40714907 #

imo that seems like some pretty weird semantics. how do you write an operation that needs to bundle together lists of different lengths?

replies(1): >>40717028 #

22. Athas ◴[18 Jun 24 12:29 UTC] No.40717028{4}[source]▶

>>40716597 #

The best answer is that you probably don't want to use Futhark for that - it's a specialised, rather restricted language. There are ways to encode irregular arrays, but it's not even remotely as ergonomic as in other languages (it can be plenty fast, however). For a particularly simple example, see https://futhark-lang.org/examples/triangular.html

In the future, or in a future similar language, one might well imagine an ability to "box" arrays, like in APL, which would allow true "arrays of arrays" with no restrictions on sizes.

23. iav ◴[18 Jun 24 15:09 UTC] No.40718678{5}[source]▶

>>40715159 #

They might be both valid but will generate a different output (int or range) leading to a compile time error downstream

replies(1): >>40721744 #

24. mindcrime ◴[18 Jun 24 16:38 UTC] No.40719719[source]▶

>>40704179 (OP) #

Apologies in advance for the slightly off-topic comment, but does anybody else find the name "Futhark" a bit.. erm... I dunno, "off-putting" for lack of a better term? What I mean is, I don't work with the language or anything, so it's never "top of mind" for me. And every time I see it show up on HN (or whatever) for some reason when I see that name something "pattern matches" and makes me think it's some eso-lang ala INTERCAL, Befunge, Malbolge, etc. It's only if I click through on the link and do some reading that I'm reminded that "Oh, this is something serious."

Probably it's just a me thing, but I'd be curious to know if anybody else experiences this with regards to Futhark.

replies(1): >>40719898 #

25. digging ◴[18 Jun 24 16:59 UTC] No.40719898[source]▶

>>40719719 #

First I'm hearing of the programming language, but Futhark is the name given to several forms of the Old Norse runic alphabet, not made up wholesale or derived from other less-serious names.

26. jrevels ◴[18 Jun 24 18:20 UTC] No.40720616[source]▶

>>40704179 (OP) #

Cool stuff! Thanks for the write-up. Getting broadcast ergonomics right in a fully statically typed context seems both tricky and pretty satisfying.

> We thought it’d be a fun/simple feature and a good break from the heavy duty work on automatic differentiation we had just published

Perhaps it wasn't as much of a break as you might think - funnily enough, static broadcast semantics can serve as an interesting structural foothold for certain AD optimizations [1]

(disclaimer: I'm a coauthor on that paper :) but it's been quite a while since I've done work in the AD space)

[1] https://arxiv.org/pdf/1810.08297

27. eigenspace ◴[18 Jun 24 18:32 UTC] No.40720713{7}[source]▶

>>40714736 #

> Okay, but '[[1,2,3] [4,5,6]] .+ 4` does work! Looks like Julia distinguishes between a two-dimensional array and an array of arrays

Ah, yes, Julia distinguishes between multidimensional arrays, and arrays of arrays.

The thing you wrote is actually syntax sugar for “horizontal concatenation” (hcat), I.e. when one writes `[A B]` that makes a flat matrix. So what you wrote is actually just (mildly confusing) syntax sugar for the matrix

    [1 4
     2 5
     3 6]

which is an array of scalars in julia, not an array of arrays. If you had written `[[1,2,3], [4,5,6]] .+ 4` you’d have an array of arrays, and hence an error.

Given that Furthark chose to have multidimensional arrays being semantically arrays-of-arrays, I see why it’s important for you to automatically descend recursively through multiple levels.

I still think it’s maybe a bit of a footgun that users cant choose at the call-site if their function is broadcasted or not though (that’s the really nice thing about julia’s dot-broadcasting for me anyways).

28. cycomanic ◴[18 Jun 24 20:11 UTC] No.40721744{6}[source]▶

>>40718678 #

I don't think that's generally the case, AFAIK the example of the matrix exponential example a bit further up would not be really distinguishable as a type from a regular matrix (with exponential elements).