
132 points by zfnmxt | 1 comment
enisberk (No.40714384):
This is really cool work! Congrats on both the paper and the graduation! A long time ago, I worked on optimizing broadcast operations on GPUs [1]. Coming up with a strategy that promises high throughput across different array dimensionalities is quite challenging. I am looking forward to reading your work.

[1]https://scholar.google.com/citations?view_op=view_citation&h...

zfnmxt (No.40714514):
> Congrats on both the paper and the graduation!

Thanks! Although I still have to actually graduate and the paper is in review, so maybe your congratulations are a bit premature! :)

> A long time ago, I worked on optimizing broadcast operations on GPUs [1].

Something similar happens in Futhark, actually. When something like `[1,2,3] + 4` is elaborated to `map (+) [1,2,3] (rep 4)`, the `rep` is eliminated by pushing the `4` into the `map`: `map (+4) [1,2,3]`. Futhark then compiles it to efficient CUDA/OpenCL/whatever.
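To make the rewrite concrete, here is a minimal sketch in Python (not Futhark; the function names are made up for illustration) of why eliminating the `rep` matters: the naive elaborated form materializes a replicated array of the scalar, while the fused form pushes the scalar into the map and allocates nothing extra.

```python
def broadcast_add_naive(xs, c):
    # Elaborated form: map (+) xs (rep c) — materializes rep c as a list.
    rep = [c] * len(xs)            # rep 4 on [1,2,3] builds [4, 4, 4]
    return [x + y for x, y in zip(xs, rep)]

def broadcast_add_fused(xs, c):
    # After rep elimination: map (+c) xs — the scalar rides along, no copy.
    return [x + c for x in xs]

# Both compute the same broadcast sum.
assert broadcast_add_naive([1, 2, 3], 4) == [5, 6, 7]
assert broadcast_add_fused([1, 2, 3], 4) == [5, 6, 7]
```

The fused version is what a compiler wants to emit on a GPU: one thread per element, no intermediate buffer holding the replicated scalar.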