
132 points by zfnmxt | 1 comment
enisberk (No.40714384):
This is really cool work! Congrats on both the paper and the graduation! A long time ago, I worked on optimizing broadcast operations on GPUs [1]. Coming up with a strategy that promises high throughput across different array dimensionalities is quite challenging. I am looking forward to reading your work.

[1]https://scholar.google.com/citations?view_op=view_citation&h...

zfnmxt (No.40714514):
> Congrats on both the paper and the graduation!

Thanks! Although I still have to actually graduate and the paper is in review, so maybe your congratulations are a bit premature! :)

> A long time ago, I worked on optimizing broadcast operations on GPUs [1].

Something similar happens in Futhark, actually. When something like `[1,2,3] + 4` is elaborated to `map (+) [1,2,3] (rep 4)`, the `rep` is eliminated by pushing the `4` into the `map`: `map (+4) [1,2,3]`. Futhark then compiles it to efficient CUDA/OpenCL/whatever.
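To make the rewrite concrete, here is a minimal sketch in Python (not Futhark; the function names are made up for illustration) of why eliminating the `rep` matters: the naive elaborated form materializes a replicated array of the scalar, while the fused form pushes the scalar into the map and allocates nothing extra.

```python
def broadcast_add_naive(xs, c):
    # Elaborated form: map (+) xs (rep c) — materializes rep c as a list.
    rep = [c] * len(xs)            # rep 4 on [1,2,3] builds [4, 4, 4]
    return [x + y for x, y in zip(xs, rep)]

def broadcast_add_fused(xs, c):
    # After rep elimination: map (+c) xs — the scalar rides along, no copy.
    return [x + c for x in xs]

# Both compute the same broadcast sum.
assert broadcast_add_naive([1, 2, 3], 4) == [5, 6, 7]
assert broadcast_add_fused([1, 2, 3], 4) == [5, 6, 7]
```

The fused version is what a compiler wants to emit on a GPU: one thread per element, no intermediate buffer holding the replicated scalar.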