
I don't like NumPy

(dynomight.net)
480 points | MinimalAction | 10 comments
zahlman ◴[] No.43998076[source]
TFA keeps repeating "you can't use loops", but aren't they, like, merely less performant? I understand that there are going to be people out there doing complex algorithms (perhaps part of an ML system) where that performance is crucial and you might as well not be using NumPy in the first place if you skip any opportunities to do things in The Clever NumPy Way. But say I'm just, like, processing a video frame by frame, by using TCNW on each frame and iterating over the time dimension; surely that won't matter?
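For concreteness, a minimal sketch of that frame-by-frame pattern (the shapes and the per-frame operation are made up for illustration): the Python loop runs once per frame, while everything inside each iteration stays vectorized, so the loop overhead is lost in the noise.

    import numpy as np

    # Hypothetical video: 300 frames of 720x1280 grayscale.
    video = np.random.rand(300, 720, 1280).astype(np.float32)

    out = np.empty_like(video)
    for t in range(video.shape[0]):      # plain Python loop over the time axis
        frame = video[t]
        # The Clever NumPy Way, applied per frame: vectorized normalization.
        out[t] = (frame - frame.mean()) / (frame.std() + 1e-8)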

Also: TIL you can apparently use multi-dimensional NumPy arrays as NumPy array indexers, and they don't just collapse into 1-dimensional iterables. I expected `A[:,i,j,:]` not to work, or to be the same as if `j` were just `(0, 1)`. But instead, it apparently causes transposition with the previous dimension... ?
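(For anyone else surprised, a small sketch of the rule as I understand it, with made-up arrays: when the integer-array indices are adjacent, their broadcast shape replaces the indexed axes in place; when they're separated by a slice, the broadcast dimensions jump to the front of the result.)

    import numpy as np

    A = np.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)
    i = np.array([[0, 1], [2, 0]])   # shape (2, 2)
    j = np.array([[0, 1], [1, 2]])   # shape (2, 2)

    print(A[:, i, j, :].shape)   # (2, 2, 2, 5): broadcast dims replace axes 1 and 2
    print(A[:, i, :, j].shape)   # (2, 2, 2, 4): indices split by a slice, so the
                                 # (2, 2) broadcast dims move to the front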

replies(4): >>43998113 #>>43998230 #>>43998333 #>>43998354 #
1. nyeah ◴[] No.43998113[source]
Right, you can use loops. But then it runs much slower than a GPU would allow.
replies(2): >>43998152 #>>43999459 #
2. zahlman ◴[] No.43998152[source]
My point is that plenty of people use NumPy for reasons that have nothing to do with a GPU.
replies(2): >>43998264 #>>43998653 #
3. crazygringo ◴[] No.43998264[source]
The whole point of NumPy is to make things much, much faster than interpreted Python, whether you're GPU-accelerated or not.

Even code you write now, you may need to GPU-accelerate later as your simulations grow.

Falling back on loops goes against the entire reason for using NumPy in the first place.
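As a rough illustration of the gap (exact numbers obviously vary by machine), compare summing a large array with a Python loop versus a single vectorized call:

    import numpy as np
    import timeit

    x = np.random.rand(1_000_000)

    def loop_sum(a):
        total = 0.0
        for v in a:          # interpreted Python, one iteration per element
            total += v
        return total

    print(timeit.timeit(lambda: loop_sum(x), number=10))  # orders of magnitude slower
    print(timeit.timeit(lambda: np.sum(x), number=10))    # one call into compiled code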

replies(2): >>43998641 #>>44000519 #
4. nyeah ◴[] No.43998641{3}[source]
I really disagree. That's not the only point of NumPy. A lot of people use it like Matlab, to answer questions with minimal coding time, not minimal runtime.
replies(1): >>43999001 #
5. nyeah ◴[] No.43998653[source]
I mean yes. Also in your example where you hardly spend any time running Python code, the performance difference likely wouldn't matter.
6. crazygringo ◴[] No.43999001{4}[source]
I mean sure, the fact that it is performant means tons of functionality is built on it that is hard to find elsewhere.

But the point is still that the main purpose in building it was to be performant. To be accelerated. Even if that's not why you're personally using it.

I mean, I use my M4 Mac's Spotlight to do simple arithmetic. That's not the main point in building the M4 chip though.

replies(2): >>43999309 #>>43999360 #
7. ◴[] No.43999309{5}[source]
8. nyeah ◴[] No.43999360{5}[source]
As best I can tell, you weren't originally talking about the reason for creating NumPy. Instead you seemed to be talking about the reason for using NumPy.
9. cycomanic ◴[] No.43999459[source]
But once you need to use the GPU you need to go to another framework anyway (e.g. jax, tensorflow, arrayfire, numba...). AFAIK many of those can parallelise loops using their JIT functionality (in fact, Numba's JIT for a long time could not deal with NumPy broadcasting, so you had to write out your loops). So you're not really running into a problem?
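For example, a minimal Numba sketch (made-up function, but this is the usual pattern): the explicit loops are exactly what the JIT wants to see.

    import numpy as np
    from numba import njit

    @njit
    def row_means(a):
        # Plain nested loops; Numba compiles them to machine code.
        out = np.empty(a.shape[0])
        for r in range(a.shape[0]):
            s = 0.0
            for c in range(a.shape[1]):
                s += a[r, c]
            out[r] = s / a.shape[1]
        return out

    x = np.random.rand(1000, 200)
    print(row_means(x)[:3])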
10. zahlman ◴[] No.44000519{3}[source]
I used it once purely because I figured out that "turn a sequence of per-tile bitmap data into tiles, then produce the image corresponding to those tiles in order" is equivalent to swapping the two inner dimensions of a four-dimensional array. (And, therefore, so is the opposite operation.) The task was extremely non-performance-critical.
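Something like this, if memory serves (the grid and tile sizes here are made up; the real code differed):

    import numpy as np

    # A 2x3 grid of 8x8 tiles, stored tile by tile.
    rows, cols, th, tw = 2, 3, 8, 8
    tiles = np.arange(rows * cols * th * tw).reshape(rows, cols, th, tw)

    # Swap the two inner dimensions, then collapse into one image.
    image = tiles.swapaxes(1, 2).reshape(rows * th, cols * tw)

    # The opposite operation: cut the image back into per-tile data.
    tiles_again = image.reshape(rows, th, cols, tw).swapaxes(1, 2)
    assert (tiles_again == tiles).all()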

Of course I wasn't happy about bringing in a massive dependency just to simplify a few lines of code. Hopefully one day I'll have a slimmer alternative, perhaps one that isn't particularly concerned with optimization.