(github.com)

1311 points msoad | 1 comments | 31 Mar 23 20:37 UTC


detrites ◴[31 Mar 23 20:58 UTC] No.35393558[source]▶

The pace of collaborative OSS development on these projects is amazing, but the rate of optimisations being achieved is almost unbelievable. What has everyone been doing wrong all these years *cough* sorry, I mean to say weeks?

OK, I answered my own question.


xienze ◴[31 Mar 23 22:46 UTC] No.35394786[source]▶

>>35393558 #

> but the rate of optimisations being achieved is almost unbelievable. What has everyone been doing wrong all these years *cough* sorry, I mean to say weeks?

It’s several things:

* Cutting-edge code, not overly concerned with optimization

* Code written by scientists, who aren’t known for being the world’s greatest programmers

* The obsession the research world has with using Python

Not surprising that there’s a lot of low-hanging fruit that can be optimized.


Miraste ◴[31 Mar 23 22:57 UTC] No.35394897[source]▶

>>35394786 #

Why does Python get so much flak for inefficiencies? It's really not that slow, and in ML the speed-sensitive parts are libraries in lower level languages anyway. Half of the optimization from this very post is in Python.


1. Max-Limelihood ◴[04 Apr 23 16:42 UTC] No.35442394[source]▶

>>35394897 #

It really is that slow.

You're completely correct that the speed-sensitive parts are written in lower-level libraries, but another way to phrase that is "Python can go really fast, as long as you don't use Python." But this also means ML is effectively hamstrung into only using methods that already exist and have been coded in C++, since anything in Python would be too slow to compete.
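To make the "as long as you don't use Python" point concrete, here is a minimal sketch, assuming CPython and using only the standard library. It computes the same dot product twice: once with a loop the interpreter executes step by step, and once by handing the iteration to the C-implemented builtins `sum`, `map`, and `operator.mul`. The function names and input sizes are illustrative, not taken from any ML library, and the measured gap will vary by machine and interpreter version.

```python
import operator
import timeit


def dot_interpreted(xs, ys):
    """Dot product where every iteration is executed by the Python interpreter."""
    total = 0.0
    for i in range(len(xs)):
        total += xs[i] * ys[i]
    return total


def dot_builtin(xs, ys):
    """Same result, but in CPython the iteration runs inside C-implemented builtins."""
    return sum(map(operator.mul, xs, ys), 0.0)


def compare(n=100_000, repeats=5):
    """Return the best-of-`repeats` wall time (seconds) for each version."""
    # Small integer-valued floats keep both results exactly equal.
    xs = [float(i % 7) for i in range(n)]
    ys = [float(i % 5) for i in range(n)]
    t_loop = min(timeit.repeat(lambda: dot_interpreted(xs, ys), number=1, repeat=repeats))
    t_builtin = min(timeit.repeat(lambda: dot_builtin(xs, ys), number=1, repeat=repeats))
    return t_loop, t_builtin


if __name__ == "__main__":
    t_loop, t_builtin = compare()
    print(f"interpreted loop: {t_loop:.4f}s  builtins: {t_builtin:.4f}s")
```

The same pattern, scaled up, is what array libraries do: the Python code only describes the operation, and the inner loop runs in compiled code. A method that has no compiled kernel has to fall back to something like `dot_interpreted`.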

There are lots of languages that make good tradeoffs between performance and usability. Python is not one of those languages. It is, at best, only slightly harder to use than Julia, yet orders of magnitude slower.


Llama.cpp 30B runs with only 6GB of RAM now