1311 points msoad | 12 comments
jart ◴[] No.35393615[source]
Author here. For additional context, please read https://github.com/ggerganov/llama.cpp/discussions/638#discu... The loading time performance has been a huge win for usability, and folks have been having the most wonderful reactions after using this change. But we don't have a compelling enough theory yet to explain the RAM usage miracle. So please don't get too excited just yet! Yes things are getting more awesome, but like all things in science a small amount of healthy skepticism is warranted.
intelVISA ◴[] No.35394288[source]
Didn't expect to see two titans today: ggerganov AND jart. Can y'all slow down? You make us mortals look bad :')

Seeing such clever use of mmap makes me dread to imagine how much Python spaghetti probably tanks OpenAI's and other "big ML" shops' infra when they should've trusted in zero copy solutions.

Perhaps SWE is dead after all, but LLMs didn't kill it...

1. shakow ◴[] No.35395404[source]
> Perhaps SWE is dead after all, but LLMs didn't kill it...

Cheap electronics did. 32GB of RAM is maybe $150, a developer converting & maintaining your system to use mmap is $150k/year.

replies(2): >>35395692 #>>35397724 #
2. xmprt ◴[] No.35395692[source]
This still doesn't make sense. It doesn't take a full year to do optimizations like this. Maybe a month at most if you include the investigation time. And the memory cost is $150 times the number of users, which is in the thousands at least.
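To put rough numbers on that comparison, here is a back-of-the-envelope sketch using only the figures quoted in this subthread ($150 for 32GB of RAM, $150k/year for a developer, one month of work). The function names are made up for illustration, and it assumes every user would otherwise have to buy the extra RAM, which is a simplification.

```python
import math

# Figures quoted in the thread; everything else here is an illustrative assumption.
DEV_COST_PER_YEAR = 150_000  # developer cost in dollars per year
RAM_COST_PER_USER = 150      # dollars for 32GB of RAM
MONTHS_OF_WORK = 1           # "a month at most"


def dev_cost(months=MONTHS_OF_WORK):
    """Cost of paying one developer for the given number of months."""
    return DEV_COST_PER_YEAR * months / 12


def ram_cost(users):
    """Total cost if every user buys the extra RAM instead."""
    return users * RAM_COST_PER_USER


def break_even_users(months=MONTHS_OF_WORK):
    """Smallest user count at which the RAM bill matches or exceeds the dev bill."""
    return math.ceil(dev_cost(months) / RAM_COST_PER_USER)
```

Under these assumptions one month of developer time costs $12,500, so the optimization pays for itself at 84 users, and at 1,000 users the combined RAM bill equals a full year of developer salary.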
replies(1): >>35395793 #
3. jart ◴[] No.35395793[source]
Tragedy of the commons. If you want to do something that benefits everyone a little bit, and you can't productize it like OpenAI's $20/month subscription, then there's no rational economic reason to do it, and you have to wait for someone like me who has an irrational love of coding. It's not a lifestyle that makes you rich, but it does help you see the opportunities to fix problems that the well-resourced folks who are supposed to be solving them would never even notice; in fact, they'd probably think you're trolling them if you ever brought it up.
replies(2): >>35396016 #>>35399394 #
4. axlee ◴[] No.35396016{3}[source]
Tragedy of the commons only works for things you don't directly pay for.
replies(2): >>35396838 #>>35397255 #
5. nablags ◴[] No.35396838{4}[source]
Well, in a way, open source software is something that you don't directly pay for.
replies(1): >>35397151 #
6. sli ◴[] No.35397151{5}[source]
More saliently, the overwhelming majority of the Linux kernel's direct and extended userbase has contributed nothing at all directly to the Linux kernel, as just one example.
7. ElectricalUnion ◴[] No.35397255{4}[source]
Exactly, the software supplier isn't paying for RAM.
8. pdntspa ◴[] No.35397724[source]
So let's toss management and go write good code for the principle of it, and not business bullshit calculus
replies(1): >>35400498 #
9. somesortofsystm ◴[] No.35399394{3}[source]
>Tragedy of the commons.

Tragedy of folks forgetting how to program.

This mmap() "trick" isn't a trick; it's standard practice for anyone who has cut their teeth on POSIX or embedded. See also mlock()/munlock().
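For readers who haven't used it, here is a minimal sketch of the general idea, using Python's standard mmap module instead of the C API. It is not the llama.cpp change itself: the file layout (raw native-endian float32 values) and the helper names write_weights, map_weights, and demo are assumptions for illustration only, and mlock()/munlock() are not covered.

```python
import mmap
import os
import struct
import tempfile


def write_weights(path, values):
    """Write floats as raw native-endian float32, a stand-in for a weights file."""
    with open(path, "wb") as f:
        f.write(struct.pack(f"{len(values)}f", *values))


def map_weights(path):
    """Map the file read-only and return (mapping, float32 view)."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; the mapping stays usable after f is closed.
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # cast() reinterprets the mapped bytes as float32 without copying them.
    return mm, memoryview(mm).cast("f")


def demo():
    """Round-trip a few values through a temporary file and a mapping."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_weights(path, [0.5, 1.5, 2.5, 3.5])
        mm, view = map_weights(path)
        try:
            total = sum(view)
            third = view[2]
        finally:
            # The view must be released before the mapping can be closed.
            view.release()
            mm.close()
    finally:
        os.remove(path)
    return total, third
```

The point of the pattern is that no read() into a separate buffer happens: the view indexes straight into the mapped file, and the operating system pages data in as it is touched.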

replies(1): >>35399782 #
10. jart ◴[] No.35399782{4}[source]
Well that's exactly the thing. They haven't. We're talking about a group of people here who live inside scientific papers and jupyter notebooks. They're able to make machines literally think, but you'd be pushing them out of their comfort zone if you stuck them in front of something like Emacs with C. Some people like GG, Jeff Dean, etc. are strong in both skill sets, but they're outliers.
11. shakow ◴[] No.35400498[source]
Good, tell me how your company will be doing.

What people sometimes fail to understand is that code is a means to an end, not an end in itself.

If you want to make code for itself, work on an opensource and/or personal project. If you are paid to work on something, you're paid for the something to get out, not for it to feature the best code ever.

replies(1): >>35400892 #
12. pdntspa ◴[] No.35400892{3}[source]
With the margins that tech makes, many companies could certainly afford to care more about code quality. But they don't; instead the money gets stuffed into cash reserves where it sits idle, doing nothing but enriching shareholders.

Or it goes to hiring useless business people to install around the periphery of engineering. Which is funny because now tech is letting all those folks go.