
1311 points msoad | 2 comments
jart ◴[] No.35393615[source]
Author here. For additional context, please read https://github.com/ggerganov/llama.cpp/discussions/638#discu... The loading time performance has been a huge win for usability, and folks have been having the most wonderful reactions after using this change. But we don't have a compelling enough theory yet to explain the RAM usage miracle. So please don't get too excited just yet! Yes things are getting more awesome, but like all things in science a small amount of healthy skepticism is warranted.
replies(24): >>35393868 #>>35393942 #>>35394089 #>>35394097 #>>35394107 #>>35394203 #>>35394208 #>>35394244 #>>35394259 #>>35394288 #>>35394408 #>>35394881 #>>35395091 #>>35395249 #>>35395858 #>>35395995 #>>35397318 #>>35397499 #>>35398037 #>>35398083 #>>35398427 #>>35402974 #>>35403334 #>>35468946 #
intelVISA ◴[] No.35394288[source]
Didn't expect to see two titans today: ggerganov AND jart. Can y'all slow down? You make us mortals look bad :')

Seeing such clever use of mmap makes me dread to imagine how much Python spaghetti probably tanks OpenAI's and other "big ML" shops' infra when they should've trusted in zero copy solutions.

Perhaps SWE is dead after all, but LLMs didn't kill it...
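
For readers unfamiliar with the technique being praised here, a minimal sketch of reading part of a large file through mmap might look like the following. This is not llama.cpp's code: the file layout and the helper names (`write_dummy_weights`, `read_slice_via_mmap`) are made up for illustration, and it only uses Python's standard `mmap` module. The point is that the file is mapped rather than read into a buffer up front, so only the pages actually touched need to be brought into memory.

```python
import mmap


def write_dummy_weights(path, n_bytes):
    """Create a stand-in 'weights' file (hypothetical layout: byte i holds i % 256)."""
    with open(path, "wb") as f:
        f.write(bytes(i % 256 for i in range(n_bytes)))


def read_slice_via_mmap(path, offset, length):
    """Return `length` bytes starting at `offset` without reading the whole file.

    The file is mapped read-only; the OS pages data in lazily when the
    slice is accessed, instead of copying the entire file into a buffer first.
    """
    with open(path, "rb") as f:
        # Length 0 means "map the whole file".
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            # Slicing an mmap object copies only the requested bytes.
            return mapped[offset:offset + length]
```

A caller would write or obtain a file once, then call `read_slice_via_mmap(path, offset, length)` for whichever region it needs; regions that are never touched are never loaded.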

replies(11): >>35395112 #>>35395145 #>>35395165 #>>35395404 #>>35396298 #>>35397484 #>>35398972 #>>35399367 #>>35400001 #>>35400090 #>>35456064 #
shakow ◴[] No.35395404[source]
> Perhaps SWE is dead after all, but LLMs didn't kill it...

Cheap electronics did. 32GB of RAM is maybe $150, a developer converting & maintaining your system to use mmap is $150k/year.

replies(2): >>35395692 #>>35397724 #
xmprt ◴[] No.35395692[source]
This still doesn't make sense. It doesn't take a full year to do optimizations like this. Maybe a month at most if you include the investigation time. And the memory cost is $150 times the number of users, which is in the thousands at least.
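
To make the arithmetic in this exchange concrete, here is a rough sketch using only the figures quoted in the thread ($150 of RAM per user, $150k/year for a developer, about one month of work). The user count of 1,000 is an assumption standing in for "in the thousands at least", and the function names are illustrative.

```python
RAM_COST_PER_USER = 150        # dollars, figure quoted in the thread
DEV_SALARY_PER_YEAR = 150_000  # dollars, figure quoted in the thread
OPTIMIZATION_MONTHS = 1        # "a month at most", per the comment
ASSUMED_USERS = 1_000          # assumption: low end of "thousands"


def dev_cost(months, salary_per_year=DEV_SALARY_PER_YEAR):
    """Cost of developer time for the optimization, in dollars."""
    return salary_per_year * months / 12


def ram_cost(users, cost_per_user=RAM_COST_PER_USER):
    """Total cost if every user buys the extra RAM instead, in dollars."""
    return users * cost_per_user


def break_even_users(months):
    """Number of users at which extra RAM costs as much as the developer time."""
    return dev_cost(months) / RAM_COST_PER_USER


print(dev_cost(OPTIMIZATION_MONTHS))          # developer time for one month
print(ram_cost(ASSUMED_USERS))                # RAM bought across all assumed users
print(break_even_users(OPTIMIZATION_MONTHS))  # users needed to break even
```

Under these assumptions the month of developer time costs far less than the combined RAM purchases, and the break-even point is under a hundred users. The catch, as the replies go on to discuss, is that the RAM is paid for by the users rather than by whoever would fund the developer.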
replies(1): >>35395793 #
jart ◴[] No.35395793[source]
Tragedy of the commons. If you want to do something that benefits everyone a little bit, and you can't productize it like OpenAI's $20/month subscription, then there's no rational economic reason to do it, and you have to wait for someone like me who has an irrational love of coding. It's not a lifestyle that makes you rich, but it does help you see the opportunities to fix problems that the well-resourced folks who are supposed to be solving them would never even notice; in fact, they'd probably think you're trolling them if you ever brought it up.
replies(2): >>35396016 #>>35399394 #
axlee ◴[] No.35396016[source]
Tragedy of the commons only works for things you don't directly pay for.
replies(2): >>35396838 #>>35397255 #
1. nablags ◴[] No.35396838{4}[source]
Well, in a way: open source software is something that you don't directly pay for.
replies(1): >>35397151 #
2. sli ◴[] No.35397151[source]
More saliently, the overwhelming majority of the Linux kernel's direct and extended userbase has contributed nothing at all directly to the Linux kernel, as just one example.