
1311 points msoad | 1 comments
jart ◴[] No.35393615[source]
Author here. For additional context, please read https://github.com/ggerganov/llama.cpp/discussions/638#discu... The loading time performance has been a huge win for usability, and folks have been having the most wonderful reactions after using this change. But we don't have a compelling enough theory yet to explain the RAM usage miracle. So please don't get too excited just yet! Yes things are getting more awesome, but like all things in science a small amount of healthy skepticism is warranted.
replies(24): >>35393868 #>>35393942 #>>35394089 #>>35394097 #>>35394107 #>>35394203 #>>35394208 #>>35394244 #>>35394259 #>>35394288 #>>35394408 #>>35394881 #>>35395091 #>>35395249 #>>35395858 #>>35395995 #>>35397318 #>>35397499 #>>35398037 #>>35398083 #>>35398427 #>>35402974 #>>35403334 #>>35468946 #
intelVISA ◴[] No.35394288[source]
Didn't expect to see two titans today: ggerganov AND jart. Can y'all slow down? You make us mortals look bad :')

Seeing such clever use of mmap makes me dread to imagine how much Python spaghetti probably tanks OpenAI's and other "big ML" shops' infra when they should've trusted in zero copy solutions.

Perhaps SWE is dead after all, but LLMs didn't kill it...

replies(11): >>35395112 #>>35395145 #>>35395165 #>>35395404 #>>35396298 #>>35397484 #>>35398972 #>>35399367 #>>35400001 #>>35400090 #>>35456064 #
shakow ◴[] No.35395404[source]
> Perhaps SWE is dead after all, but LLMs didn't kill it...

Cheap electronics did. 32GB of RAM is maybe $150, a developer converting & maintaining your system to use mmap is $150k/year.

replies(2): >>35395692 #>>35397724 #
pdntspa ◴[] No.35397724[source]
So let's toss management and go write good code for the principle of it, and not business bullshit calculus
replies(1): >>35400498 #
shakow ◴[] No.35400498[source]
Good, let me know how your company does.

What people sometimes fail to understand is that code is a means to an end, not an end in itself.

If you want to write code for its own sake, work on an open-source and/or personal project. If you are paid to work on something, you're paid for the something to get out, not for it to feature the best code ever.

replies(1): >>35400892 #
1. pdntspa ◴[] No.35400892[source]
With the margins that tech makes, many companies could certainly afford to care more about code quality. But they don't. Instead the money gets stuffed into cash reserves where it sits idle, doing nothing but enriching shareholders.

Or it goes to hiring useless business people to install around the periphery of engineering. Which is funny because now tech is letting all those folks go.