
486 points dbreunig | 1 comment
eightysixfour ◴[] No.41863546[source]
I thought the purpose of these things was not to be fast, but to run small models with very little power. I have a newer AMD laptop with an NPU, and my power usage doesn't change when using the video effects that supposedly run on it, but it goes up when using the Nvidia Studio effects.

It seems like NPUs are for heavily optimized models that do small tasks: eye contact correction, background blur, autocorrect, transcription, and OCR. On Windows in particular, I assumed they were running the full-screen OCR (and maybe the embeddings for search) for the Recall feature.

replies(7): >>41863632 #>>41863779 #>>41863821 #>>41863886 #>>41864628 #>>41864828 #>>41869772 #
1. monkeynotes ◴[] No.41869772[source]
I believe that low power = cheaper tokens = more affordable and sustainable, and that is what a consumer will benefit from overall. Power-hungry GPUs seem to sit better in research, commerce, and enterprise.

The Nvidia killer would be chips and memory that are affordable enough to run a good enough model on a personal device, like a smartphone.

I think the future of this tech, if the general populace buys into LLMs being useful enough to pay a small premium for the device, is personal models that by their nature provide privacy. The amount of personal information folks unload on ChatGPT and the like is astounding. AI virtual girlfriend apps frequently get fed the darkest kinks, vulnerable admissions, and maybe even incriminating conversations, according to Redditors who are addicted to these things. All of this is given away to no-name companies that stand up apps on the app store.

Google even states that if you turn Gemini history on then they will be able to review anything you talk about.

For complex token prediction that requires a bigger model, the personal device could switch to consulting a cloud LLM, but privacy really needs to be ensured for consumers.
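That hybrid setup could be as simple as a router that serves everything from the on-device model and escalates only when the prompt exceeds its capability *and* the user has explicitly opted in. A minimal sketch (every name here is hypothetical, and prompt length stands in for a real capability estimate):

```python
from dataclasses import dataclass


@dataclass
class Reply:
    text: str
    served_by: str  # "local" or "cloud"


def local_model(prompt: str) -> str:
    # Stand-in for a small quantized model running on the NPU.
    return f"[local] {prompt[:40]}"


def cloud_model(prompt: str) -> str:
    # Stand-in for a remote LLM API call.
    return f"[cloud] {prompt[:40]}"


def route(prompt: str, allow_cloud: bool = False,
          complexity_threshold: int = 200) -> Reply:
    """Serve locally by default; go to the cloud only if the prompt
    looks too complex AND the user consented to sending it off-device."""
    too_complex = len(prompt) > complexity_threshold  # crude capability proxy
    if too_complex and allow_cloud:
        return Reply(cloud_model(prompt), "cloud")
    return Reply(local_model(prompt), "local")
```

The key design choice is that the default path never leaves the device: without `allow_cloud=True`, even an over-budget prompt is handled locally rather than silently uploaded.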

I don't believe we need cutting-edge reasoning or party-trick LLMs for day-to-day personal assistance, chat, or information discovery.