
172 points marban | 6 comments
bearjaws ◴[] No.40052158[source]
The focus on TOPS seems a bit out of line with reality for LLMs. TOPS doesn't matter for LLMs if your memory bandwidth can't keep up. Since quad-channel memory isn't mentioned, I guess it's still dual channel?

Even top-of-the-line DDR5 is around 128 GB/s vs. an M1 Max at 400 GB/s.
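
Back-of-the-envelope (assuming decode is memory-bandwidth-bound and every weight is read once per token; the model size is illustrative):

    // Rough sketch: tokens/sec ≈ memory bandwidth / bytes of weights read per token.
    // Assumes a hypothetical 7B-parameter model quantized to 4 bits (~3.5 GB).
    const modelBytes = 7e9 * 0.5;
    const tokensPerSec = (bwGBps: number) => (bwGBps * 1e9) / modelBytes;
    console.log(tokensPerSec(128).toFixed(0)); // dual-channel DDR5: ~37 tok/s
    console.log(tokensPerSec(400).toFixed(0)); // 400 GB/s unified memory: ~114 tok/s

In that regime the arithmetic units mostly sit idle waiting on memory, which is why TOPS alone is a poor headline number.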

At the end of the day, it still seems like AI in consumer chips is chasing a buzzword. What is the killer feature?

On mobile there are image-processing benefits, voice-to-text, translation... but on desktop those are nowhere near common use cases.

replies(3): >>40052204 #>>40052260 #>>40052353 #
VHRanger ◴[] No.40052260[source]
The killer feature is presumably inference at the edge, but I don't see that being used on desktop much at all right now.

Especially since most desktop applications people use are web apps. Of the native apps people use that leverage this sort of stuff, almost all are GPU-accelerated already (e.g. image and video editing AI tools).

replies(1): >>40052360 #
1. jzig ◴[] No.40052360[source]
What does “at the edge” mean here?
replies(4): >>40052515 #>>40052529 #>>40052531 #>>40052991 #
2. georgeecollins ◴[] No.40052515[source]
Not using AI on the cloud. So if your connection is uncertain, or you want to use your bandwidth for something else, like video conferencing or gaming. Probably the killer app is something that wants to use AI but doesn't involve paying a cloud provider. I was talking to a vendor about their chatbot built to be embedded in MMOs or mobile games. It would be killer to have a character hold a lifelike conversation in those kinds of experiences. But the last thing you want to do is increase your server costs the way this AI would. Edge computing could solve that.
3. PeterSmit ◴[] No.40052529[source]
Not in the cloud.
4. Zach_the_Lizard ◴[] No.40052531[source]
I'm guessing "the edge" is doing inference work in the browser, etc. as opposed to somewhere in the backend of the web app.

Maybe your local machine can run, I don't know, a model to make suggestions as you're editing a Google Doc, which frees up the Big Machine in the Sky to do other things.

As this becomes more technically feasible, it reduces the effective cost of inference for a new service provider, since you, the client, are now running their code.
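
As a rough sketch of what that could look like with a real in-browser inference library (Transformers.js here; the model choice is just an example):

    // Hedged sketch: on-device text generation in the browser with Transformers.js.
    // Weights are downloaded once, cached, and run locally (WASM/WebGPU),
    // so there is no server round-trip per request.
    import { pipeline } from '@xenova/transformers';

    const generator = await pipeline('text-generation', 'Xenova/gpt2');
    const [suggestion] = await generator('The meeting has been moved to', {
      max_new_tokens: 10,
    });
    console.log(suggestion.generated_text);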

The Jevons paradox might kick in, opening up more and more LLM use cases that were too expensive before.

5. VHRanger ◴[] No.40052991[source]
Edge means doing the computing on the client (e.g. browser, phone, laptop) instead of the server.
replies(1): >>40055563 #
6. Dylan16807 ◴[] No.40055563[source]
Half the definitions I see of edge include client devices, and half of them don't include client devices.

I like the latter. Why even use a new word if it's just going to be the same as "client"?