
172 points by marban | 1 comment
Aissen No.40052746
A quick search shows that this Ryzen AI NPU isn't supported by upstream inference frameworks yet, so right now it's just useless silicon area you pay for :-/
replies(3): >>40052844, >>40053100, >>40060474
dhruvdh No.40053100
There is a VitisAI execution provider for ONNX Runtime, so any inference framework that can use an ONNX Runtime backend can route through it. More info here: https://ryzenai.docs.amd.com/en/latest/
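
A minimal sketch of what that looks like from Python. The provider name is ONNX Runtime's VitisAI EP; the "vaip_config.json" path and the input shape are placeholders I'm assuming, not something from the docs above:

    # Rough sketch (untested): run an ONNX model through ONNX Runtime's
    # VitisAI execution provider, falling back to CPU for unsupported ops.
    # The "vaip_config.json" path and the input shape are placeholders.
    import numpy as np
    import onnxruntime as ort

    session = ort.InferenceSession(
        "model.onnx",
        providers=["VitisAIExecutionProvider", "CPUExecutionProvider"],
        provider_options=[{"config_file": "vaip_config.json"}, {}],
    )

    name = session.get_inputs()[0].name
    x = np.random.rand(1, 3, 224, 224).astype(np.float32)  # assumed NCHW input
    print(session.run(None, {name: x})[0].shape)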

But regardless, 16 TOPS is no good for LLMs. There is a Ryzen AI demo that shows Llama 7B running on these chips at 8 tokens/sec, though. A sub-par experience for a sub-par LLM.
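
For a sense of scale, a quick back-of-envelope of my own (the ~2 FLOPs per parameter per decoded token rule of thumb is an assumption, not a figure from the demo):

    # Back-of-envelope (my assumptions, not AMD's numbers): why 8 tok/s
    # probably isn't a compute limit.
    params = 7e9                  # Llama 7B parameter count
    flops_per_token = 2 * params  # ~2 FLOPs per parameter per decoded token
    tokens_per_sec = 8            # rate reported by the Ryzen AI demo
    tflops_used = flops_per_token * tokens_per_sec / 1e12
    print(f"~{tflops_used:.2f} TFLOP/s used vs ~16 TOPS peak")
    # ~0.11 TFLOP/s, i.e. single-stream decode at this size is likely
    # memory-bandwidth bound rather than limited by the NPU's 16 TOPS.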

replies(3): >>40054182, >>40054664, >>40142456
markdog12 No.40054664
Wow, that's simply embarrassing.