DeepSeek-v3.1-Terminus

(api-docs.deepseek.com)
101 points by meetpateltech | 2 comments
binary132 No.45335081
sure would be neat if these companies would release models that could run on consumer hardware
replies(2): >>45335259 >>45335434
1. __mharrison__ No.45335434
I'm using Qwen3-Next on my MBP. It uses around 42 GB of memory and, according to Aider benchmarks, its performance is similar to GPT-4.1. (A minimal way to run it is sketched after the link below.)

https://huggingface.co/mlx-community/Qwen3-Next-80B-A3B-Inst...
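For anyone who wants to try this, a minimal sketch using the mlx-lm CLI. Untested here, and the exact repo id is an assumption (the link above is truncated); the 4-bit quant needs roughly the 42 GB of unified memory mentioned above:

  # install the MLX LM runner, then generate from the 4-bit quant
  pip install mlx-lm
  mlx_lm.generate \
    --model mlx-community/Qwen3-Next-80B-A3B-Instruct-4bit \
    --prompt "Explain mixture-of-experts routing in two sentences." \
    --max-tokens 256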

replies(1): >>45354338
2. binary132 No.45354338
Just waiting on llama.cpp support :)

I usually use gpt-oss-120b with the MoE experts offloaded to CPU. It writes at about 10 tokens/s, which is useful enough for the limited things I use it for. But I'm curious how Qwen3-Next will work, and whether I'll be able to offload the experts and still get GPU acceleration at all (see the sketch below).

(4090)
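For reference, a sketch of that offloading setup with llama.cpp. The GGUF filename is a placeholder, and the MoE placement flags only exist in recent builds:

  # Load all layers on the GPU (-ngl 99), then push the MoE expert
  # tensors back to system RAM so they run on the CPU:
  llama-server -m gpt-oss-120b.gguf -ngl 99 --cpu-moe

  # Roughly equivalent tensor-placement override, using a regex over
  # tensor names (matches the blk.N.ffn_*_exps expert weights):
  llama-server -m gpt-oss-120b.gguf -ngl 99 -ot "ffn_.*_exps=CPU"

On a 24 GB card like the 4090, --n-cpu-moe N (keep only the first N layers' experts on CPU) should let you trade VRAM for speed, assuming the same flags apply once Qwen3-Next support lands.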