DeepSeek-v3.1-Terminus

(api-docs.deepseek.com)
101 points by meetpateltech | 2 comments
binary132 No.45335081
sure would be neat if these companies would release models that could run on consumer hardware
replies(2): >>45335259 >>45335434
1. __mharrison__ No.45335434
I'm using Qwen3-Next on my MBP. It uses around 42 GB of memory and, according to Aider benchmarks, its performance is similar to GPT-4.1. (A minimal way to run it is sketched after the link below.)

https://huggingface.co/mlx-community/Qwen3-Next-80B-A3B-Inst...
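For anyone who wants to try this, a minimal sketch using the mlx-lm CLI. Untested here, and the exact repo id is an assumption (the link above is truncated); the 4-bit quant needs roughly the 42 GB of unified memory mentioned above:

  # install the MLX LM runner, then generate from the 4-bit quant
  pip install mlx-lm
  mlx_lm.generate \
    --model mlx-community/Qwen3-Next-80B-A3B-Instruct-4bit \
    --prompt "Explain mixture-of-experts routing in two sentences." \
    --max-tokens 256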

replies(1): >>45354338
2. binary132 No.45354338
Just waiting on llama.cpp support :)

I usually use gpt-oss-120b with the MoE experts offloaded to CPU. It writes at about 10 tokens/s, which is useful enough for the limited things I use it for. But I'm curious how Qwen3-Next will work, and whether I'll be able to offload the experts and still get GPU acceleration at all (see the sketch below).

(4090)
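For reference, a sketch of that offloading setup with llama.cpp. The GGUF filename is a placeholder, and the MoE placement flags only exist in recent builds:

  # Load all layers on the GPU (-ngl 99), then push the MoE expert
  # tensors back to system RAM so they run on the CPU:
  llama-server -m gpt-oss-120b.gguf -ngl 99 --cpu-moe

  # Roughly equivalent tensor-placement override, using a regex over
  # tensor names (matches the blk.N.ffn_*_exps expert weights):
  llama-server -m gpt-oss-120b.gguf -ngl 99 -ot "ffn_.*_exps=CPU"

On a 24 GB card like the 4090, --n-cpu-moe N (keep only the first N layers' experts on CPU) should let you trade VRAM for speed, assuming the same flags apply once Qwen3-Next support lands.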