Has anyone successfully run a quantized version of any of the Qwen2.5-VL series of models?
I've run the smallest model unquantized, but when I tried an AWQ version of one of the bigger models I struggled to find a combination of library versions that works, even though the quantized model should fit on my GPU.