Has anyone successfully run a quantized version of any of the Qwen2.5-VL series of models?
I've run the smallest model unquantized, but when I tried an AWQ version of one of the bigger models I struggled to find a combination of library versions that works, even though the quantized model should fit on my GPU.