
314 points | pretext | 2 comments
terhechte No.46220981
Is there a way to run these Omni models on a MacBook, quantized via GGUF or MLX? I know I can run them in LM Studio or llama.cpp, but neither has streaming microphone or streaming webcam support.

Qwen usually provides Python example code that requires CUDA and a non-quantized model. I wonder whether there is by now a good open-source project that supports this use case.
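
For what it's worth, the quantized-on-MLX half already works for plain text via mlx-lm; it's the streaming microphone/webcam input I haven't found anywhere. A rough sketch of the text path (the repo id is just a placeholder for whatever 4-bit Qwen build you pick from mlx-community, and the generate signature may shift between mlx-lm releases):

    # Text-only sketch with a quantized Qwen model via mlx-lm (pip install mlx-lm).
    # Placeholder repo id: substitute any 4-bit Qwen build from mlx-community.
    from mlx_lm import load, generate

    model, tokenizer = load("mlx-community/Qwen2.5-7B-Instruct-4bit")

    prompt = "Summarize what an omni (text+audio+vision) model does in one sentence."
    print(generate(model, tokenizer, prompt=prompt, max_tokens=128))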

1. mobilio No.46222558
Yes, there is a way: https://github.com/ggml-org/whisper.cpp
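
whisper.cpp ships a native stream example for continuous mic capture, and that only covers the speech-to-text side. For a Python-side sketch of the microphone piece you could go through the pywhispercpp bindings; I'm assuming here that Model.transcribe accepts an audio file path and that segments expose .text, per that binding's README:

    # Sketch: record a short chunk from the mic and transcribe it with whisper.cpp
    # via the pywhispercpp bindings. Assumes Model("base.en") loads a ggml model
    # and transcribe() accepts an audio file path.
    import sounddevice as sd
    import soundfile as sf
    from pywhispercpp.model import Model

    SAMPLE_RATE = 16000  # whisper models expect 16 kHz mono
    SECONDS = 5

    audio = sd.rec(int(SECONDS * SAMPLE_RATE), samplerate=SAMPLE_RATE,
                   channels=1, dtype="float32")
    sd.wait()  # block until the recording finishes
    sf.write("chunk.wav", audio, SAMPLE_RATE)

    model = Model("base.en")
    for segment in model.transcribe("chunk.wav"):
        print(segment.text)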
2. novaray No.46223198
Whisper and the Qwen Omni models have completely different architectures, as far as I know.