What hardware are y'all using when you run these things locally? I was thinking of preordering the Framework Desktop[0] for this purpose, but I wouldn't mind having a decent laptop that could run it (ideally running Linux).
replies(4):
The same page also gives instructions for running the model through vLLM on a GPU, but that path doesn't seem to support quantization, so it may require multiple GPUs: the instructions say "with at least 2 GPUs".
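
For a rough sense of why the lack of quantization pushes you toward multiple GPUs, here is a back-of-envelope sketch. Everything in it is an assumption for illustration, not something from the page: the 24B parameter count, the 24 GB cards, the 20% overhead factor for KV cache and activations, and the helper name `min_gpus` are all hypothetical.

```python
import math

def min_gpus(params_billion: float, bytes_per_param: float,
             vram_gb: float, overhead: float = 1.2) -> int:
    """Rough number of GPUs needed to hold the model weights.

    overhead is an assumed fudge factor for KV cache and activations.
    """
    # 1e9 params * bytes per param / 1e9 bytes per GB = GB of weights
    weights_gb = params_billion * bytes_per_param
    return math.ceil(weights_gb * overhead / vram_gb)

# Hypothetical 24B-parameter model on hypothetical 24 GB cards.
print(min_gpus(24, 2.0, 24))  # 16-bit weights (no quantization)
print(min_gpus(24, 0.5, 24))  # 4-bit quantized weights
```

With those made-up numbers the unquantized weights don't fit on one card while the 4-bit version does, which is the general shape of the tradeoff. Real requirements depend on the actual model, context length, and runtime.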