I use LM Studio with GGUF models in VS Code via Continue.dev, running on either my Apple MacBook Air M1 (it’s ok…) or my Alienware x17 R2 with an RTX 3080 and a Core i9 (runs like autocomplete).
My only complaint is that agent mode needs fast token generation, so I only use agent mode on the RTX machine.
I grew up on 9600 baud, so I’m cool with watching the text crawl.
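
For a rough feel of why agent mode is so sensitive to token generation speed, here is a back-of-the-envelope sketch. The step count, tokens per step, and tokens-per-second figures are made-up assumptions for illustration, not measurements from either machine, and `agent_run_seconds` is a hypothetical helper, not part of LM Studio or Continue.dev.

```python
def agent_run_seconds(steps: int, tokens_per_step: int, tokens_per_second: float) -> float:
    """Rough time spent generating tokens over a multi-step agent run.

    Ignores prompt processing and tool-call latency, so real runs take longer.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return steps * tokens_per_step / tokens_per_second


# Hypothetical numbers, chosen only to show how the gap compounds.
STEPS = 10
TOKENS_PER_STEP = 400

for label, rate in [("slower machine", 8.0), ("faster machine", 60.0)]:
    secs = agent_run_seconds(STEPS, TOKENS_PER_STEP, rate)
    print(f"{label}: about {secs / 60:.1f} min of generation")
```

A single chat reply is one step, so a slow rate is tolerable there. An agent run chains many steps, so the same slow rate adds up to minutes of waiting.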