For VS Code I use continue.dev, since it lets me set my own (short) system prompt. I get around 50 tokens/s generation and 550 t/s prompt processing.
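For reference, the system prompt goes in Continue's config as a per-model `systemMessage`. A minimal sketch (assuming a local llama.cpp server on port 8080; the model name and prompt text here are just placeholders):

```json
{
  "models": [
    {
      "title": "Local llama.cpp",
      "provider": "llama.cpp",
      "model": "local-model",
      "apiBase": "http://localhost:8080",
      "systemMessage": "You are a terse coding assistant. Answer with code first, minimal prose."
    }
  ]
}
```

Keeping the system prompt short matters for local setups: it gets prepended to every request, so a long prompt eats into that 550 t/s prompt-processing budget on each turn.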
When given well-defined, small tasks, it is as good as any frontier model.
I like the speed, the low latency, and having it available on a plane or train, or off-grid.
The llama.cpp VS Code plugin also gives decent FIM (fill-in-the-middle) completion.
If I need more intelligence, my personal favourites are Claude and DeepSeek via API.