Apart from using a Mac, what can you use for inference with reasonable performance? Is a Mac the only realistic option at the moment?
replies(6):
Where a Mac may beat the above is on the memory side: if a model requires more than 24 or 32 GB of GPU memory, you are usually better off with a Mac with 64 or 128 GB of RAM. On a Mac, the memory is shared between the CPU and GPU, so the GPU can load larger models.
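
To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It only estimates the size of the weights (parameter count times bits per weight) and pads it with a 20% allowance for KV cache and runtime buffers. That `overhead` factor and the helper names are my own assumptions for illustration, not measured values, and real usage depends on context length and the runtime you use.

```python
def weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GB (10^9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


def fits(params_billions: float, bits_per_weight: float,
         memory_gb: float, overhead: float = 1.2) -> bool:
    """Rough check: do the weights plus an assumed overhead fit in memory_gb?

    overhead=1.2 is a guess covering KV cache and buffers, not a measurement.
    """
    return weights_gb(params_billions, bits_per_weight) * overhead <= memory_gb


if __name__ == "__main__":
    # Hypothetical example: a 70B-parameter model quantized to 4 bits per weight.
    for memory_gb in (24, 32, 64, 128):
        verdict = "fits" if fits(70, 4, memory_gb) else "does not fit"
        print(f"70B @ 4-bit in {memory_gb} GB: {verdict}")
```

Under these assumptions a 70B model at 4 bits needs about 35 GB for the weights alone, which is already past a 24 or 32 GB card but well within a 64 GB Mac. Keep in mind that the OS also needs part of the shared memory, so treat the full RAM figure as an upper bound.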