To actually use a model, you need a context window. Realistically, you'll want a 20GB GPU or larger, depending on how many tokens you need.