"By deploying this implementation locally, it translates to a cost of $0.20/1M output tokens"
Is that just the cost of electricity, or does it include the cost of the GPUs spread out over their predicted lifetime?
replies(3):
Depreciation and the GPU failure rate over time also need to be considered, and I don't see either mentioned in the article.
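To make the distinction concrete, here is a rough sketch of how the two components of a per-token cost could be separated. It assumes straight-line depreciation over a fixed lifetime and a constant throughput. It does not model failure rates, cooling, or hosting. The function name, its parameters, and the example inputs are all hypothetical placeholders, not figures from the article.

```python
def cost_per_million_tokens(power_watts, price_per_kwh, tokens_per_second,
                            hardware_cost=0.0, lifetime_years=0.0,
                            utilization=1.0):
    """Return (electricity, hardware) cost in dollars per 1M output tokens.

    Hypothetical model: constant throughput, straight-line depreciation,
    no failures, no cooling or hosting overhead.
    """
    # Time needed to generate one million tokens at the given throughput.
    hours_per_million = (1_000_000 / tokens_per_second) / 3600

    # Energy cost: kW * hours * $/kWh.
    electricity = (power_watts / 1000) * hours_per_million * price_per_kwh

    # Hardware cost spread evenly over every token produced in its lifetime.
    hardware = 0.0
    if hardware_cost > 0 and lifetime_years > 0:
        productive_seconds = lifetime_years * 365 * 24 * 3600 * utilization
        lifetime_tokens = productive_seconds * tokens_per_second
        hardware = hardware_cost / lifetime_tokens * 1_000_000

    return electricity, hardware


# Placeholder inputs only; substitute measured values for a real estimate.
electricity, hardware = cost_per_million_tokens(
    power_watts=1000, price_per_kwh=0.10, tokens_per_second=1000,
    hardware_cost=30_000, lifetime_years=4, utilization=0.5,
)
print(f"electricity: ${electricity:.4f}/1M tokens")
print(f"hardware:    ${hardware:.4f}/1M tokens")
```

If the article's $0.20 figure only covers the first number, the second could change the total noticeably, depending on the hardware price, the lifetime, and how busy the GPUs are kept.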