268 points by Areibman | 1 comment

Hey HN! Tokencost is a utility library for estimating LLM costs. There are hundreds of different models now, and they all have their own pricing schemes. It’s difficult to keep up with the pricing changes, and it’s even more difficult to estimate how much your prompts and completions will cost until you see the bill.

Tokencost works by counting the tokens in prompt and completion messages and multiplying those counts by the corresponding per-token prices for the model. Under the hood, it's really just a simple cost dictionary and some utility functions for getting the prices right. It also accounts for different tokenizers and float precision errors.
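The core idea is simple enough to sketch. Here's a minimal illustration using tiktoken for tokenization; the PRICES table and estimate_cost helper are hypothetical examples, not Tokencost's actual API, and the dollar figures are placeholders, not current rates:

    # Minimal sketch: count tokens, multiply by per-token prices.
    # PRICES and estimate_cost are illustrative, not Tokencost's API,
    # and the USD figures below are examples, not current rates.
    import tiktoken

    PRICES = {  # USD per token (hypothetical)
        "gpt-4o": {"prompt": 5e-06, "completion": 15e-06},
    }

    def estimate_cost(prompt: str, completion: str, model: str) -> float:
        enc = tiktoken.encoding_for_model(model)
        price = PRICES[model]
        return (len(enc.encode(prompt)) * price["prompt"]
                + len(enc.encode(completion)) * price["completion"])

    print(estimate_cost("What is the capital of France?", "Paris.", "gpt-4o"))

The real library also has to handle models with other tokenizers and keep the price dictionary up to date as providers change their rates, which is most of the work.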

Surprisingly, most model providers don't actually report how much you spend until your bills arrive. We built Tokencost internally at AgentOps to help users track agent spend, and we decided to open source it to help developers avoid nasty bills.

yumaueno:
What a nice product! I think the way tokens are counted depends on the language, but does this support languages other than English?
lgessler:
Most LLMs determine their token inventories by using byte-pair encoding (BPE), which algorithmically induces sub-word tokens from a body of text. So even in English you might see a word like "proselytization" tokenized into "_pro", "selyt", "iz", and "ation", and non-English languages will probably (depending on their proportional representation in the training corpus) also receive token allocations in the BPE vocabulary.

Here's actual output from the GPT-4o tokenizer for English and Hindi:

    >>> import tiktoken
    >>> enc = tiktoken.encoding_for_model("gpt-4o")
    >>> [enc.decode([x]) for x in enc.encode("proselytization")]
    ['pros', 'ely', 't', 'ization']
    >>> [enc.decode([x]) for x in enc.encode("पर्यावरणवाद")]
    ['पर', '्य', 'ावरण', 'वाद']