268 points by Areibman | 4 comments

Hey HN! Tokencost is a utility library for estimating LLM costs. There are hundreds of different models now, and they all have their own pricing schemes. It’s difficult to keep up with the pricing changes, and it’s even more difficult to estimate how much your prompts and completions will cost until you see the bill.

Tokencost works by counting the number of tokens in the prompt and completion messages and multiplying that count by the model’s per-token price. Under the hood, it’s really just a cost dictionary and some utility functions for getting the prices right. It also accounts for different tokenizers and float precision errors.
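
A minimal sketch of that loop (not Tokencost's actual API; the model name and prices below are illustrative placeholders, not current rates):

    import tiktoken

    # Hypothetical per-token USD prices (input, output) -- placeholders, not real rates.
    PRICES = {"gpt-4-0613": (30e-6, 60e-6)}

    def estimate_cost(prompt: str, completion: str, model: str) -> float:
        enc = tiktoken.encoding_for_model(model)
        in_price, out_price = PRICES[model]
        return (len(enc.encode(prompt)) * in_price
                + len(enc.encode(completion)) * out_price)

    print(estimate_cost("Hello, world!", "Hi there!", "gpt-4-0613"))

A real implementation would also carry prices as Decimal (or integer fractions of a cent) rather than floats, which is the precision issue mentioned above.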

Surprisingly, most model providers don't actually report how much you spend until your bills arrive. We built Tokencost internally at AgentOps to help users track agent spend, and we decided to open source it to help developers avoid nasty bills.

refulgentis [dead post] No.40714113
[flagged]
1. J_Shelby_J No.40718153
I’m not sure if the Python tiktoken library has the o200k tokenizer for gpt-4o, but I would imagine it does. So this library does support gpt-4o, at least.
replies(1): >>40718266
2. refulgentis No.40718266
Yes, tiktoken has it, and no, this library doesn't support gpt-4o.

It is exactly as bad a situation as I laid out.

It is a tiktoken wrapper that only does cl100k, doesn't bother with anything beyond that, not even the message frame tokens, and claims to calculate costs for 400 LLMs.

replies(1): >>40719389
3. J_Shelby_J No.40719389
> tiktoken.encoding_for_model(model)

Calling this with model == 'gpt-4o' will encode with o200k, no?
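
(Quick check, assuming a recent tiktoken -- the o200k_base encoding and the gpt-4o mapping were added in tiktoken 0.7.0:)

    import tiktoken

    # On tiktoken >= 0.7.0 this resolves gpt-4o to o200k_base;
    # older versions don't know the model and raise a KeyError instead.
    enc = tiktoken.encoding_for_model("gpt-4o")
    print(enc.name)  # "o200k_base"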

But yes, I do agree with you. I had a hard time implementing non-tiktoken tokenizers for my project. I ended up manually adding tokenizer.json files into my repo.[1] The other option is downloading from HF, but the official repos where a model's tokenizer.json lives require agreeing to their terms to access, so it requires an HF key and accepting the terms. Not a good experience for a consumer of the package.
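
(A minimal sketch of that in-repo approach using the `tokenizers` library -- the path is a hypothetical example, not the actual layout in [1]:)

    from tokenizers import Tokenizer

    # Load a tokenizer.json vendored into the repo: no HF key, no terms to accept.
    tok = Tokenizer.from_file("src/models/llama3/tokenizer.json")  # hypothetical path
    print(len(tok.encode("Hello, world!").ids))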

> Message frame tokens?

Do you mean the chat template tokens? Oh, that's another good point. Yeah, it counts OpenAI prompt tokens, but you're right, it doesn't count chat template tokens, so that's another source of inaccuracy. I solved this by using a Jinja templating engine to render the full prompt [2] (a toy sketch follows after the links below). Granted, both llama.cpp and mistral-rs do this on the backend, so it's purely for counting tokens. I guess it would make sense to add a function to convert tokens to dollars.

[1] https://github.com/ShelbyJenkins/llm_utils/tree/main/src/mod... [2] https://github.com/ShelbyJenkins/llm_utils/blob/main/src/pro...
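
(A toy sketch of that render-then-count approach -- not the code in [2]; the ChatML-style template is illustrative, not any particular model's real template, and tiktoken stands in for the model's own tokenizer:)

    import jinja2
    import tiktoken

    # Illustrative ChatML-style chat template.
    CHATML = (
        "{% for m in messages %}"
        "<|im_start|>{{ m.role }}\n{{ m.content }}<|im_end|>\n"
        "{% endfor %}"
        "<|im_start|>assistant\n"
    )

    messages = [
        {"role": "system", "content": "You are helpful."},
        {"role": "user", "content": "Hello!"},
    ]

    # Render the full prompt, then count tokens on the rendered string,
    # so the template overhead is included in the count.
    prompt = jinja2.Template(CHATML).render(messages=messages)
    enc = tiktoken.get_encoding("cl100k_base")
    print(len(enc.encode(prompt)))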

replies(1): >>40723841
4. refulgentis No.40723841
>> tiktoken.encoding_for_model(model)
> Calling this with model == 'gpt-4o' will encode with o200k, no?

No, it will never use o200k. I don't know how to describe where the problem is located without sounding aggro, apologies: read below, i.e. the rest of the method.

They copied demo code for tiktoken that has an allowlist without gpt-4o in it, because the demo code is from before 4o.

The demo code has an allowlist that does string matching, and if the model isn't one of five models, none of which is gpt-4o, it says "eh, if it starts with gpt-4, just use gpt-4-0613" and makes a recursive call.
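
From memory, the shape of that demo code is roughly this (a paraphrase with the allowlist abbreviated, not the exact snippet):

    import tiktoken

    def num_tokens_from_messages(messages, model):
        if model in {"gpt-3.5-turbo-0613", "gpt-4-0314", "gpt-4-0613"}:  # no gpt-4o
            encoding = tiktoken.encoding_for_model(model)  # cl100k_base for all of these
            total = 3  # every reply is primed with an assistant header
            for m in messages:
                total += 3  # per-message frame tokens
                for value in m.values():
                    total += len(encoding.encode(value))
            return total
        elif "gpt-4" in model:
            # "gpt-4o" contains "gpt-4", so it silently recurses with
            # gpt-4-0613 -- and therefore counts with cl100k, never o200k.
            return num_tokens_from_messages(messages, model="gpt-4-0613")
        raise NotImplementedError(model)

    print(num_tokens_from_messages([{"role": "user", "content": "hi"}], "gpt-4o"))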

You can't really blame them, because all they did was copy demo code that OpenAI published before gpt-4o, but I hope you get a giggle out of what an extreme clown car this situation is. It's a really bad, paper-thin, out-of-date tiktoken wrapper that can only do cl100k and claims support for 400 LLMs.

Really bonkers.

I know you have to read the whole method to get it, but people really shouldn't have just been like "my word! it's mean to say they don't get it!" -- it's horrible.

https://github.com/AgentOps-AI/tokencost/blob/e1d52dbaa3ada2...