> We have extended the model’s context length from 128k to 1M, which is approximately 1 million English words
Actually, English-language tokenizers map on average about 3 words to every 4 tokens, so 1M tokens works out to roughly 750K English words (1,000,000 × 3/4), not a million as claimed.
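A quick sanity check of that ratio in Python. The rule-of-thumb conversion is just arithmetic; the empirical check assumes the `tiktoken` library and its `cl100k_base` encoding as one representative English tokenizer (neither is mentioned above), and the measured ratio will vary with the tokenizer and the text:

```python
# Rule of thumb: ~4 tokens per 3 English words, i.e. ~0.75 words per token.
tokens = 1_000_000
print(f"{tokens * 3 / 4:,.0f} words")  # -> 750,000 words

# Optional empirical check with a real tokenizer (pip install tiktoken).
import tiktoken

text = "The quick brown fox jumps over the lazy dog. " * 200
enc = tiktoken.get_encoding("cl100k_base")
n_words = len(text.split())
n_tokens = len(enc.encode(text))
# Prints the measured words-per-token ratio for this particular sample text.
print(f"{n_words} words -> {n_tokens} tokens "
      f"({n_words / n_tokens:.2f} words per token)")
```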