And this example does not even illustrate long-context understanding well, since smaller Qwen2.5 models can already recall parts of the Three-Body Problem trilogy without the three books being pasted into the context window.
And multiple summaries of each book (in multiple languages) are almost certainly in the training set. I'm more confused about how it produced such inaccurate, poorly structured summaries given both that and the original text.
That said, I just tried the regular Qwen2.5 72B and Qwen2.5-Coder 32B, and they only did a little better.