
114 points cmcconomy | 4 comments | | HN request time: 0.68s | source
swazzy ◴[] No.42174873[source]
Note unexpected three body problem spoilers in this page
replies(2): >>42175014 #>>42175102 #
1. johndough ◴[] No.42175102[source]
And this example does not even illustrate long-context understanding well, since smaller Qwen2.5 models can already recall parts of the Three-Body Problem trilogy without the three books being pasted into the context window.
replies(2): >>42175250 #>>42176842 #
2. gs17 ◴[] No.42175250[source]
And multiple summaries of each book (in multiple languages) are almost certainly in the training set. I'm more confused about how it produced such inaccurate, poorly structured summaries given that and the original text.

Although, I just tried with normal Qwen 2.5 72B and Coder 32B and they only did a little better.

3. agildehaus ◴[] No.42176842[source]
It seems a very difficult problem to produce a response based only on the given text and not on past training. An LLM that could do that would be considerably more advanced than what we have today.

Though I would say humans would have difficulty too -- say, having read The Three-Body Problem before, then reading a slightly modified version (without being aware of the modifications), and having to recall specific details.

replies(1): >>42177309 #
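The modified-text test described above can be sketched in code. This is a minimal, hypothetical harness, not anything from the thread: it perturbs a well-known detail in the source passage so the pasted context and the training data disagree, then classifies the model's answer by which version it reports. The model call itself is left out; only the perturbation and grading steps are shown, and the example passage is a neutral stand-in, not actual novel text.

```python
def perturb(text: str, original: str, modified: str) -> str:
    """Replace a known detail so context and training data disagree."""
    assert original in text, "detail to perturb must appear in the text"
    return text.replace(original, modified)

def grade(answer: str, original: str, modified: str) -> str:
    """Classify where the model's answer likely came from."""
    if modified in answer and original not in answer:
        return "context"    # answer matches the perturbed, pasted text
    if original in answer and modified not in answer:
        return "training"   # answer falls back on memorized knowledge
    return "inconclusive"   # mentions both or neither

# Example with a stand-in passage (hypothetical, not from the books):
passage = "The expedition reached the summit on 29 May 1953."
perturbed = perturb(passage, "29 May 1953", "3 June 1953")
# `perturbed` would be pasted into the context window; the model's
# answer to "When did the expedition reach the summit?" is then graded.
```

A model that answers "3 June 1953" is reading the context; one that answers "29 May 1953" is recalling training data -- which is exactly the ambiguity the plain summarization demo can't distinguish.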
4. botanical76 ◴[] No.42177309[source]
This problem is poorly defined; what would it mean to produce a response based JUST on the text given? Should the model also forgo all the logic skills and intuition gained in training because they are not in the text given? Where in the N-dimensional semantic space do we draw a line (or rather, a surface) between general, universal understanding and specific knowledge about the subject at hand?

That said, once you have defined what is required, I believe you will have solved the problem.