
129 points | jxmorris12 | 1 comment
1. emmanueloga_
LeCun's thesis: "if we generate outputs that are too long, the per-token error will compound to inevitable failure".

> The finding that language models can get better by generating longer outputs directly contradicts Yann’s hypothesis.

The author's examples only show that the error stays low for a handful of outputs of a particular length. That doesn't contradict LeCun's claim about sufficiently long outputs, afaict.
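
For concreteness, the compounding argument is just exponential decay: if each token independently goes wrong with some probability eps, then an n-token output is fully correct with probability (1 - eps)^n. A minimal sketch (the error rate and lengths below are made-up illustrative numbers, not figures from the article or from LeCun):

    # Illustrative sketch of the error-compounding argument.
    # Assumption: each token independently goes wrong with probability eps,
    # so an n-token output is fully correct with probability (1 - eps) ** n.

    def p_success(eps: float, n: int) -> float:
        """Probability that all n tokens are generated without error."""
        return (1.0 - eps) ** n

    eps = 1e-3  # hypothetical per-token error rate
    for n in (100, 1_000, 10_000, 100_000):
        print(f"n={n:>7,}: P(all tokens correct) = {p_success(eps, n):.4f}")

    # n=    100: P(all tokens correct) = 0.9048
    # n=  1,000: P(all tokens correct) = 0.3677
    # n= 10,000: P(all tokens correct) = 0.0000
    # n=100,000: P(all tokens correct) = 0.0000

With eps = 0.001, success is still likely at a few hundred tokens but essentially impossible at tens of thousands, so a few good outputs at one moderate length don't by themselves settle the question.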