
233 points JnBrymn | 1 comment | source
varispeed ◴[] No.45676118[source]
Text is linear, whereas an image is parallel. What I mean is that when people read, they often don't scan the text from left to right (or in another direction, depending on the language), but rather take it in all at once or non-linearly: they first lock onto keywords, then read the adjacent words to get the meaning, often skipping filler sentences unconsciously.

Sequential reading of text is very inefficient.

replies(4): >>45676232 #>>45676919 #>>45677443 #>>45677649 #
sosodev ◴[] No.45676232[source]
LLMs don't "read" text sequentially, right?
replies(1): >>45676349 #
olliepro ◴[] No.45676349[source]
The causal masking means future tokens don’t affect previous tokens’ embeddings as they evolve through the model, but all tokens are processed in parallel… so, yes and no. See this previous HN post (https://news.ycombinator.com/item?id=45644328) about how bidirectional encoders are similar to diffusion’s non-linear way of generating text. Vision transformers use bidirectional encoding because of the non-causal nature of image pixels.
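A minimal sketch of that distinction, assuming a single PyTorch attention head (my own illustration, not from the linked post): the whole sequence goes through one matrix multiply in parallel, and the causal mask is just a triangular block of -inf that stops each position from attending to anything to its right.

```python
import torch

def causal_attention(q, k, v):
    # q, k, v: (seq_len, d) tensors for one head; every position is computed in parallel
    seq_len, d = q.shape
    scores = q @ k.T / d ** 0.5                                    # (seq_len, seq_len)
    future = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(future, float("-inf"))             # hide tokens to the right
    return torch.softmax(scores, dim=-1) @ v                       # row i mixes only tokens 0..i

# A bidirectional encoder (BERT-style text, or a vision transformer over image patches)
# is the same computation with the masked_fill line removed.
```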
replies(1): >>45676819 #
Merik ◴[] No.45676819{3}[source]
Didn’t Anthropic show that the models engage in a form of planning, such that predicting possible future tokens affects the prediction of the next token? https://transformer-circuits.pub/2025/attribution-graphs/bio...
replies(1): >>45677066 #
ACCount37 ◴[] No.45677066{4}[source]
Sure, an LLM can start "preparing" for token N+4 at token N. But that doesn't change the fact that token N can't "see" N+1.

Causality is enforced in LLMs - past tokens can affect future tokens, but not the other way around.
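A quick way to see that property, as a toy sketch reusing the same hypothetical causal-attention helper (my code, not from the cited paper): appending a later token leaves the outputs for all earlier tokens unchanged.

```python
import torch

torch.manual_seed(0)

def causal_attention(x):
    # single-head self-attention with a causal mask over toy embeddings x: (n, d)
    n, d = x.shape
    scores = x @ x.T / d ** 0.5
    future = torch.triu(torch.ones(n, n, dtype=torch.bool), diagonal=1)
    return torch.softmax(scores.masked_fill(future, float("-inf")), dim=-1) @ x

x = torch.randn(6, 8)                       # toy embeddings for tokens 1..6
with_future    = causal_attention(x)        # all six tokens present
without_future = causal_attention(x[:5])    # token 6 removed

# Outputs for tokens 1..5 are identical either way: a later token
# never changes what an earlier token "saw".
print(torch.allclose(with_future[:5], without_future))   # True
```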