
625 points lukebennett | 1 comments | source
iandanforth ◴[] No.42139410[source]
A few important things to remember here:

The best engineering minds have been focused on scaling transformer pre- and post-training for the last three years because they had good reason to believe it would work, and it did, up until now.

Progress has been measured against benchmarks which are / were largely solvable with scale.

There is another emerging paradigm which is still small(er) scale but showing remarkable results. That's full multi-modal training with embodied agents (aka robots). 1x, Figure, Physical Intelligence, Tesla are all making rapid progress on functionality which is definitely beyond frontier LLMs because it is distinctly different.

OpenAI/Google/Anthropic are not ignorant of this trend and are also reviving or investing in robots or robot-like research.

So while Orion and Claude 3.5 Opus may not be another shocking giant leap forward, that does not mean that there aren't shocking giant leaps coming from slightly different directions.

replies(9): >>42139779 #>>42139984 #>>42140069 #>>42140194 #>>42140421 #>>42141563 #>>42142249 #>>42142983 #>>42143148 #
sincerecook ◴[] No.42140421[source]
> That's full multi-modal training with embodied agents (aka robots). 1x, Figure, Physical Intelligence, Tesla are all making rapid progress on functionality which is definitely beyond frontier LLMs because it is distinctly different.

Cool, but we already have robots doing this in 2d space (aka self driving cars) that struggle not to kill people. How is adding a third dimension going to help? People are just refusing to accept the fact that machine learning is not intelligence.

replies(4): >>42141572 #>>42141776 #>>42142802 #>>42143184 #
akomtu ◴[] No.42141776[source]
My understanding is that machine learning today is a lot like interpolation of examples in the dataset. The breakthrough of LLMs is due to the idea that interpolation in a 1024-dimensional space works much better than it would in a 2d space, if we naively interpolated English letters. All the modern transformer stuff is basically an advanced interpolation method that uses a larger local neighborhood than just the few nearest examples. It's like the Lanczos interpolation kernel, to use a 1d analogy. Increasing the size of the kernel won't bring any gains, because the current kernel already nearly perfectly approximates an ideal interpolation (a full-dataset DFT).
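To make the kernel analogy concrete, here is a minimal pure-Python sketch of 1d Lanczos interpolation. The function names and the window size `a` are my own illustrative choices, not something from the comment; the point is just that the estimate at any position is a weighted sum over a local neighborhood of samples, not the whole dataset.

```python
import math

def lanczos_kernel(x, a=3):
    """Lanczos window: sinc(x) * sinc(x/a) for |x| < a, else 0."""
    if x == 0.0:
        return 1.0
    if abs(x) >= a:
        return 0.0
    px = math.pi * x
    return a * math.sin(px) * math.sin(px / a) / (px * px)

def lanczos_interpolate(samples, t, a=3):
    """Estimate the value at fractional index t.

    Only the ~2*a samples around t contribute -- a local
    neighborhood, not the full dataset.
    """
    lo = math.floor(t) - a + 1
    hi = math.floor(t) + a
    total = 0.0
    for i in range(lo, hi + 1):
        if 0 <= i < len(samples):
            total += samples[i] * lanczos_kernel(t - i, a)
    return total
```

At integer positions the kernel is 1 at the sample itself and 0 at every other integer, so the interpolant reproduces the data exactly; in between, it blends the neighbors.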

However, interpolation isn't reasoning. If we want to understand the motion of planets, we would start with a dataset of (x, y, z, t) coordinates and try to derive the law of motion. Imagine if someone simply interpolated the dataset and presented the law of gravity as an array of a million coefficients (aka weights). Our minds have to work with a very small operating memory that can hardly fit 10 coefficients. This constraint forces us to develop intelligence that compacts the entire dataset into one small differential equation. Btw, English grammar is in many ways the differential equation of English: it specifies the local rules for the valid trajectories of words that we call sentences.
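A toy sketch of the distinction the comment draws, under my own illustrative setup (free fall rather than planetary orbits, and hypothetical names): instead of memorizing the whole trajectory as a table of samples, a second finite difference compresses it into a single recovered constant, the acceleration.

```python
def second_difference(z, h):
    """Finite-difference estimate of the second derivative
    (acceleration) from three evenly spaced samples."""
    return (z[0] - 2 * z[1] + z[2]) / (h * h)

# Synthetic free-fall data: z(t) = 100 - 0.5 * g * t^2
g = 9.8   # assumed true constant used to generate the data
h = 0.1   # sampling interval in seconds
samples = [100 - 0.5 * g * (k * h) ** 2 for k in range(3)]

# The entire trajectory compacts into one number, the "law":
g_est = -second_difference(samples, h)
```

Any interpolant of the samples would also reproduce them, but only the one recovered constant generalizes to positions and times outside the dataset.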