At their core, state-of-the-art LLMs can do basically any small-to-medium mental task better than I can, or get so close to my level that I’ve found myself no longer thinking through things the long way. For example, if I want to run some napkin math on something, as I recently did for some solar battery charge-time estimates, an LLM can get to a plausible answer in seconds that would have taken me an hour.
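To give a sense of what that napkin math looks like, here's a minimal sketch; the battery capacity, panel wattage, state of charge, and derating factor are all made-up assumptions, not numbers from my actual setup:

    # Rough solar charge-time estimate -- every number here is an assumed placeholder.
    battery_wh = 1024        # assumed battery capacity in watt-hours
    panel_watts = 200        # assumed panel rating in watts
    derate = 0.75            # assumed real-world losses (clouds, angle, charging efficiency)
    soc = 0.20               # assumed current state of charge (20%)

    wh_needed = battery_wh * (1 - soc)
    hours = wh_needed / (panel_watts * derate)
    print(f"~{hours:.1f} hours of decent sun to top up")  # ~5.5 hours with these numbers

With different assumed numbers the structure is the same; the point is that an LLM gets you to something like this, assumptions spelled out, in seconds.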
So yeah, in many practical ways, LLMs are smarter than most people in most situations. They have not yet far surpassed all humans in all situations, and there are still some classes of reasoning problems they seem to struggle with, but to a first-order approximation, we do seem to be mostly there.
I think this is it. LLM responses feel like the unconsidered ideas that pop into my head from nowhere. If someone asks me how many states are in the United States, a number pops out from somewhere. But I don't just wire that straight to my mouth; I also think about whether that's current info, whether I've gotten this wrong in the past, how confident I am in it, what the cost of providing bad information would be, and so on.
If you effectively added all of those layers to an LLM (something I think o1-preview and similar approaches are starting to do), it's going to be interesting to see what the net capability is.
The other thing that makes me feel like we're 'getting there' is using some of the fast models at groq.com. The output is generated, in many cases, an order of magnitude faster than I can consume it. The idea that models might start to engage through a much more sophisticated embedding than English to pass concepts and sequences back and forth natively is intriguing.
You have to look at the LLM as the inner voice in your head. We've kind of forced them into saying whatever they think, due to how we sample the output (next-token prediction), but in newer architectures with pause tokens we let them 'think', and they show better judgement and ability. These systems are going to improve rapidly, and it will be very interesting to watch.
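To make the 'let them think' idea concrete, here's a toy decoding loop; this is not any real model's architecture or API. The next_token function is a random stand-in, and the <pause> token is a hypothetical special token standing in for the learned pause/reasoning tokens in that line of work. The sampler gives the model extra internal steps that never show up in the visible output:

    import random

    PAUSE = "<pause>"  # hypothetical special token; real systems learn dedicated pause/reasoning tokens

    def next_token(context):
        # Stand-in for a real model: sometimes it "thinks" (emits a pause), otherwise it speaks.
        return random.choice([PAUSE, "the", "answer", "is", "42", "<eos>"])

    def generate(prompt, max_steps=50):
        context, visible = prompt.split(), []
        for _ in range(max_steps):
            tok = next_token(context)
            context.append(tok)      # every token, pause or not, stays in the inner context
            if tok == "<eos>":
                break
            if tok != PAUSE:         # pause tokens are extra 'ticks' of thought, never surfaced
                visible.append(tok)
        return " ".join(visible)

    print(generate("how many states are in the US"))

The filtering is the whole trick: the visible stream looks the same, but the model gets more forward passes per word it actually says.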
But this is another reason why I think they've surpassed human intelligence. You have to look at each token as a 'time step' in the inner thought process of some entity. A real 'alive' entity has more 'ticks' than its actions would suggest. For example, human brains can process at roughly 10 FPS (about a 100 ms response time), but most humans aren't saying 10 words a second. Yet we've made LLMs whose internal processes (i.e., their intuition) are already superior. If we just gave them that final agentic ability to say nothing and ponder (which researchers are doing), their capabilities will increase exponentially.
> The other thing that makes me feel like we're 'getting there' is using some of the fast models at groq.com.
Unlike perhaps many of the commentators here, I've been in this field for a bit under a decade now, and was one of the early compiler engineers at Groq. Glad you're finding it useful. It's amazing stuff.