Also, humans can reason; LLMs currently can't do this in any useful way, and every attempt to make them do it runs up against the limits of their context. Not to mention that their ability to make genuinely new things that don't already exist (as opposed to made-up nonsense) is very limited.
You're basically ignoring all the experts saying "LLMs suck at all these things that even beginning domain experts don't suck at" to generate your claim & then ignoring all evidence to the contrary.
And you're ignoring the ways in which LLMs fall on their face when asked to be creative in ways that aren't language-based. Creative problem solving of a kind they haven't been trained on is outside their domain while being squarely within the domain of human intelligence.
> You can claim that that's not intelligence until the cows come home, but any person able to do that would be considered a savant
Computers can do arithmetic really quickly, but that's not intelligence, even though a person computing that quickly would be considered a savant. You've built up an erroneous dichotomy in your head.
1. The vast majority of people never come up with a truly new idea. Those that do are considered exceptional and their names go down in history books.
2. Most 'new ideas' are rehashes of old ones.
3. If you set the temperature up on an LLM, it will absolutely come up with new ideas. Expecting an LLM to make a scientific discovery a la Einstein is ... a bit much, don't you think [1]? When it comes to 'everyday' creativity, such as short poems, songs, recipes, vacation itineraries, etc., ChatGPT is more capable than the vast majority of people. Literally, ask ChatGPT to write you a song about _____, and it will come up with something creative. Ask it for a recipe with ridiculous ingredients and see what it does. It'll make things you've never seen before, generate an image for you, and even come up with a neologism if you ask it to. It's insanely creative.
[1] Although I have walked ChatGPT through various theoretical physics scenarios and it will create new math for you.
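For concreteness, "setting the temperature up" just means raising the sampling temperature parameter so the model picks lower-probability tokens more often. A minimal sketch using the OpenAI Python SDK (the model name and prompt are placeholders, not a recommendation):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Higher temperature flattens the sampling distribution, so the model picks
# lower-probability tokens more often and the output gets more "creative".
response = client.chat.completions.create(
    model="gpt-4o",    # placeholder model name
    temperature=1.3,   # default is 1.0; 0 is near-deterministic
    messages=[{"role": "user",
               "content": "Write a short song about a recipe with ridiculous ingredients."}],
)
print(response.choices[0].message.content)
```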
Sure, for any domain expert, you can easily get an LLM to trip on something. But just the sheer number of things it is above average at puts it easily into the top echelon of humans.
> You're basically ignoring all the experts saying "LLMs suck at all these things that even beginning domain experts don't suck at" to generate your claim & then ignoring all evidence to the contrary.
Domain expertise is not the only form of intelligence. The most interesting things often lie at the intersections of domains. As I said in another comment, there are a variety of ways to judge intelligence, and no one quantifiable metric. It's like asking if Einstein is better than Mozart. I don't know... their fields are so different. However, I think it's pretty safe to say that the modern slate of LLMs falls into the top 10% of human intelligence, simply for their breadth of knowledge and ability to synthesize ideas at the cross-section of any wide number of fields.
But they're not. The people who are extremely competent in many fields will still outperform LLMs in those fields. The LLM can basically only outperform a complete beginner in the area, and makes up for that weakness by scaling up the amount it can output, which a human can't match. That doesn't take away from the fact that the output is complete garbage when given anything it doesn't know the answer to. As I noted elsewhere, ask it to provide an implementation of the S3 ListObjects operation (the actual backend, not a client call) and see what BS it tries to output, to the point where you have to spend a good amount of time just convincing it not to output an example of using the S3 ListObjects API instead.
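To be concrete about what "the actual backend" means here: a server that answers the ListObjects request itself and produces the ListBucketResult XML, not code that calls S3. A very rough sketch (Flask chosen arbitrarily, in-memory bucket, no marker/delimiter handling, all names illustrative):

```python
from flask import Flask, request, Response
from xml.sax.saxutils import escape

app = Flask(__name__)

# Hypothetical in-memory "bucket": key -> (size, last_modified, etag)
OBJECTS = {
    "photos/cat.jpg": (14234, "2024-01-01T00:00:00.000Z", '"abc123"'),
    "photos/dog.jpg": (9876, "2024-01-02T00:00:00.000Z", '"def456"'),
}

@app.route("/<bucket>", methods=["GET"])
def list_objects(bucket):
    # ListObjects (V1) query parameters; marker/delimiter handling omitted.
    prefix = request.args.get("prefix", "")
    max_keys = int(request.args.get("max-keys", 1000))

    keys = sorted(k for k in OBJECTS if k.startswith(prefix))
    truncated = len(keys) > max_keys
    keys = keys[:max_keys]

    contents = "".join(
        "<Contents>"
        f"<Key>{escape(k)}</Key>"
        f"<LastModified>{OBJECTS[k][1]}</LastModified>"
        f"<ETag>{escape(OBJECTS[k][2])}</ETag>"
        f"<Size>{OBJECTS[k][0]}</Size>"
        "<StorageClass>STANDARD</StorageClass>"
        "</Contents>"
        for k in keys
    )
    body = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">'
        f"<Name>{escape(bucket)}</Name>"
        f"<Prefix>{escape(prefix)}</Prefix>"
        f"<MaxKeys>{max_keys}</MaxKeys>"
        f"<IsTruncated>{str(truncated).lower()}</IsTruncated>"
        f"{contents}"
        "</ListBucketResult>"
    )
    return Response(body, mimetype="application/xml")
```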
> I think it's pretty safe to say that the modern slate of LLMs falls into the top 10% of human intelligence, simply for their breadth of knowledge and ability to synthesize ideas at the cross-section of any wide number of fields.
Again, that's asserting evidence that hasn't been submitted. Please provide an indication of any truly novel ideas being synthesized by LLMs at the cross-section of fields.
The problem here is that you expect something akin to relativity, the Poincaré conjecture, et al. The vast majority of humans are not able to do this.
If you restrict yourself to the sorts of creativity that average people are good at, the models do extremely well.
I'm not sure how to convince you of this. Ideally, I'd get a few people of above-average intelligence together, give them an hour (?) to work on some problem / creative endeavor (we'd have to restrict their tool use to the equivalent of whatever we allow GPT to have), and then compare the results.
EDIT: Here's what ChatGPT thinks we should do: https://chatgpt.com/share/673b90ca-8dd4-8010-a1a0-61af699a44...
I want to be clear: I'm talking about the intelligence of AI systems available today and today only. There's lots of reason to be enthusiastic about the future, but similarly good reason to be very cautious about understanding what is available today, and what is available today isn't human-like.
Depends on your definition of "truly" new, since any idea could be argued to be a mix of all past ideas. But I see truly new ideas all the time that don't go down in the history books, because most new ideas build incrementally on what came before or are extremely niche. Only a very few turn out to be a massive turning point with broad impact, and even that is usually only evident in retrospect (e.g. blue LEDs were basically trial and error and an approach that was almost given up on; transistors were believed to be impactful, but not the huge revolution for computing they turned out to be; etc.).
My personal feeling when I engage in these conversations is that we humans have a cognitive bias to ascribe a human's remixing of an old idea to intelligence, but an AI model's remixing of an old idea to lookup.
Indeed, basically every revolutionary idea is a mix of past ideas if you look closely enough. AI is a great example. To the layperson, AI is novel! It's new. It can talk to you! It's amazing. But for people who've been in this field for a while, it's an incremental improvement over linear algebra, topology, function spaces, etc.
I don’t need to fine-tune on five hundred pictures of rabbits to recognize one. I need one look, and then I’ll know it for life and can use that knowledge in unimaginable and endless variety.
This is a simplistic example which you can naturally pick apart, but when you do I’ll provide another such example. My point is, learning at human (or even animal) speeds is definitely not solved, and I’d say we are not even attempting that kind of learning yet. There is “in-context learning” and there is “fine-tuning”, and neither is going to result in human-level intelligence, judging from anything I’ve had access to.
I think you are anthropomorphizing a clever text randomization process. There is a bunch of information being garbled and returned in a semi-legible fashion, and you imbue the process behind it with intelligence that I don’t think it has. All these models stumble over simple reasoning unless specifically trained on those types of problems. Planning is one particularly famous example.
Time will tell, but I’m not betting on LLMs. I think other forms of AI are needed. Ones that understand substance, modality, time and space and have working memory, not just the illusion of it.
This is a common fallacy. The average human ingests a few dozen GB of data a day [1] [2].
GPT-4 was reportedly trained on 13 trillion tokens. Say a token is 4 bytes (it's more like 3, but we're being conservative). That's 52 trillion bytes, or 52 terabytes.
Say the average human only consumes the lower estimate of 30 GB a day. That means it would take a human about 1,733 days, or roughly 4.75 years, to consume the number of tokens GPT-4 was trained on. Assuming humans and the LLM start from the same spot [3], the proper question is: is ChatGPT smarter than a 4-to-5-year-old? If we use the higher estimate, then we have to ask if ChatGPT is smarter than a 2-year-old. Does ChatGPT hallucinate more or less than the average toddler?
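Spelling out the back-of-the-envelope math (the token count and GB/day figures are just the rough estimates above, nothing more precise):

```python
# Back-of-the-envelope comparison of GPT-4's training data vs. daily human intake.
tokens = 13e12             # ~13 trillion training tokens (reported estimate)
bytes_per_token = 4        # conservative; ~3 is more typical
training_bytes = tokens * bytes_per_token   # 5.2e13 bytes = 52 TB

for gb_per_day in (30, 70):                 # low and high daily-intake estimates
    days = training_bytes / (gb_per_day * 1e9)
    print(f"{gb_per_day} GB/day -> {days:,.0f} days (~{days / 365:.1f} years)")

# 30 GB/day -> 1,733 days (~4.7 years)
# 70 GB/day ->   743 days (~2.0 years)
```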
The cognitive bias I've seen everywhere is the idea that humans are trained on a small amount of data. Nothing could be further from the truth. Humans require training on an insanely large amount of data. A 40-year-old human has been trained on orders of magnitude more data than I think we even have available as data sets. If you prevent a human from being trained on this amount of data through sensory deprivation they go crazy (and hallucinate very vividly too!).
No argument about energy, but this is a technology problem.
[1] https://www.tech21century.com/the-human-brain-is-loaded-dail...
[2] https://kids.frontiersin.org/articles/10.3389/frym.2017.0002...
[3] this is a bad assumption since LLMs are randomly initialized whereas humans seem to be born with some biases that significantly aid in the acquisition of language and social skills
So if you do use in-context learning and give ChatGPT a few images of your novel class, it will usually classify them correctly. Fine-tuning is so you can save on token cost.
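This is the kind of in-context classification I mean; a minimal sketch with the OpenAI Python SDK, where the model name, image URLs, and the "quokka" class are all placeholders:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

def image_part(url):
    # Helper for a vision content part in the chat completions request.
    return {"type": "image_url", "image_url": {"url": url}}

# A couple of in-context examples of the novel class, then a query image.
messages = [
    {"role": "user", "content": [
        {"type": "text", "text": "These are examples of a 'quokka':"},
        image_part("https://example.com/quokka1.jpg"),
        image_part("https://example.com/quokka2.jpg"),
        {"type": "text", "text": "Is the next image a quokka? Answer yes or no."},
        image_part("https://example.com/mystery.jpg"),
    ]},
]

reply = client.chat.completions.create(model="gpt-4o", messages=messages)
print(reply.choices[0].message.content)
```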
Moreover, you don't typically need that many pictures to fine-tune. The studies show that the models successfully extrapolate once they've been pre-trained. This is similar to how my toddler insists that a kangaroo is a dog. She's not been exposed to enough data to know otherwise. 'Dog' is a much more fluid category for her than it is for adults. If you talk with her for a while about it, she will eventually figure out that a kangaroo is a kangaroo and a dog is a dog. But if you ask her again next week, she'll go back to saying they're dogs. Eventually she'll learn.
> All these models stumble over simple reasoning unless specifically trained for those specific types of problems. Planning is one particularly famous example.
We have extremely expensive programs called schools and universities designed to teach little humans how to plan and execute. If you look at cultures without American/Western biases (and there aren't very many left, so we really have to look to history), we see that the idea of planning the way we do it is not universal.
A student consumes only ~6 hours of relevant material a day on various subjects, mostly in textual form (textbooks), with minimal guidance from a domain expert and some guidance from peers.
Have you read the studies backing your links? The methodology behind that estimate is highly questionable on its own, let alone as a basis for comparison with LLMs. Domain experts in the field are pretty confident that LLMs are trained on more actual information than humans are.
> If you prevent a human from being trained on this amount of data through sensory deprivation they go crazy (and hallucinate very vividly too!).
People who are deaf and blind experience a significant amount of sensory deprivation compared with the typical human, but they do not go crazy or start hallucinating. This suggests that your analysis is flawed. For humans, communication is the important bit: as long as we have some kind of communication mechanism, we can achieve quite a fair bit.
How many LLMs have created companies entirely on their own? Or done anything unprompted, for that matter? You can go on about it, but the fact that they require human interaction means the intelligence comes from the human using them, not the LLM itself. Tools are not intelligent.