If I tell an "agent", whether human or artificial, to win at chess, it is a reasonable decision for that agent to delegate the task to a system that is good at chess. This would be obvious to a human agent, so presumably it should be obvious to an AI as well.
This isn't very interesting for AI researchers, I suppose, but it does make for a more useful tool.
(This may all be a good thing, as giving AIs true agency seems scary.)
Most likely because they want people to think the system is better than it is for hype purposes.
I should temper how impressed I am: it's only impressive if it's doing this dynamically. Hardcoding recognition of chess moves isn't exactly a difficult trick to pull, given there are only about three standard formats…
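(To show how cheap that trick is, here's a rough sketch of what hardcoded move recognition could look like. The regexes are my own loose approximations of SAN and UCI coordinate notation, not anything from the actual system.)

```python
import re

# Rough patterns for two common move formats (illustrative, not exhaustive):
# - SAN (Standard Algebraic Notation), e.g. "Nf3", "exd5", "O-O", "Qxe7+"
# - UCI / long coordinate notation, e.g. "e2e4", "g8f6", "e7e8q"
SAN_RE = re.compile(r"^(O-O(-O)?|[KQRBN]?[a-h]?[1-8]?x?[a-h][1-8](=[QRBN])?)[+#]?$")
UCI_RE = re.compile(r"^[a-h][1-8][a-h][1-8][qrbn]?$")

def looks_like_chess_move(token: str) -> bool:
    """Return True if a token looks like a chess move in SAN or UCI notation."""
    return bool(SAN_RE.match(token) or UCI_RE.match(token))

if __name__ == "__main__":
    for t in ["Nf3", "exd5", "O-O", "e2e4", "e7e8q", "hello"]:
        print(t, looks_like_chess_move(t))
```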
I mean, it already hands off a wide range of tasks to Python… this would be no different.
If a human said they could code, you don’t expect them to somehow turn into a Python interpreter and execute it in their brain. If a human said they could play chess, I’d raise an eyebrow if they just played the moves Stockfish gave them against me.
To personify the LLM way too much:
It sees that a prompt of some kind wants to play chess.
Knowing this, it looks at its bag of “tools” and sees a chess tool. It then generates a response that eventually causes a call to a chess AI (or just a chess program, potentially), which does further processing.
The LLM up front acts as a ton of if-then statements, just automatically generated (or discovered by brute force) through training.
You still need discrete parts for this system: some communication protocol, an intent-detection step, a chess-execution step, etc…
I don’t see how that differs from a classic expert system, other than that the if statements are handled by a statistical model.
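Structurally, the shape being described might look something like this minimal sketch (purely illustrative: the intent check, the function names, and the use of python-chess plus a local Stockfish binary are my assumptions, not how any real tool-calling stack is wired):

```python
import chess
import chess.engine

def detect_intent(prompt: str) -> str:
    """Intent-detection step. In the real system this is the statistical model;
    here it is literally an if statement, which is the point of the comparison."""
    if "chess" in prompt.lower():
        return "chess"
    return "chat"

def chess_tool(fen: str) -> str:
    """Chess-execution step: hand the position off to an external engine.
    Assumes python-chess is installed and a Stockfish binary is on PATH."""
    board = chess.Board(fen)
    engine = chess.engine.SimpleEngine.popen_uci("stockfish")
    try:
        result = engine.play(board, chess.engine.Limit(time=0.1))
    finally:
        engine.quit()
    return board.san(result.move)

def handle(prompt: str, fen: str | None = None) -> str:
    """The 'communication protocol': route the request to the right discrete part."""
    if detect_intent(prompt) == "chess" and fen is not None:
        return f"My move: {chess_tool(fen)}"
    return "(free-form LLM response goes here)"

if __name__ == "__main__":
    start = chess.Board().fen()  # standard starting position
    print(handle("Let's play chess, you go first", fen=start))
```

Swap the `if "chess" in prompt` line for a trained classifier and you have the pipeline in question; the rest is plumbing an expert system would recognize.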