688 points crescit_eundo | 15 comments
swiftcoder ◴[] No.42144784[source]
I feel like the article neglects one obvious possibility: that OpenAI decided chess was a benchmark worth "winning", special-cased chess within gpt-3.5-turbo-instruct, and then neglected to add that special case to follow-up models since it wasn't generating sustained press coverage.
replies(8): >>42145306 #>>42145352 #>>42145619 #>>42145811 #>>42145883 #>>42146777 #>>42148148 #>>42151081 #
scott_w ◴[] No.42145811[source]
I suspect the same thing. Rather than LLMs “learning to play chess,” they “learnt” to recognise a chess game and hand over instructions to a chess engine. If that’s the case, I don’t feel impressed at all.
replies(5): >>42146086 #>>42146152 #>>42146383 #>>42146415 #>>42156785 #
1. Kiro ◴[] No.42146152[source]
That's something completely different from what the OP suggests, and it would be a scandal if true (i.e. gpt-3.5-turbo-instruct actually using something else behind the scenes).
replies(3): >>42146324 #>>42147204 #>>42151029 #
2. nerdponx ◴[] No.42146324[source]
Ironically it's probably a lot closer to what a super-human AGI would look like in practice, compared to just an LLM alone.
replies(2): >>42146675 #>>42149673 #
3. sanderjd ◴[] No.42146675[source]
Right. To me, this is the "agency" thing that still feels somewhat missing in contemporary AI, despite all the focus on "agents".

If I tell an "agent", whether human or artificial, to win at chess, it is a good decision for that agent to decide to delegate that task to a system that is good at chess. This would be obvious to a human agent, so presumably it should be obvious to an AI as well.

This isn't useful for AI researchers, I suppose, but it makes for a more useful tool.

(This may all be a good thing, as giving AIs true agency seems scary.)

replies(1): >>42147515 #
4. empath75 ◴[] No.42147204[source]
The point of creating a service like this is for it to be useful, and if recognizing and handing off tasks to specialized agents isn't useful, I don't know what is.
replies(1): >>42147547 #
5. scott_w ◴[] No.42147515{3}[source]
If this were part of the offering: "we can recognise requests and delegate them to appropriate systems," I'd understand and be somewhat impressed, but the marketing hype leaves this out.

Most likely because they want people to think the system is better than it is for hype purposes.

I should temper that, though: I'd only be impressed if it's doing this dynamically. Hardcoding recognition of chess moves isn't exactly a difficult trick to pull, given there are only about three standard formats…
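
A minimal, purely hypothetical Python sketch of how cheap that trick would be, using a rough regex for one of those formats, standard algebraic notation (SAN). The pattern, the looks_like_chess name, and the two-move threshold are all invented for illustration, not anything OpenAI has confirmed:

    import re

    # Rough matcher for standard algebraic notation (SAN): castling,
    # piece moves, pawn moves, captures, promotions, check/mate marks.
    SAN_MOVE = re.compile(
        r"(O-O(-O)?|[KQRBN]?[a-h]?[1-8]?x?[a-h][1-8](=[QRBN])?)[+#]?"
    )

    def looks_like_chess(text: str) -> bool:
        tokens = text.replace(".", " ").split()
        moves = [t for t in tokens if SAN_MOVE.fullmatch(t)]
        return len(moves) >= 2  # a couple of plausible moves in a row

    print(looks_like_chess("1. e4 e5 2. Nf3 Nc6"))  # True
    print(looks_like_chess("let's grab lunch"))     # False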

replies(2): >>42148468 #>>42149134 #
6. scott_w ◴[] No.42147547[source]
If I were sold a product that could generically solve problems, I'd feel a bit ripped off if I were told after purchase that I needed to build my own problem solver and a way to recognise when to use it…
replies(1): >>42151049 #
7. Kiro ◴[] No.42148468{4}[source]
You're speaking like it's confirmed. Do you have any proof?

Again, the comment you initially responded to was not talking about faking it by using a chess engine. You were the one introducing that theory.

replies(1): >>42150704 #
8. sanderjd ◴[] No.42149134{4}[source]
Fair!
9. dartos ◴[] No.42149673[source]
So… we’re at expert systems again?

That’s how the AI winter started last time.

replies(1): >>42157158 #
10. scott_w ◴[] No.42150704{5}[source]
No, I don’t have proof and I never suggested I did. Yes, it’s 100% hypothetical but I assumed everyone engaging with me understood that.
11. cruffle_duffle ◴[] No.42151029[source]
If they came out and said it, I don't see the problem. LLMs aren't the solution for a wide range of problems. They're a new tool, but not everything is a nail.

I mean, it already hands off a wide range of tasks to Python… this would be no different.

12. cruffle_duffle ◴[] No.42151049{3}[source]
But it already hands off plenty of stuff to things like Python. How would this be any different?
replies(1): >>42154898 #
13. scott_w ◴[] No.42154898{4}[source]
If you mean “uses bin/python to run Python code it wrote” then that’s a bit different to “recognises chess moves and feeds them to Stockfish.”

If a human said they could code, you wouldn't expect them to somehow turn into a Python interpreter and execute the code in their brain. If a human said they could play chess, I'd raise an eyebrow if they just played the moves Stockfish gave them against me.

14. kadoban ◴[] No.42157158{3}[source]
What is an "expert system" to you? In AI, they're just a series of if-then statements encoding certain rules. What non-trivial part of an LLM reaching out to a chess AI does that describe?
replies(1): >>42160230 #
15. dartos ◴[] No.42160230{4}[source]
The initial LLM acts as an intent-detection switch.

To personify LLM way too much:

It sees that a prompt of some kind wants to play chess.

Knowing this, it looks at its bag of "tools" and sees a chess tool. It then generates a response that eventually causes a call to a chess AI (or just a chess program, potentially), which does further processing.

The first LLM acts as a ton of if-then statements, but ones automatically generated (or brute-force discovered) through training.

You still need discrete parts for this system: a communication protocol, an intent-detection step, a chess-execution step, etc…

I don't see how that differs from a classic expert system, other than that the if statements are handled by a statistical model.
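
To make the comparison concrete, here is a minimal, entirely hypothetical Python sketch of that shape: a stub classifier standing in for the LLM's learned intent detection, routing prompts through a table of tools. Every name (classify_intent, chess_tool, TOOLS) is invented for illustration, and the chess tool is a stub where a real system might call out to an engine like Stockfish:

    # Hypothetical sketch of the dispatch shape described above.
    # Structurally it is a rule table, i.e. an expert system; the only
    # twist is that the "if" would be a learned model, not this stub.

    def classify_intent(prompt: str) -> str:
        # Stand-in for the statistical model: a crude keyword check.
        chess_markers = ("e4", "Nf3", "O-O", "checkmate")
        return "chess" if any(m in prompt for m in chess_markers) else "general"

    def chess_tool(prompt: str) -> str:
        return "best move: ..."  # imagine a call out to a chess engine here

    def general_tool(prompt: str) -> str:
        return "completion: ..."  # imagine the plain LLM answering here

    TOOLS = {"chess": chess_tool, "general": general_tool}

    def handle(prompt: str) -> str:
        return TOOLS[classify_intent(prompt)](prompt)

    print(handle("1. e4 e5 2. Nf3 Nc6"))  # routed to the chess tool
    print(handle("write me a haiku"))     # routed to the general tool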