
688 points crescit_eundo | 2 comments
swiftcoder ◴[] No.42144784[source]
I feel like the article neglects one obvious possibility: that OpenAI decided that chess was a benchmark worth "winning", special-cased chess within gpt-3.5-turbo-instruct, and then neglected to add that special case to follow-up models since it wasn't generating sustained press coverage.
replies(8): >>42145306 #>>42145352 #>>42145619 #>>42145811 #>>42145883 #>>42146777 #>>42148148 #>>42151081 #
scott_w ◴[] No.42145811[source]
I suspect the same thing. Rather than LLMs “learning to play chess,” they “learnt” to recognise a chess game and hand over instructions to a chess engine. If that’s the case, I don’t feel impressed at all.
replies(5): >>42146086 #>>42146152 #>>42146383 #>>42146415 #>>42156785 #
Kiro ◴[] No.42146152[source]
That's something completely different than what the OP suggests and would be a scandal if true (i.e. gpt-3.5-turbo-instruct actually using something else behind the scenes).
replies(3): >>42146324 #>>42147204 #>>42151029 #
empath75 ◴[] No.42147204[source]
The point of creating a service like this is for it to be useful, and if recognizing and handing off tasks to specialized agents isn't useful, I don't know what is.
replies(1): >>42147547 #
scott_w ◴[] No.42147547[source]
If I was sold a product that can generically solve problems I’d feel a bit ripped off if I’m told after purchase that I need to build my own problem solver and way to recognise it…
replies(1): >>42151049 #
1. cruffle_duffle ◴[] No.42151049[source]
But it already hands off plenty of stuff to things like Python. How would this be any different?
replies(1): >>42154898 #
2. scott_w ◴[] No.42154898[source]
If you mean “uses bin/python to run Python code it wrote” then that’s a bit different to “recognises chess moves and feeds them to Stockfish.”

If a human said they could code, you wouldn’t expect them to somehow turn into a Python interpreter and execute it in their brain. If a human said they could play chess, I’d raise an eyebrow if they just played the moves Stockfish gave them against me.