
688 points crescit_eundo | 2 comments
swiftcoder ◴[] No.42144784[source]
I feel like the article neglects one obvious possibility: that OpenAI decided that chess was a benchmark worth "winning", special-cased chess within gpt-3.5-turbo-instruct, and then neglected to add that special case to follow-up models since it wasn't generating sustained press coverage.
replies(8): >>42145306 #>>42145352 #>>42145619 #>>42145811 #>>42145883 #>>42146777 #>>42148148 #>>42151081 #
dmurray ◴[] No.42145352[source]
This seems quite likely to me, but did they special-case it by reinforcement-training it into the LLM (which would be extremely interesting, both in how they did it and in what its internal representation looks like), or is it just that when you make an API call to OpenAI, the machine on the other end is not just a zillion-parameter LLM but also runs an instance of Stockfish?
replies(1): >>42145408 #
shaky-carrousel ◴[] No.42145408[source]
That's easy to test: invent a new chess variant and see how the model does.
replies(3): >>42145466 #>>42145557 #>>42146160 #
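For illustration, here is a minimal sketch of how the variant test proposed above could be scored. Everything in it is an assumption made up for the example: the "camel" variant (knights leap (3, 1) instead of (2, 1)), the function names, and the idea of measuring only the fraction of rule-abiding moves on an empty board. It does not query any model; `proposed` stands in for moves parsed from a model's output.

```python
from __future__ import annotations

# Hypothetical variant: knights are replaced by "camels", which leap (3, 1)
# instead of (2, 1). The rule and all names here are illustrative assumptions.
FILES = "abcdefgh"


def camel_targets(square: str) -> set[str]:
    """Squares a camel on `square` can leap to on an otherwise empty 8x8 board."""
    f, r = FILES.index(square[0]), int(square[1]) - 1
    # All (file, rank) offsets with one coordinate +/-3 and the other +/-1.
    offsets = [(df, dr) for df in (-3, -1, 1, 3) for dr in (-3, -1, 1, 3)
               if abs(df) != abs(dr)]
    return {FILES[f + df] + str(r + dr + 1)
            for df, dr in offsets
            if 0 <= f + df < 8 and 0 <= r + dr < 8}


def legal_rate(proposed: list[tuple[str, str]]) -> float:
    """Fraction of (from, to) moves that obey the camel rule."""
    if not proposed:
        return 0.0
    ok = sum(1 for src, dst in proposed if dst in camel_targets(src))
    return ok / len(proposed)
```

A model that only reproduces ordinary knight moves would score poorly here: `legal_rate([("a1", "d2"), ("a1", "c2")])` counts the second move as illegal, because `c2` is a standard knight jump rather than a camel leap.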
andy_ppp ◴[] No.42145557[source]
You're imagining LLMs don't just regurgitate and recombine things they already know from things they have seen before. A new variant would not be in the dataset so would not be understood. In fact this is quite a good way to show LLMs are NOT thinking or understanding anything in the way we understand it.
replies(2): >>42145905 #>>42147218 #
shaky-carrousel ◴[] No.42145905[source]
Yes, that's how you can really tell if the model is doing real thinking and not just recombining things. If it can correctly play a novel game, then it's doing more than that.
replies(3): >>42146014 #>>42146022 #>>42146190 #
timdiggerm ◴[] No.42146190[source]
By that standard (and it is a good standard), none of these "AI" things are doing any thinking.
replies(1): >>42147408 #
Jerrrrrrry ◴[] No.42147408[source]
musical goalposts, gotta love it.

These LLMs just exhibited agency.

Swallow your pride.

replies(1): >>42147976 #
1. samatman ◴[] No.42147976[source]
"Does it generalize past the training data" has been a pre-registered goalpost since before the attention transformer architecture came on the scene.
replies(1): >>42148394 #
2. Jerrrrrrry ◴[] No.42148394[source]

  >'thinking' vs 'just recombining things'
If there is a difference, and LLMs can do one but not the other...

  >By that standard (and it is a good standard), none of these "AI" things are doing any thinking

  >"Does it generalize past the training data" has been a pre-registered goalpost since before the attention transformer architecture came on the scene.

Then what the fuck are they doing?

Learning is thinking, reasoning, what have you.

Move goalposts, re-define words, it won't matter.