Something weird is happening with LLMs and chess

(dynomight.substack.com)


swiftcoder ◴[15 Nov 24 07:57 UTC] No.42144784[source]▶

I feel like the article neglects one obvious possibility: that OpenAI decided that chess was a benchmark worth "winning", special-cased chess within gpt-3.5-turbo-instruct, and then neglected to add that special case to follow-up models since it wasn't generating sustained press coverage.

replies(8): >>42145306 #>>42145352 #>>42145619 #>>42145811 #>>42145883 #>>42146777 #>>42148148 #>>42151081 #

scott_w ◴[15 Nov 24 11:10 UTC] No.42145811[source]▶

>>42144784 #

I suspect the same thing. Rather than LLMs “learning to play chess,” they “learnt” to recognise a chess game and hand over instructions to a chess engine. If that’s the case, I don’t feel impressed at all.

replies(5): >>42146086 #>>42146152 #>>42146383 #>>42146415 #>>42156785 #

1. fires10 ◴[15 Nov 24 11:52 UTC] No.42146086[source]▶

>>42145811 #

Recognize and hand over to a specialist engine? That might be useful for AI. Maybe I am missing something.

replies(5): >>42146145 #>>42146293 #>>42146329 #>>42147558 #>>42151536 #

2. worewood ◴[15 Nov 24 12:04 UTC] No.42146145[source]▶

>>42146086 (TP) #

It's because this has been standard practice since the early days - there's nothing newsworthy in it at all.

3. generic92034 ◴[15 Nov 24 12:30 UTC] No.42146293[source]▶

>>42146086 (TP) #

How do you think AIs are (correctly) solving simple mathematical questions which they have not been trained on directly? They hand them over to a specialist maths engine.
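To make the hand-off idea concrete, here is a minimal sketch of "recognise the input, then delegate". Everything in it is an assumption for illustration: `safe_eval` and `route` are made-up names, the "specialist engine" is just a tiny arithmetic evaluator built on Python's `ast` module, and nothing here reflects how OpenAI or any other vendor actually wires things up.

```python
import ast
import operator

# Hypothetical "specialist engine": evaluates plain arithmetic only.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.USub: operator.neg,
}

def safe_eval(expr: str) -> float:
    """Evaluate +, -, *, / over numeric literals; reject anything else."""
    def _eval(node):
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        raise ValueError("not plain arithmetic")
    return _eval(ast.parse(expr, mode="eval").body)

def route(prompt: str) -> str:
    """Hypothetical router: try the specialist first, else fall back."""
    try:
        return f"[calculator] {safe_eval(prompt)}"
    except (ValueError, SyntaxError, ZeroDivisionError):
        # Stand-in for the general model; no real model is called here.
        return "[language model] (free-text answer would go here)"

print(route("2 + 3 * 4"))
print(route("What is the best reply to 1. e4?"))
```

The point of the sketch is only that the router, not the model, decides who answers. Whether a chess engine sits behind any particular model in the same way is exactly what this thread is speculating about.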

replies(1): >>42149781 #

4. nerdponx ◴[15 Nov 24 12:35 UTC] No.42146329[source]▶

>>42146086 (TP) #

It is and would be useful, but it would also be quite a big lie to the public, more importantly to paying customers, and even more importantly to investors.

replies(1): >>42148826 #

5. scott_w ◴[15 Nov 24 15:06 UTC] No.42147558[source]▶

>>42146086 (TP) #

If I was sold a general AI problem solving system, I’d feel ripped off if I learned that I needed to build my own problem solver and hook it up after I’d paid my money…

6. anon84873628 ◴[15 Nov 24 17:17 UTC] No.42148826[source]▶

>>42146329 #

The problem is simply that the company has not been open about how it works, so we're all just speculating here.

7. internetter ◴[15 Nov 24 19:00 UTC] No.42149781[source]▶

>>42146293 #

This is a relatively recent development (<3 months), at least for OpenAI, where the model will generate code to solve the math and use the output.

replies(1): >>42151065 #

8. cruffle_duffle ◴[15 Nov 24 21:07 UTC] No.42151065{3}[source]▶

>>42149781 #

They’ve been doing that a lot longer than three months. ChatGPT has been handing stuff off to Python for a very long time. At least for my paid account, anyway.

9. skydhash ◴[15 Nov 24 21:47 UTC] No.42151536[source]▶

>>42146086 (TP) #

Wasn't that the basis of computing and technology in general? Here is one tedious thing, so let's have a specific tool that handles it instead of wasting time and effort. The fact is that using the tool properly takes training, and most current AI marketing hypes the idea that you don't need it. Instead, hand over the problem to a GPT and it will "magically" solve it.
