
204 points warrenm | 1 comment | source
AnotherGoodName ◴[] No.45106653[source]
I’ve been working on board game AI lately.

Fwiw nothing beats ‘implement the game logic in full (a huge amount of work) and, with pruning on some heuristics, look 50 moves ahead’. This is how chess engines work, and how all good turn-based game AI works.
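
For concreteness, a minimal sketch of that kind of search. The GameState interface here (legal_moves / apply / is_terminal / evaluate) is a hypothetical stand-in for whatever world model you hand-build:

    # Depth-limited negamax with alpha-beta pruning (Python sketch).
    # GameState is hypothetical: your hand-built world model supplies
    # legal_moves(), apply(move) -> new state, is_terminal(), and a
    # heuristic evaluate() scored for the player to move.
    def negamax(state, depth, alpha=float("-inf"), beta=float("inf")):
        if depth == 0 or state.is_terminal():
            return state.evaluate(), None      # heuristic value, no move
        best_move = None
        for move in state.legal_moves():       # world model enforces validity
            score = -negamax(state.apply(move), depth - 1, -beta, -alpha)[0]
            if score > alpha:
                alpha, best_move = score, move
            if alpha >= beta:
                break                          # prune: opponent avoids this line
        return alpha, best_move

Called as, e.g., negamax(current_state, depth=10); the pruning is what makes deep look-ahead tractable.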

I’ve tried throwing masses of game-state data at the latest models in PyTorch. Unusable. They make really dumb moves. In fact, one big issue is that they often suggest invalid moves, and the best way to avoid that is to implement the board game logic in full and validate against it. At which point, why don’t I just do the look-ahead above, since I’ve already done the hard part of manually building the world model anyway?

One area where current AI does help is with the heuristics themselves for evaluating the best moves when scanning ahead. You can feed in various game states, labelled with whether the player ultimately won, to train the weights of those heuristics. You still need to implement the world model and the look-ahead to use them, though! When you hear of neural networks being used for Go or chess, this is where they are used. You still need to build the world model and brute-force the scan ahead.
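
A rough sketch of that training step in PyTorch; the feature encoding and data shapes are assumptions (random stand-ins below), and the trained net becomes the evaluate() heuristic at the leaves of the search:

    import torch
    import torch.nn as nn

    D = 64                                        # feature-vector size (assumption)
    features = torch.randn(1000, D)               # stand-in for logged game states
    won = torch.randint(0, 2, (1000, 1)).float()  # 1 if that player ultimately won

    value_net = nn.Sequential(nn.Linear(D, 128), nn.ReLU(), nn.Linear(128, 1))
    opt = torch.optim.Adam(value_net.parameters(), lr=1e-3)
    loss_fn = nn.BCEWithLogitsLoss()              # regress P(win) from the state

    for _ in range(100):
        opt.zero_grad()
        loss_fn(value_net(features), won).backward()
        opt.step()

    def evaluate(state_features):
        # Leaf heuristic for the look-ahead: higher = better for that player.
        with torch.no_grad():
            return torch.sigmoid(value_net(state_features)).item()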

One path I do want to try more: in theory, coding assistants should be able to read rulebooks and dynamically generate code to represent those rules. If you can do that part, the rest should be easy. I.e., it could be possible to throw rulebooks at an AI and have it play the game. It would generate a world model from the rulebook via coding assistants and scan ahead more moves than humanly possible using that model, evaluating positions with heuristics trained through trial and error.

Of course coding assistants aren’t at a point where you can throw rulebooks at them to generate an internal representation of game states. I should know. I just spent weeks building the game model even with a coding assistant.

replies(12): >>45106842 #>>45106945 #>>45106986 #>>45107761 #>>45107771 #>>45108876 #>>45109332 #>>45109904 #>>45110225 #>>45112651 #>>45113553 #>>45114494 #
daxfohl ◴[] No.45107771[source]
Yeah, I can't even get them to retain a simple state. I've tried having them run a maze, but instead of giving them the whole maze up front, I have them move one step at a time, tell them which directions are open from that square and ask for the next move, etc.

After a few moves they get hopelessly lost and just start wandering back and forth in a loop. Even when I prompt them explicitly to serialize a state representation of the maze after each step, and even if I prune the old context so they don't get tripped up on old state representations, they still get flustered and corrupt the state or lose track of things eventually.

They get the concept: if I explain the challenge and ask them to write a program to solve such a maze step by step like that, they can do it successfully first try! But maintaining the state internally, they still seem to struggle.
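
For reference, the harness for this kind of experiment can be tiny. This is a sketch of my reading of the setup, not the actual code; ask_model is a hypothetical stand-in for the LLM call, stubbed with a random policy so it runs on its own:

    import random

    MAZE = ["#########",
            "#S....#E#",
            "#.###.#.#",
            "#.....#.#",
            "#.#####.#",
            "#.......#",
            "#########"]
    DIRS = {"N": (-1, 0), "S": (1, 0), "E": (0, 1), "W": (0, -1)}

    def open_dirs(r, c):
        return [d for d, (dr, dc) in DIRS.items() if MAZE[r + dr][c + dc] != "#"]

    def ask_model(history, options):
        # Hypothetical stand-in: a real harness would prompt the LLM with
        # `history` (and any serialized map it keeps) plus the open exits.
        return random.choice(options)

    r, c, history = 1, 1, []                  # start on the 'S' square
    for step in range(200):
        if MAZE[r][c] == "E":
            print(f"solved in {step} steps")
            break
        options = open_dirs(r, c)
        move = ask_model(history, options)
        history.append((options, move))
        dr, dc = DIRS[move]
        r, c = r + dr, c + dc
    else:
        print("gave up: wandering in a loop")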

replies(4): >>45108025 #>>45108185 #>>45111700 #>>45112643 #
warrenm ◴[] No.45108025[source]
>I've tried having them run a maze, but instead of giving them the whole maze up front, I have them move one step at a time, tell them which directions are open from that square and ask for the next move, etc.

Presuming these are 'typical' mazes (like you find in a garden or local corn field in late fall), why not have the bot run the known-correct solving algorithm (or its mirror)?

replies(1): >>45108081 #
daxfohl ◴[] No.45108081[source]
Like I said, they can implement the algorithm to solve it, but when forced to maintain the state themselves, either internally or explicitly in the context, they are unable to do so and get lost.

Similarly, if you ask them to write a Sudoku solver, they have no problem. And if you ask an online model to solve a Sudoku, it’ll write a solver in the background and use that to solve it. But (at least the last time I tried, a year ago) if you ask them to solve one step by step using pure reasoning, without writing a program, they start spewing out all kinds of nonsense (though they humorously cheat: they’ll still spit out the correct answer at the end).
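
For reference, the solver the models produce is usually some variant of plain backtracking along these lines (grid is a 9x9 list of lists, 0 for empty cells):

    def valid(grid, r, c, v):
        if v in grid[r] or any(grid[i][c] == v for i in range(9)):
            return False
        br, bc = 3 * (r // 3), 3 * (c // 3)   # top-left of the 3x3 box
        return all(grid[br + i][bc + j] != v for i in range(3) for j in range(3))

    def solve(grid):
        for r in range(9):
            for c in range(9):
                if grid[r][c] == 0:
                    for v in range(1, 10):
                        if valid(grid, r, c, v):
                            grid[r][c] = v
                            if solve(grid):
                                return True
                            grid[r][c] = 0    # undo and backtrack
                    return False              # no digit fits this cell
        return True                           # no empty cells left: solved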

replies(4): >>45108698 #>>45111426 #>>45119593 #>>45127388 #
warrenm ◴[] No.45119593[source]
you do not need to remember state with the simplest solver:

- place your right hand on the right wall
- walk forward, never letting your hand leave the wall
- arrive at the exit

yes, you travel down many dead ends along the way

but you are guaranteed to get to the end of a 'traditional' maze
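
A minimal sketch of that rule on a toy grid maze (the maze and encoding are my own illustration); note that the only state is position and heading, and it assumes a ‘traditional’ maze whose exit sits on a wall connected to the outer boundary:

    MAZE = ["#########",
            "#S....#E#",
            "#.###.#.#",
            "#.....#.#",
            "#.#####.#",
            "#.......#",
            "#########"]
    DIRS = [(-1, 0), (0, 1), (1, 0), (0, -1)]     # N, E, S, W

    def wall_follow(maze):
        r, c = next((i, row.index("S")) for i, row in enumerate(maze) if "S" in row)
        heading, path = 1, [(r, c)]               # start facing east
        while maze[r][c] != "E":
            for turn in (1, 0, 3, 2):             # try right, straight, left, back
                nh = (heading + turn) % 4
                dr, dc = DIRS[nh]
                if maze[r + dr][c + dc] != "#":
                    heading, r, c = nh, r + dr, c + dc
                    path.append((r, c))
                    break
        return path

    print(len(wall_follow(MAZE)) - 1, "steps")    # dead ends included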

replies(1): >>45121923 #
daxfohl ◴[] No.45121923{3}[source]
Yeah, I did the type where you start somewhere inside the maze and have to find the "treasure". Mainly because it was slightly easier to implement, but it also had the nice side effect of not being solvable by that rule alone: starting inside, your hand can land on a free-standing interior wall, and you circle it forever without ever touching the treasure.

FWIW, the LLMs were definitely not following that rule. They seemed to always keep going straight whenever that was an option, which meant they would always get stuck at T intersections where both branches led to dead ends.

replies(1): >>45128097 #
warrenm ◴[] No.45128097{4}[source]
Starting in the middle, vs one end or the other, is definitely a different problem :)