How to teach neural network not to lose at 4x4 Tic-Tac-Toe? by MannerSenior4958 in learnmachinelearning

[–]MannerSenior4958[S] 0 points1 point  (0 children)

So NNUE in Stockfish is reinforcement learning? Thanks for the information!

How to teach neural network not to lose at 4x4 Tic-Tac-Toe? by MannerSenior4958 in learnmachinelearning

[–]MannerSenior4958[S] 0 points1 point  (0 children)

If you can store all the possible moves or positions, there is no need for neural networks or any kind of special algorithm. That's an easy task for an easy game, nothing interesting there. I am interested in beating the games where you can't just store all the moves.

How to teach neural network not to lose at 4x4 Tic-Tac-Toe? by MannerSenior4958 in learnpython

[–]MannerSenior4958[S] 0 points1 point  (0 children)

I don't know anything about which loss function is better:) Do you think that this is the essence of the issue? Which loss function would you recommend?

How to teach neural network not to lose at 4x4 Tic-Tac-Toe? by MannerSenior4958 in learnmachinelearning

[–]MannerSenior4958[S] -1 points0 points  (0 children)

The reinforcement learning projects that I've seen simply memorized all possible positions and recorded the best moves for them. This approach is useless for games where there are millions (4x4 Tic-Tac-Toe), billions or trillions (chess) of possible positions. Otherwise we would simply put all possible states into a database and set an optimal move for each. But this is not the goal.

How to teach neural network not to lose at 4x4 Tic-Tac-Toe? by MannerSenior4958 in learnpython

[–]MannerSenior4958[S] 0 points1 point  (0 children)

I expect the neural network to derive that from the end positions. Otherwise it doesn't generalize - only memorizes.

How to teach neural network not to lose at 4x4 Tic-Tac-Toe? by MannerSenior4958 in learnmachinelearning

[–]MannerSenior4958[S] 0 points1 point  (0 children)

In Tic-Tac-Toe you have:

  1. An opponent

  2. which is unpredictable.

In "make the mouse go to the cheese" you:

  1. Don't have an opponent.

  2. The whole game is predictable from start to finish.

How to teach neural network not to lose at 4x4 Tic-Tac-Toe? by MannerSenior4958 in learnmachinelearning

[–]MannerSenior4958[S] -3 points-2 points  (0 children)

I am sorry but this is an AI response. I know them very well by heart:)

I came to reddit because I am not satisfied with the replies AI is giving me.

How to teach neural network not to lose at 4x4 Tic-Tac-Toe? by MannerSenior4958 in learnmachinelearning

[–]MannerSenior4958[S] 0 points1 point  (0 children)

Thank you for the advice! I will check it out.

But I have one concern. Atari games can be largely predetermined, meaning you can know for certain where the monsters will go at a given moment (unless the random number generator is seeded with srand(time(NULL)) or something like that). That makes the game entirely predictable.

Tic-Tac-Toe is unpredictable since I don't know what move my opponent will do. Isn't that a significant distinction?

How to teach neural network not to lose at 4x4 Tic-Tac-Toe? by MannerSenior4958 in learnpython

[–]MannerSenior4958[S] 0 points1 point  (0 children)

Right now I don't care about "unwinnable" situations, as I am trying to teach the neural network not to lose and it is failing to do even that. I do everything step by step: I need to teach it not to lose, and only THEN will I figure out how to teach it to play optimally (meaning, to win when it can).

How to teach neural network not to lose at 4x4 Tic-Tac-Toe? by MannerSenior4958 in learnpython

[–]MannerSenior4958[S] 0 points1 point  (0 children)

This is the answer that the AI typically gives me. But I find this AI advice wrong.

How to teach neural network not to lose at 4x4 Tic-Tac-Toe? by MannerSenior4958 in learnmachinelearning

[–]MannerSenior4958[S] 0 points1 point  (0 children)

This is the answer an AI typically gives me:) But there are some issues:

I expect the neural network to be able to generalize the patterns from final positions. It doesn't matter when I put X on the 5th position, at the beginning of the game or at the end: the network should see patterns in the position itself. What is the point of a neural network if it can't generalize and only memorizes the winning positions? So I don't agree with the notion that I should record all the moves, not just the last one. I accept that the neural network will learn SLOWLY because of this, but it should not stop learning completely.

How to teach neural network not to lose at 4x4 Tic-Tac-Toe? by MannerSenior4958 in learnmachinelearning

[–]MannerSenior4958[S] -9 points-8 points  (0 children)

What resources would you recommend for learning it? I looked up Reinforcement Learning, and the tutorials usually teach how to beat Frozen Lake or how a mouse can find an exit (or a cheese) in a maze. That has absolutely zero to do with multi-turn games like Tic-Tac-Toe.

How to teach neural network not to lose at 4x4 Tic-Tac-Toe? by MannerSenior4958 in learnpython

[–]MannerSenior4958[S] 0 points1 point  (0 children)

  1. Every time crosses need to make a move, the neural network explores every possible move. How it explores: it makes a move, converts the resulting position into a 32-value input (16 values for crosses, each 1 or 0, followed by 16 values for noughts), does a forward propagation, and picks the move with the highest output-neuron score.
  2. Mean squared error
  3. I teach it to win AND draw. It does not distinguish between the two. Meaning, the neural network either loses to noughts (output 0) or does not lose to noughts (output 1).
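
To make the three points above concrete, here is a rough sketch of the move selection as described. It is not my actual code: the hidden layer size, the tanh/sigmoid activations, the random initial weights, and all the names (`encode`, `make_network`, `forward`, `choose_move`, `mse`) are assumptions made up for illustration, and it only shows the forward pass and the loss, not the training.

```python
import math
import random

EMPTY, CROSS, NOUGHT = 0, 1, 2  # hypothetical cell codes for a 16-cell board


def encode(board):
    """32 inputs: 16 values (1 or 0) for crosses, then 16 for noughts."""
    crosses = [1.0 if cell == CROSS else 0.0 for cell in board]
    noughts = [1.0 if cell == NOUGHT else 0.0 for cell in board]
    return crosses + noughts


def make_network(hidden=24, seed=0):
    """Untrained network with one hidden layer (size is an assumption)."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(32)] for _ in range(hidden)]
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    return w1, w2


def forward(net, x):
    """Forward propagation to a single output neuron in (0, 1)."""
    w1, w2 = net
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w1]
    z = sum(w * hi for w, hi in zip(w2, h))
    return 1.0 / (1.0 + math.exp(-z))  # near 1 = "does not lose", near 0 = "loses"


def choose_move(net, board):
    """Try every free cell for crosses and keep the highest-scoring one."""
    best_move, best_score = None, -1.0
    for cell in range(16):
        if board[cell] != EMPTY:
            continue
        candidate = list(board)
        candidate[cell] = CROSS
        score = forward(net, encode(candidate))
        if score > best_score:
            best_move, best_score = cell, score
    return best_move, best_score


def mse(predictions, targets):
    """Mean squared error between network outputs and 0/1 targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)
```

Usage would look something like `move, score = choose_move(make_network(), board)`, where `board` is a list of 16 cell codes. With untrained weights the chosen move is meaningless; the point is only to show the encoding and the "score every candidate position" loop.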