Repo
https://github.com/s3603075/ai-a2/
The base project was forked off
https://github.com/s3603075/ai1901-connectfour
Minimax was used to develop an AI for connect 4. The minimax agent is an algorithm used to compute the most optimal move for a player by weighing its score against an opponent’s move, who will try to ‘minimise’ a player’s position and depending on the depth and power of the computing machine, will continue to look for the most optimal position alternating in min, max. In short, it tries to look into the future to play x amount of moves to calculate the most optimal position to take.
In my implementation of the connect four agent, the minimax function was used and adapted in conjunction with my heuristic function, which calculated the scores for a particular position in the future, and alpha beta pruning. The initial function, get_move(), initiates the minimax function by looping through the available valid moves and simulates a piece being placed into a position for analysis. The minimax function continues this until it reaches a certain depth, which then the score for the functions are evaluated by choosing the highest score (for a maximising player).Using the minimax function was somewhat adequate, but slowed increasingly even at a depth of one or two. Thus I used alpha-beta pruning to speed up the evaluation process of the minimax algorithm.
Alpha-beta pruning is used as an optimisation to the minimax algorithm, where some branches in a tree are discovered and each value is placed into either an alpha or beta value. These values are then compared to each other (beta <= alpha) and if true, will break from discovering more branches, due to already finding the minimum or maximum value. Implementing this change help speed up my agent drastically as evidently there were a lower amount of branches that needed to be discovered, which translates to fewer valid moves needing to be simulated.
The heuristic used to assist the minimax function is very simple: analyse for each row, column and diagonal positions the amount of spaces taken by either a player or opponent, in a group of four (or any other number to win). For example, a row contains seven slots ([1,2,3,4,5,6,7]) and the amount of spaces needed to win is four. The array with the seven slots is analysed for each four piece position. ([1,2,3,4], [2,3,4,5], [3,4,5,6]). This is done for each row on the board. This heuristic was adequate, except the agent in the first two moves would generally place the piece on the side position, which in connect four is a bad position as there are very few options for a connect four, only having a upwards diagonal, horizontal and vertical positions for a win. Thus a different weight would need to placed on more central positions.
Another aspect added to the heuristic was weighting towards more central positions. Initially the heuristic did work well against the other random and Monte-Carlo agents, however in connect four generally a winning position will comprise of a player gaining as many central positions as possible. It creates a higher chance of winning as there are multiple diagonal and horizontal winning positions, and if both players played perfectly, the first player to place a piece in the middle position is guaranteed a win.
In choosing the weights, the best position a player would be in (that is not a winning position) would be three in a row. Therefore the highest weighting was put on this condition first. The opposite, an opponent having a three in a row, a defensive piece would be the lowest weight since the opponent would be minimising the player’s move.
Overall, the agent plays relatively strongly. As minimax determines the next move in the future that the opponent or player may make, alpha beta pruning speeds up the process. The heuristic used is slow, but will analyse every position and give an appropriate weight that should help the minimax function make the correct decision.
Weaknesses:
The agent has a low depth, which could be caused by the machine that it was coded on (a very low power CPU) which means it cannot see as far ahead into the future as other agents, which would put it at a major disadvantage. This could have been improved by using other optimisation techniques to further improve the speed of which it could perform the heuristic, such as using a transposition table to cache moves, or implementing iterative deepening to increase the depth.
Possible improvements could be done to the heuristic function, as there tends to be many other rules and complexities associated with perfect games of connect 4. There are moves that are non-threatening or threatening, depending on the situation of the game that the heuristic cannot foresee and will attempt to place a piece in a non-optimal position.
A hash table can be used to speed up my heuristic function. In its current form it is quite slow, and would get slower the more rows and columns the game has. Each evaluation of a row or column has a big-O of x^2.