- Project1 - Tic-Tac-Toe and Maze (DFS)
- Project2 - Generalized Tic-Tac-Toe (Minimax & Alpha-Beta Pruning)
- Project3 - N-Queens (Iterative Search Algorithm)
- Project4 - MDP / HMM
-
Implement a simple Tic Tac Toe game with a 3*3 grid. Indicate if the game is a lose, win, or draw.
-
Given a separate maze file in text format. In this maze, you can go up, down, left, or right. The maze is 81*81 binary matrix where each line in the file represents a row and its values are separated with a space. 1 indicates a block that you cannot pass. 0 indicates a clear space that you can pass from. Implement a program that reads this maze and takes any two points (start and end indices) as inputs and tells whether there is a path in this maze between such points.
Project2 - Generalized Tic-Tac-Toe (Minimax & Alpha-Beta Pruning)
A generalized Tic Tac Toe is an n*n board game where each player chooses one of the parts X or O, and then plays in an alternate order to place his choice on the board. A player wins when he places m parts of his choice in a consecutive order. The game may end in a draw when no one wins. Given m and n, the agent can play against another agent in an n*n board and tries to place m parts in a row to win.
Minimax with Alpha-Beta Pruning
We define an evaluation function based on the idea of counting winning windows, which definition can be found in this article. I made further improvements that only update the winning windows and board scores containing the current move.
Solves N-Queen in n*n grid. Start with a random board, with one queen in each column.
min-conflicts heuristic
Iterative Search Algorithm
while not solved,
- Variable selection: randomly select any conflicted queen variable
- Operators: move queen in column
- Value selection: min-conflicts heuristic
- Choose a row value that conflicts the fewest queens
- Goal test: no attacks
- Evaluation: c(n) = number of attacks
A maze-like problem
- The agent lives in a grid
- Walls block the agent’s path
Noisy movement: actions do not always go as planned
- 80% of the time, the action North takes the agent North (if there is no wall there)
- 10% of the time, North takes the agent West; 10% East
- If there is a wall in the direction the agent would have been taken, the agent stays put
The agent receives rewards each time step
- Small “living” reward each step (can be negative)
- Big rewards come at the end (good or bad)
Goal: maximize sum of rewards
Implement a stochastic grid world problem given in gridA.1.csv or gridA.2.csv using MDP with the below information. Indicate v* for all cells in the grid.
Discount factor = 0.9 and Living reward = -0.01.
In the given file: final states -> 1000 or -800 ; “0” -> empty cells ; “-“ -> walls
Noise | Direction |
---|---|
60% | N |
15% | E |
10% | W |
15% | S |
Project4(b) - Hidden Markov Models(HMM)
Consider a casino game, where the dealer is a three-sided die with labels 1, 2, and 3. The dealer has three loaded dice D1 D2 D3 . For each die Di, the probability of rolling the number i is 0.6, and the probability of each of the other two outcomes is 0.2. At each turn, the dealer must decide whether to: (1) keep the same die; (2) switch to a different die randomly; (3) end the game. He chooses (1) with probability ½ and each of the others with probability ¼. At the beginning the dealer chooses one of the three dice with equal probability.
- Give an HMM for this situation. Specify the alphabet, the states, the transition probabilities and the emission probabilities.
- Suppose that you observe the following sequence of die rolls: x. (x is given to you in the input file.)
- Find a sequence of states which best explains the sequence of rolls. What is the probability of this sequence?