GitHub - hurshprasad/RL-easy21

A game like blackjack, except that cards are drawn with full replacement and aces do not count as 1 or 11.

Reinforcement Learning approaches below.

Uses generalized policy iteration (GPI) to optimize Q, with a time-varying scalar step size and an ε-greedy exploration strategy.
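As a minimal sketch of what a time-varying step size and ε-greedy exploration can look like, the snippet below uses visit counts to decay both. The schedules `α = 1/N(s, a)` and `ε = N0/(N0 + N(s))`, the constant `N0 = 100`, the action names, and all function names are assumptions for illustration, not taken from this repository's code.

```python
import random
from collections import defaultdict

ACTIONS = ("hit", "stick")  # hypothetical action names
N0 = 100.0                  # assumed exploration constant

Q = defaultdict(float)      # action values, keyed by (state, action)
N_s = defaultdict(int)      # visit count per state
N_sa = defaultdict(int)     # visit count per (state, action)


def epsilon(state):
    """Exploration rate that decays as the state is visited more often."""
    return N0 / (N0 + N_s[state])


def step_size(state, action):
    """Scalar step size that shrinks with each visit to (state, action)."""
    return 1.0 / N_sa[(state, action)]


def choose_action(state, rng=random):
    """Epsilon-greedy: explore with probability epsilon, else act greedily."""
    if rng.random() < epsilon(state):
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])


def update(state, action, target):
    """Move Q(state, action) toward a target such as a sampled return."""
    N_s[state] += 1
    N_sa[(state, action)] += 1
    alpha = step_size(state, action)
    Q[(state, action)] += alpha * (target - Q[(state, action)])
```

With these schedules the estimate for a pair is the running mean of its targets, and the policy becomes greedier in states that have been visited often.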

Q^*(s,a) = Q(s,a) + α ζe_t(s,a)
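One way to read the update above is as the backward view of Sarsa(λ), assuming ζ denotes the TD error and e_t the eligibility trace. That reading is an assumption, and the function name, the accumulating-trace choice, and the default `gamma=1.0` in this sketch are hypothetical.

```python
from collections import defaultdict


def sarsa_lambda_step(Q, E, s, a, reward, s_next, a_next,
                      alpha, lam, gamma=1.0, terminal=False):
    """Apply one backward-view Sarsa(lambda) update to Q and E in place."""
    # TD error: reward plus discounted next value, minus the current estimate.
    q_next = 0.0 if terminal else Q[(s_next, a_next)]
    td_error = reward + gamma * q_next - Q[(s, a)]

    # Accumulating trace: mark the pair that was just visited.
    E[(s, a)] += 1.0

    # Every traced pair moves in proportion to its trace, then the trace decays.
    for key in list(E):
        Q[key] += alpha * td_error * E[key]
        E[key] *= gamma * lam
    return td_error
```

Here `Q` and `E` are expected to be `defaultdict(float)` instances, with `E` reset at the start of each episode.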

Q(s, a) = Φ(s, a)^T θ

The feature vector Φ uses overlapping coarse coding over the state space, which is described by the player's sum and the dealer's initial value.
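A minimal sketch of such a feature vector and the linear value Q(s, a) = Φ(s, a)^T θ follows. The interval boundaries are the ones commonly used for Easy21 and are an assumption here, as are the action names and function names; the repository's actual intervals may differ.

```python
# Assumed overlapping intervals; neighbouring intervals share values.
DEALER_INTERVALS = ((1, 4), (4, 7), (7, 10))
PLAYER_INTERVALS = ((1, 6), (4, 9), (7, 12), (10, 15), (13, 18), (16, 21))
ACTIONS = ("hit", "stick")  # hypothetical action names


def features(dealer, player, action):
    """Binary coarse-coded vector: one entry per (dealer, player, action) cell."""
    phi = []
    for d_lo, d_hi in DEALER_INTERVALS:
        for p_lo, p_hi in PLAYER_INTERVALS:
            for a in ACTIONS:
                active = (d_lo <= dealer <= d_hi
                          and p_lo <= player <= p_hi
                          and a == action)
                phi.append(1.0 if active else 0.0)
    return phi


def q_value(theta, dealer, player, action):
    """Linear approximation: dot product of the features with the weights."""
    phi = features(dealer, player, action)
    return sum(f * w for f, w in zip(phi, theta))
```

Because the intervals overlap, a single state can activate several features at once, which lets nearby states share what is learned in θ.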
