0.0.2 • Published 9 years ago
no-thanks-predictors v0.0.2
No Thanks Predictors
Zack Tillotson
A set of optimizable predictors (aka players) for the game No Thanks!
npm install
npm test
Strategy for Reinforcement Learner
- LM101 Link
- Strategy
  - Define state vector
  - Create game simulator
    - What are the results of actions
    - Who are the opponents
  - Reinforcement signal
    - Special feature => small when doing well and large when doing poorly
    - e.g.
      - RS = (us - best competitor) / (total score taken + 1) * 50 + 50
        - 0-100 while playing
        - 50 is tied with everyone
        - 100 is losing by the most possible
        - 0 is winning by the most possible
      - 10x when the game is over and we have lost
      - 1/10x when the game is over and we have won
  - Develop control law
    - Linear combination of features
    - Prob(Action|State) = sum( Feature_n(state) * W(n) )
    - Features
      - Table
        - Pot
        - Card ?
          - Redundant when player card value considered
        - Cards left
      - For each player
        - Your money
        - Your card total
        - Card value to you
  - Develop learning rule
    - Adaptive gradient descent
    - W'(n) = W(n) (Action - Prob(action)) Reinforcement signal
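
The outline above can be sketched in JavaScript as three small functions: the reinforcement signal, the linear control law, and one possible learning step. This is only an illustration, not the package's actual code. All function names, parameter names, and sample numbers are hypothetical. The learning rule in the outline does not show its operators, so the sketch assumes an additive step scaled by the feature value and by a caller-supplied `stepSize`; that reading, and the choice to leave a tied final score unscaled, are assumptions.

```javascript
// Sketch only: names and sample values are hypothetical, not the package's API.

// Reinforcement signal from the outline:
//   RS = (us - best competitor) / (total score taken + 1) * 50 + 50
// Lower scores are better in No Thanks!, so RS above 50 means we are behind.
function reinforcementSignal(ourScore, bestCompetitorScore, totalScoreTaken, gameOver) {
  let rs = ((ourScore - bestCompetitorScore) / (totalScoreTaken + 1)) * 50 + 50;
  if (gameOver) {
    if (ourScore > bestCompetitorScore) {
      rs *= 10; // game over and we lost: 10x
    } else if (ourScore < bestCompetitorScore) {
      rs /= 10; // game over and we won: 1/10x
    }
    // Assumption: a tie at game over leaves the signal unscaled.
  }
  return rs;
}

// Control law from the outline:
//   Prob(Action|State) = sum( Feature_n(state) * W(n) )
// A raw linear combination is not guaranteed to fall within [0, 1].
function actionProbability(features, weights) {
  if (features.length !== weights.length) {
    throw new Error("features and weights must have the same length");
  }
  return features.reduce((sum, feature, n) => sum + feature * weights[n], 0);
}

// One possible reading of the learning rule (an assumption, see the text above):
//   W'(n) = W(n) + stepSize * (Action - Prob(action)) * RS * Feature_n(state)
// `action` is 1 if the action was taken and 0 otherwise.
// The sign and size of `stepSize` are left to the caller.
function updateWeights(weights, features, action, prob, rs, stepSize) {
  return weights.map((w, n) => w + stepSize * (action - prob) * rs * features[n]);
}

// Usage with made-up numbers: pot size, scaled cards left, card value to us.
const sampleFeatures = [3, 0.5, 1];
const sampleWeights = [0.01, 0.2, -0.05];
const sampleProb = actionProbability(sampleFeatures, sampleWeights);
const sampleSignal = reinforcementSignal(12, 8, 40, false);
const nextWeights = updateWeights(
  sampleWeights, sampleFeatures, 1, sampleProb, sampleSignal, 0.001
);
console.log({ sampleProb, sampleSignal, nextWeights });
```

Keeping the three pieces as pure functions makes each one easy to unit test and to swap out while tuning the predictors, for example to try a different scaling of the signal or a different update rule.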