adeeshbhargava/MARL-Poker

MARL-Poker

This work explores strategies for adapting single-player reinforcement learning algorithms to competitive play in two-player adversarial games such as poker. To learn to play against a competitive opponent in the absence of an expert, we experiment with several training strategies: training against a random policy, adversarial training, and self-play. We run extensive experiments to test the effectiveness of each strategy and summarize our insights into training agents for competitive multiplayer environments.
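
The strategies above differ mainly in who the learner plays against. The following is a minimal sketch of two opponent types, not the repository's implementation: `RandomOpponent` and `SnapshotPool` are hypothetical names introduced here, and the snapshot-based form of self-play is an assumption (the README does not say how its "classic" and "improved" self-play variants work).

```python
import copy
import random


class RandomOpponent:
    """Hypothetical opponent that picks uniformly among the legal actions."""

    def act(self, legal_actions, rng):
        return rng.choice(legal_actions)


class SnapshotPool:
    """Hypothetical self-play helper that keeps frozen copies of the learner.

    Assumption: self-play is done against past versions of the learner's
    own policy. The policy can be any object that supports deep copying.
    """

    def __init__(self, max_size=5):
        self.max_size = max_size
        self.snapshots = []

    def add(self, policy):
        # Deep-copy so later updates to the learner do not alter the snapshot.
        self.snapshots.append(copy.deepcopy(policy))
        # Drop the oldest snapshot once the pool exceeds its capacity.
        if len(self.snapshots) > self.max_size:
            self.snapshots.pop(0)

    def sample(self, rng):
        # Pick one stored snapshot uniformly at random as the next opponent.
        return rng.choice(self.snapshots)
```

In a training loop, the learner would call `pool.add(policy)` every fixed number of updates and draw its next opponent with `pool.sample(rng)`; training against a random policy simply uses `RandomOpponent` throughout.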

Tournament:

| Player 1 | Player 2 | Strategy | Win-rate | Reward (1000 Games) | Winner |
| --- | --- | --- | --- | --- | --- |
| DQN_baseline | DQN_Independent | Cross Competition | 26%:71%:1% | 1329.5 | DQN_Independent |
| DQN_Self_Play_classic | DQN_Self_Play_improved | Cross Competition | 40%:58%:2% | 489 | DQN_Self_Play_2 |
| DQN_Independent | DQN_shared | Cross Competition | 55%:41%:4% | 464.5 | DQN_Independent |
| DQN_Self_Play_2 | DQN_Independent | Cross Competition | 58%:40%:2% | 462 | DQN_Self_Play_2 |
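
The README does not state how the Win-rate and Reward columns are computed. One plausible reading, sketched here as an assumption, is that the three percentages are Player 1 wins, Player 2 wins, and draws, and that the reward is summed over all games. The function name `summarize_matches` and the per-game `(p1_reward, p2_reward)` input format are hypothetical.

```python
def summarize_matches(rewards):
    """Summarize a series of two-player games.

    `rewards` is a list of (p1_reward, p2_reward) tuples, one per game.
    Assumption: a game counts as a win for the player with the higher
    reward, and as a draw when both rewards are equal.
    """
    n = len(rewards)
    if n == 0:
        raise ValueError("need at least one game")

    p1_wins = sum(1 for r1, r2 in rewards if r1 > r2)
    p2_wins = sum(1 for r1, r2 in rewards if r2 > r1)
    draws = n - p1_wins - p2_wins

    return {
        "p1_win_pct": 100.0 * p1_wins / n,
        "p2_win_pct": 100.0 * p2_wins / n,
        "draw_pct": 100.0 * draws / n,
        # Total reward collected by each player over all games.
        "p1_total_reward": sum(r1 for r1, _ in rewards),
        "p2_total_reward": sum(r2 for _, r2 in rewards),
    }
```

Under this reading, a row such as `58%:40%:2%` would correspond to `p1_win_pct`, `p2_win_pct`, and `draw_pct` over 1000 games.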

Final Results

Training:

| Player 1 | Player 2 | Strategy | Win-rate | Reward (1000 Games) | Winner |
| --- | --- | --- | --- | --- | --- |
| DQN Agent | Random Agent | DQN_Baseline | 81%:19%:0% | 1191.5 | DQN_baseline |
| DQN Agent | DQN Agent | Independent Learning | 51%:47%:2% | 141.5 | DQN_Independent |
| DQN Agent | DQN Agent | Shared Learning | 49%:49%:2% | 101 | DQN_shared |
| DQN Agent | DQN Agent | Self Play classic | 50%:48%:2% | 64.5 | DQN_self_play_1 |
| DQN Agent | DQN Agent | Self Play improved | 51%:47%:3% | 157.5 | DQN_self_play_2 |

Visualizations: Texas Hold'em Poker

Tournament

DQN Self Play Improved vs DQN_Independent (Finals)

DQN Independent vs DQN shared

DQN Self Play Classic vs DQN Self Play Improved

DQN baseline vs DQN Independent

Training

TensorBoard Plots: Leduc

TensorBoard Plots: Texas Hold'em

DQN vs Random

DQN vs DQN (Independent)

DQN vs DQN (Shared Policy)

DQN vs DQN (Self Play)

DQN vs DQN (Improved Self Play)

DQN Agent vs Random Agent Visualization on Texas Hold'em Poker

DQN_v_baseline_Texas.mp4

Installation:

PettingZoo[classic,butterfly]>=1.24.0

Pillow>=9.4.0

ray[rllib]==2.7.0

SuperSuit>=3.9.0

torch>=1.13.1

tensorflow-probability>=0.19.0
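
To check which of the packages listed above are present in the current environment, a small sketch like the following can help. It only reports installed versions and does not compare them against the version constraints; the function name `installed_versions` is hypothetical and not part of the repository.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names taken from the requirements listed above.
REQUIRED = [
    "PettingZoo",
    "Pillow",
    "ray",
    "SuperSuit",
    "torch",
    "tensorflow-probability",
]


def installed_versions(packages=REQUIRED):
    """Return a dict mapping each package name to its installed version.

    Packages that are not installed map to None.
    """
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver if ver is not None else 'not installed'}")
```

Any package reported as not installed, or at a version outside the ranges above, needs to be installed before running the training scripts.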
