Silidrone/ai-taggame
AI Tag Game

A reinforcement learning project that trains an AI agent to evade a chasing opponent in a 2D tag game environment. The evader agent learns to survive as long as possible against various chaser strategies using Proximal Policy Optimization (PPO) with curriculum learning.

Overview

The project features:

  • PPO-based training using Stable Baselines3
  • Curriculum learning against multiple deterministic chaser policies
  • Parallel environment processing for efficient training
  • Comprehensive evaluation tools with visualization support

Requirements

Install dependencies:

pip install -r requirements.txt

Training

Run the training script from the project root:

python environments/taggame/train.py [OPTIONS]

Options

Option       Description                      Default
--log-dir    Output directory                 Auto-generated timestamp
--timesteps  Total training timesteps         1,000,000
--n-envs     Number of parallel environments  8
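The options above map naturally onto a standard argparse setup. The following is a minimal sketch of how such a CLI might be parsed; the actual parser in environments/taggame/train.py may differ in details.

```python
import argparse

def build_parser():
    # Hypothetical parser mirroring the documented options and defaults;
    # train.py's real parser may name or group things differently.
    parser = argparse.ArgumentParser(description="Train the PPO evader")
    parser.add_argument("--log-dir", default=None,
                        help="Output directory (default: auto-generated timestamp)")
    parser.add_argument("--timesteps", type=int, default=1_000_000,
                        help="Total training timesteps")
    parser.add_argument("--n-envs", type=int, default=8,
                        help="Number of parallel environments")
    return parser

# Equivalent to: python train.py --timesteps 500000 --n-envs 16
args = build_parser().parse_args(["--timesteps", "500000", "--n-envs", "16"])
```

With no arguments, the defaults from the table apply and `args.log_dir` stays `None`, signalling the script to generate a timestamped directory.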

Examples

# Basic training
python environments/taggame/train.py

# Custom training run
python environments/taggame/train.py --timesteps 500000 --n-envs 16

# Specify output directory
python environments/taggame/train.py --log-dir data/taggame/my_experiment --timesteps 2000000

Output

Training artifacts are saved to data/taggame/train_<TIMESTAMP>/:

  • best_model.zip - Best performing model during training
  • final.zip - Final model after training completes
  • ppo_taggame_*_steps.zip - Checkpoints every 50k steps
  • tensorboard/ - TensorBoard logs for monitoring
  • training.log - Training log file

Monitoring

View training progress with TensorBoard:

tensorboard --logdir data/taggame/train_<TIMESTAMP>/tensorboard

Evaluation

Run the evaluation script:

python environments/taggame/evaluate.py [OPTIONS]

Options

Option                  Description                                Default
--model                 Path to model file                         Auto-finds best model
--policies              Policy indices (e.g., "0,1,7") or "all"    all
--episodes              Episodes per policy                        10
--max-steps             Max steps per episode                      1000
--render                Enable visual rendering                    False
--fps                   FPS limit for rendering                    60
--deterministic-evader  Use rule-based evader instead of RL model  False
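The --policies value accepts either a comma-separated list of indices or the keyword "all". A small sketch of how that string might be expanded into a list of policy indices (parse_policies is a hypothetical helper, not necessarily what evaluate.py uses):

```python
NUM_POLICIES = 9  # indices 0-8, per the chaser policy table below

def parse_policies(value: str) -> list[int]:
    """Expand a --policies value like "0,2,5" or "all" into index list."""
    if value == "all":
        return list(range(NUM_POLICIES))
    return [int(idx) for idx in value.split(",")]

parse_policies("0,2,5")  # [0, 2, 5]
parse_policies("all")    # [0, 1, 2, 3, 4, 5, 6, 7, 8]
```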

Examples

# Evaluate against all chaser policies
python environments/taggame/evaluate.py --policies all --episodes 10

# Evaluate specific policies with rendering
python environments/taggame/evaluate.py --policies 0,2,5 --render --fps 60

# Evaluate a specific model
python environments/taggame/evaluate.py --model data/taggame/train_20251223/best_model.zip

# Compare against deterministic baseline
python environments/taggame/evaluate.py --deterministic-evader --policies all

Chaser Policies

The evader trains against these deterministic chaser strategies:

Index  Policy                Description
0      DirectChasePolicy     Simple direct pursuit toward the evader
1      InterceptChasePolicy  Predicts and intercepts the evader's position
2      CornerCutPolicy       Cuts off corners to trap the evader
3      ZigzagChasePolicy     Zigzag movement pattern
4      SpiralChasePolicy     Spiral chase pattern
5      RandomWalkPolicy      Random movement
6      AmbushPolicy          Waits for an opportunity to strike
7      ChaoticChasePolicy    Unpredictable, aggressive pursuit
8      HumanLikePolicy       Human-like chase behavior
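To make the policy table concrete, here is a hypothetical sketch of what the simplest strategy (index 0, direct pursuit) could look like: move the chaser a bounded step straight toward the evader's current position. This is illustrative only, not the repo's DirectChasePolicy implementation.

```python
import math

def direct_chase_step(chaser_pos, evader_pos, speed):
    """One update of a direct-pursuit chaser: step toward the evader,
    covering at most `speed` units of distance."""
    dx = evader_pos[0] - chaser_pos[0]
    dy = evader_pos[1] - chaser_pos[1]
    dist = math.hypot(dx, dy)
    if dist == 0:
        return chaser_pos  # already on top of the evader
    step = min(speed, dist)  # don't overshoot the target
    return (chaser_pos[0] + step * dx / dist,
            chaser_pos[1] + step * dy / dist)
```

The richer policies in deterministic_policies/ (intercept, ambush, zigzag, etc.) vary the target point or timing, but follow the same step-update shape.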

Configuration

Key settings are in environments/taggame/config.py:

Game Settings

WIDTH = 900           # Game window width
HEIGHT = 800          # Game window height
PLAYER_RADIUS = 10    # Player collision radius
MAX_VELOCITY = 100    # Maximum player velocity
MAX_EPISODE_STEPS = 1000  # Episode length limit
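These constants imply two common per-step rules: cap speed at MAX_VELOCITY and keep players inside the window, accounting for PLAYER_RADIUS. A minimal sketch under those assumptions (the actual movement code in taggame.py may handle walls differently, e.g. with bouncing):

```python
WIDTH, HEIGHT = 900, 800   # values from config.py
PLAYER_RADIUS = 10
MAX_VELOCITY = 100

def clamp(value, lo, hi):
    return max(lo, min(hi, value))

def apply_velocity(x, y, vx, vy, dt=1.0):
    # Rescale the velocity vector if it exceeds MAX_VELOCITY
    speed = (vx * vx + vy * vy) ** 0.5
    if speed > MAX_VELOCITY:
        vx, vy = vx * MAX_VELOCITY / speed, vy * MAX_VELOCITY / speed
    # Keep the player's center at least one radius from each wall
    x = clamp(x + vx * dt, PLAYER_RADIUS, WIDTH - PLAYER_RADIUS)
    y = clamp(y + vy * dt, PLAYER_RADIUS, HEIGHT - PLAYER_RADIUS)
    return x, y
```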

Policy Weights

Policy weights control the curriculum learning distribution. Higher weights mean the policy is used more frequently during training:

POLICY_WEIGHTS = [
    54.9,   # DirectChasePolicy
    11.8,   # InterceptChasePolicy
    54.4,   # CornerCutPolicy
    20.4,   # ZigzagChasePolicy
    5.0,    # SpiralChasePolicy
    72.2,   # RandomWalkPolicy
    7.9,    # AmbushPolicy
    42.7    # ChaoticChasePolicy
]
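A weighted distribution like this is typically sampled once per episode to pick the chaser. One way to do that with the standard library, shown as a sketch (the repo's sampling code may differ; note config.py lists weights for eight of the nine policies):

```python
import random

POLICY_WEIGHTS = [54.9, 11.8, 54.4, 20.4, 5.0, 72.2, 7.9, 42.7]

def sample_chaser_index(rng=random):
    # random.choices normalizes the weights internally, so the raw
    # values from config.py can be used without rescaling.
    return rng.choices(range(len(POLICY_WEIGHTS)),
                       weights=POLICY_WEIGHTS, k=1)[0]
```

With these weights, RandomWalkPolicy (72.2) is drawn roughly 14x as often as SpiralChasePolicy (5.0).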

Project Structure

ai-taggame/
├── README.md
├── requirements.txt
├── rl.py                    # Custom DQN implementation (legacy)
├── mdp.py                   # Abstract MDP base class
├── util.py                  # Utility functions
└── environments/
    └── taggame/
        ├── config.py        # Configuration & hyperparameters
        ├── taggame.py       # Core game environment
        ├── gym_wrapper.py   # Gymnasium wrapper for PPO
        ├── train.py         # PPO training script
        ├── evaluate.py      # Evaluation script
        ├── tag_player.py    # Player entity class
        ├── static_info.py   # Helper classes
        └── deterministic_policies/
            ├── __init__.py
            ├── direct_chase.py
            ├── intercept_chase.py
            ├── corner_cut.py
            ├── zigzag_chase.py
            ├── spiral_chase.py
            ├── random_walk.py
            ├── ambush.py
            ├── chaotic_chase.py
            ├── human_like.py
            └── evader_policy.py
