This project reimagines the classic Snake Game with Reinforcement Learning (RL). The snake is trained to autonomously navigate the grid, collect food, and avoid collisions using a variety of RL algorithms.
The implementation uses:
- Q-learning (initial baseline)
- Deep Q-Networks (DQN) for improved state generalisation
- Double DQN to address Q-value overestimation
- Hamiltonian Cycle strategy (generated with Prim's algorithm, with a Breadth-First Search (BFS) fallback) to ensure safe traversal in complex board states
- Core Game: Built in Pygame with custom snake, food, and scoreboard logic.
- Reinforcement Learning:
- Q-learning (basic)
- DQN (neural network-based Q-learning)
- Double DQN (improves stability and performance)
- Fallback Strategies:
- BFS-based safe path search for mid-game navigation
- Hamiltonian cycle generation (Primβs algorithm) for long survival runs
- Saved Weights: Automatically loads existing `.pth` or `.npy` weights for resuming training or running greedy play.
- Custom UI: `GameOver` overlay with restart option.
- Experimentation: Multiple play/test scripts (`Small Grid`, `Full Hamiltonian`, etc.) to evaluate different strategies.
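As a concrete illustration of the BFS fallback listed above, here is a minimal, self-contained sketch of a safe-path search on the grid. The function name, grid representation, and coordinates are illustrative assumptions, not the repo's actual API:

```python
from collections import deque

def bfs_path(start, goal, blocked, width, height):
    """Breadth-first search for a shortest path on the grid,
    treating the snake's body cells as blocked.
    Returns the list of cells from start to goal, or None if no safe path exists."""
    queue = deque([start])
    came_from = {start: None}  # also doubles as the visited set
    while queue:
        x, y = queue.popleft()
        if (x, y) == goal:
            # walk the parent chain back to the start
            path, cell = [], goal
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (x + dx, y + dy)
            if (0 <= nxt[0] < width and 0 <= nxt[1] < height
                    and nxt not in blocked and nxt not in came_from):
                came_from[nxt] = (x, y)
                queue.append(nxt)
    return None

# Toy 5x5 grid: food at (4, 4), snake body blocking part of row 2
path = bfs_path((0, 0), (4, 4), blocked={(1, 2), (2, 2), (3, 2)}, width=5, height=5)
```

Because BFS explores cells in order of distance, the first path found is guaranteed shortest, which is why it makes a reliable (if not score-optimal) mid-game fallback.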
**pygame_snake_game**/
├── **pygame_snake_game**
│   ├── **AI_Snake_Play**
│   │   ├── Full_Ham_implementation.py # Snake with Hamiltonian Cycle strategy
│   │   ├── main.py # Human-playable game
│   │   ├── Play_game_Small_grid.py # Play Snake on a 400x400 grid with RL + fallback
│   │   ├── Play_game_Snake_AI.py # Pure Snake AI with Double Deep Q-Networks
│   │   └── Play_game_Advanced_Hamiltonian_cycle.py # Play Snake on an 800x600 grid with RL + fallback
│   ├── **game_attributes**
│   │   ├── gameover.py # GameOver screen logic
│   │   └── snake.py, food.py, scoreboard.py, snake_game.py # Core game logic
│   ├── **Hamiltonian_Implementation**
│   │   ├── ham_cycle.py
│   │   ├── Hamiltonian_cycle.py
│   │   ├── nan.py
│   │   └── prims_algorithm.py
│   ├── **RL_agents_Training**
│   │   ├── Rainbow-RL.py # Original RL code with just Q-learning
│   │   ├── Rl_model.py # Pure Q-learning baseline
│   │   ├── RL_Agent_with_DDQN_CNN.py # Double-DQN agent with a 7x7 or 5x5 grid CNN
│   │   ├── RL_model_optimizing_Q.py # Optimisation for Q-learning
│   │   └── SG_Double_DDQN_training.py # Double DQN training on a smaller grid
│   └── **WEIGHTS**
│       ├── old_q_table_files/ # Archived Q-tables
│       ├── Current_q_TABLE/ # Current training tables/weights
│       ├── RAINBOW_WEIGHTS/ # Training weights for the Rainbow implementation
│       └── weight_file_for_DQN/ # Saved DDQN model weights
├── **pygame_snake_game_gg_colab** # Folder for training on a Google Colab GPU (work in progress)
└── **Turtle_snake-game** # Turtle version of the game
| Model / Strategy | Highest Score | Highest Mean Score | Notes |
|---|---|---|---|
| `Current DQN WEIGHTS/snake_dqn.pth` | 59 | N/A | Baseline DQN training run |
| `Current DQN WEIGHTS/Weight_wn_Reward_sys.pth` | 67 | 1900 | Improved with reward shaping |
| `Current DQN WEIGHTS/Best_current_weight.pth.pth` | 52 | 1500 | Considered the best-performing DQN so far |
| `CNN_weights(5x5).pth` | 32 | 1000 | CNN filter size 5x5 |
| `CNN_weights(7x7).pth` | 16 | 600 | CNN filter size 7x7 |
| Hamiltonian Cycle | ∞ | N/A | Survives indefinitely once the path is set |
| BFS (Breadth-First Search) | 150 | N/A | Reliable but not optimal |
| SG (400×400 grid) | 54 | 1700 | Double DQN on the small grid |
Demo video: `0822.mp4`
**Q-learning (baseline):**
- Ran for 16+ hours but struggled to generalise.
- The agent often looped near food, exploiting the reward shaping.
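For reference, the tabular baseline boils down to the standard Q-learning update rule. Below is a minimal sketch; the state encoding, table size, and hyperparameter values are illustrative assumptions, not the repo's exact settings:

```python
import numpy as np

# Hypothetical compact state space (danger flags + food direction) and
# 3 relative actions (turn left / go straight / turn right).
N_STATES, N_ACTIONS = 72, 3
q_table = np.zeros((N_STATES, N_ACTIONS))

ALPHA, GAMMA = 0.1, 0.9  # learning rate and discount factor (illustrative values)

def q_update(state, action, reward, next_state, done):
    """Standard tabular Q-learning update:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    target = reward if done else reward + GAMMA * q_table[next_state].max()
    q_table[state, action] += ALPHA * (target - q_table[state, action])

# Example: one transition where eating food gives a +10 reward
q_update(state=0, action=1, reward=10.0, next_state=5, done=False)
```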
**DQN:**
- A neural network approximates Q-values for the large state space.
- Much faster convergence than tabular Q-learning.
- Works well when the state vector is large.
- Currently the main training algorithm.
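One common way to build such a state vector for Snake (an illustrative assumption, not necessarily the exact encoding this repo uses) is a handful of collision-danger flags plus the direction of the food relative to the head:

```python
def state_vector(head, body, food, width, height):
    """Illustrative 8-feature state encoding: collision danger in the four
    absolute directions, plus the sign of the food offset on each axis."""
    x, y = head

    def danger(nx, ny):
        # 1.0 if moving to (nx, ny) would hit a wall or the snake's body
        return float(nx < 0 or nx >= width or ny < 0 or ny >= height
                     or (nx, ny) in body)

    return [
        danger(x + 1, y), danger(x - 1, y), danger(x, y + 1), danger(x, y - 1),
        float(food[0] > x), float(food[0] < x),   # food to the right / left
        float(food[1] > y), float(food[1] < y),   # food below / above
    ]

# Head at the left wall, body segment to its right, food further right
s = state_vector(head=(0, 3), body={(1, 3)}, food=(5, 3), width=10, height=10)
```

A compact vector like this lets a small fully-connected network generalise across board positions instead of memorising them, which is the main win over the tabular approach.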
**Double DQN:**
- Fixes the Q-value overestimation common in vanilla DQN.
- Produces more stable and consistent snake behaviour.
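The core of Double DQN is how the bootstrap target is computed: the online network *selects* the next action, while the separate target network *evaluates* it, which decouples selection from evaluation and curbs overestimation. A minimal NumPy sketch of that target computation (batch shapes and values are toy examples, not the repo's training code):

```python
import numpy as np

def double_dqn_targets(rewards, next_q_online, next_q_target, dones, gamma=0.99):
    """Double DQN target: r + gamma * Q_target(s', argmax_a Q_online(s', a))."""
    best_actions = next_q_online.argmax(axis=1)                 # selection: online net
    evaluated = next_q_target[np.arange(len(best_actions)), best_actions]  # evaluation: target net
    return rewards + gamma * evaluated * (1.0 - dones)          # zero bootstrap on terminal states

# Toy batch of 2 transitions with 3 actions each
rewards = np.array([1.0, 0.0])
next_q_online = np.array([[0.1, 0.9, 0.2], [0.5, 0.4, 0.3]])
next_q_target = np.array([[0.0, 0.5, 1.0], [0.2, 0.1, 0.0]])
dones = np.array([0.0, 1.0])  # the second transition is terminal
targets = double_dqn_targets(rewards, next_q_online, next_q_target, dones)
```

Vanilla DQN would instead take `next_q_target.max(axis=1)` directly, which systematically picks the most optimistic (and often overestimated) value.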
**Hamiltonian Cycle + BFS fallback:**
- Implemented Prim's algorithm to generate the Hamiltonian cycle.
- Added BFS safe-path search as a mid-game shortcut.
- Uses cycle rotation to continue safe traversal when the board is nearly full.
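One standard way to combine Prim's algorithm with Hamiltonian cycle generation is to run Prim's on the half-resolution block grid to get a spanning tree, give each 2x2 block of cells its own small loop, and then merge loops across every tree edge until one cycle covers the whole board. This is a sketch of that general technique under those assumptions, not necessarily this repo's exact implementation:

```python
import random

def neighbors(x, y, w, h):
    return [(x + dx, y + dy) for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
            if 0 <= x + dx < w and 0 <= y + dy < h]

def prims_spanning_tree(bw, bh, seed=0):
    """Prim's algorithm on the half-resolution block grid:
    returns a set of edges ((bx, by), (nbx, nby)) forming a spanning tree."""
    rng = random.Random(seed)
    in_tree, edges = {(0, 0)}, set()
    frontier = [((0, 0), n) for n in neighbors(0, 0, bw, bh)]
    while frontier:
        a, b = frontier.pop(rng.randrange(len(frontier)))
        if b in in_tree:
            continue
        in_tree.add(b)
        edges.add((a, b))
        frontier.extend((b, n) for n in neighbors(*b, bw, bh) if n not in in_tree)
    return edges

def hamiltonian_cycle(width, height, seed=0):
    """Hamiltonian cycle on a width x height grid (both even)."""
    bw, bh = width // 2, height // 2
    adj = {}
    def link(a, b):
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    def unlink(a, b):
        adj[a].discard(b)
        adj[b].discard(a)
    # give every 2x2 block its own 4-cell loop
    for bx in range(bw):
        for by in range(bh):
            x, y = 2 * bx, 2 * by
            corners = [(x, y), (x + 1, y), (x + 1, y + 1), (x, y + 1)]
            for i in range(4):
                link(corners[i], corners[(i + 1) % 4])
    # merge the two adjacent loops across every spanning-tree edge
    for (ax, ay), (bx2, by2) in prims_spanning_tree(bw, bh, seed):
        if ax != bx2:                      # horizontal tree edge
            lx, y = 2 * min(ax, bx2) + 1, 2 * ay
            unlink((lx, y), (lx, y + 1))
            unlink((lx + 1, y), (lx + 1, y + 1))
            link((lx, y), (lx + 1, y))
            link((lx, y + 1), (lx + 1, y + 1))
        else:                              # vertical tree edge
            x, ly = 2 * ax, 2 * min(ay, by2) + 1
            unlink((x, ly), (x + 1, ly))
            unlink((x, ly + 1), (x + 1, ly + 1))
            link((x, ly), (x, ly + 1))
            link((x + 1, ly), (x + 1, ly + 1))
    # every cell now has degree 2: walk the single cycle from (0, 0)
    cycle, prev, cur = [(0, 0)], None, (0, 0)
    while True:
        nxt = next(n for n in adj[cur] if n != prev)
        if nxt == (0, 0):
            return cycle
        cycle.append(nxt)
        prev, cur = cur, nxt

cycle = hamiltonian_cycle(8, 6)  # e.g. an 8x6 cell board
```

Because the block-level edges form a spanning tree, each merge joins two disjoint loops into one, so the final result is a single cycle visiting every cell exactly once, which is what lets the snake survive indefinitely.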
- Training and evaluating Double DQN agent.
- Integrating Hamiltonian fallback for robust play.
- Considering advanced algorithms (Rainbow DQN, etc.) for further performance boost.
Clone the repo:

```shell
git clone https://github.com/kdavid001/snake_game.git
cd snake-game/pygame_snake_game
```

Install dependencies:

```shell
pip install pygame torch numpy
```

Run a specific setup:

```shell
python Play_game_Small_grid.py      # Double DQN on smaller grid
python Full_Ham_implementation.py   # With Hamiltonian fallback
python Rl_model.py                  # Tabular Q-learning baseline
```
- Further optimise Double DQN hyperparameters.
- Try Rainbow DQN and other advanced RL algorithms.
- Try Rainbow and Double DQN with the full grid as input to a Convolutional Network instead of a handcrafted state vector.
- Improve reward shaping to try and reduce looping behaviour.
- Enhance Pygame UI with visual overlays for agent decisions and cycle paths.
Please feel free to contribute by opening an issue.
This project is licensed under the MIT License; see the LICENSE file for details.
Credits for sources of inspiration:
- William Hamilton (for Hamiltonian cycles): OpenStax link
- John Tapsell (Hamiltonian cycle method): John Tapsell's blog
- The YouTube video that inspired the Hamiltonian cycle idea: YouTube link
This project was created by David Ogunmola.
If you use this project in any way (including derivatives or distributions), please include visible credit to the author in your documentation, app interface, or any public display of the software.
Thank you for respecting the work!