Using Reinforcement Q-learning to teach an AI agent how to play a simple Bomberman game clone from scratch
This is a simple Q-learning algorithm in Python that teaches an AI how to play a generic Bomberman clone.
The Bomberman environment and its renderer were created in pure Python.
The sprites were created by the user VOXEL and released for the http://ludumdare.com/compo/2015/05/05/minild-59-swapshop/ game jam.
- A bomb takes 3 turns to explode.
- A bomb always explodes in a cross-shaped area (x+1, x-1, y+1 and y-1).
- A bomb destroys wall blocks.
- A bomb kills the Agent if it is inside the blast area.
- The game ends after a maximum of 100 turns.
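The cross-shaped blast described in the rules above can be sketched as follows. This is a minimal illustration, not the project's actual code; the grid encoding and function names are assumptions:

```python
# Sketch of the cross-shaped bomb blast (hypothetical names, not the
# project's actual implementation). The grid is a list of rows, where
# WALL cells are destructible blocks.
WALL, EMPTY = 1, 0

def explode(grid, x, y):
    """Destroy wall blocks in a cross around (x, y); return the count."""
    # The blast covers the bomb's cell plus (x+1, x-1, y+1, y-1).
    hit = [(x, y), (x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    destroyed = 0
    for cx, cy in hit:
        # Ignore cells outside the map.
        if 0 <= cy < len(grid) and 0 <= cx < len(grid[0]):
            if grid[cy][cx] == WALL:
                grid[cy][cx] = EMPTY
                destroyed += 1
    return destroyed
```

For example, a bomb placed in the middle of a 3x3 field of wall blocks destroys the 5 cells of the cross and leaves the 4 corners intact.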
(source: https://blog.goodaudience.com/deep-q-learning-a-reinforcement-learning-algorithm-d1a93b754535)
```python
def get_reward(self):
    if self.value_after > 0:
        r = (self.value_after ** 2) * (1 - (self.Turn / float(100)))
    else:
        r = -1 * (self.Turn / float(100))
    return r
```
- `self.value_after`: the delta in the number of blocks remaining in the scenario before and after the agent performs a given action.
- `self.Turn`: the current turn.
- `float(100)`: the total number of turns.
The reward is larger when the agent destroys more blocks in the early game phase. (This reward function forces the Agent to focus on destroying blocks instead of spamming bombs randomly in empty spaces.)
If the agent doesn't destroy any block, it is penalized (negative reward) in a time-dependent manner: not destroying blocks becomes more expensive with each passing turn.
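To build intuition for the two branches of the reward, here is a standalone copy of the same formula with a couple of worked values (the inputs are made up for illustration):

```python
def get_reward(value_after, turn, max_turns=100):
    """Standalone version of the reward function above, for experimenting."""
    if value_after > 0:
        # Destroying blocks yields a quadratic reward, scaled down
        # as the game progresses (early destruction pays more).
        return (value_after ** 2) * (1 - turn / float(max_turns))
    # No blocks destroyed: the penalty grows linearly with the turn count.
    return -1 * (turn / float(max_turns))
```

Destroying 2 blocks on turn 10 gives `2**2 * (1 - 0.1) = 3.6`, while destroying nothing on turn 50 gives `-0.5`; the same idle behaviour on turn 90 would cost `-0.9`.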
Result after 0 episodes.
Result after 20000 episodes.
Result after 980000 episodes (convergence happened much earlier, but I posted the last episode).