
A Python toolkit used to train reinforcement learning algorithms against arcade games


gnbk/MAMEToolkit

 
 


The MAME RL Algorithm Training Toolkit

About

This Python library has the potential to train your reinforcement learning algorithm on almost any arcade game. It is currently available on Linux systems and works as a wrapper around MAME. The toolkit allows your algorithm to step through gameplay while receiving the frame data and internal memory address values for tracking the game's state, along with sending actions to interact with the game.

Requirements:

  • Operating system: Linux
  • Python version: 3.6+

Installation

You can install the library with pip:

pip install MAMEToolkit

DISCLAIMER: We are unable to provide you with any game ROMs. It is the user's own legal responsibility to acquire a game ROM for emulation. This library should only be used for non-commercial research purposes.

There are some free ROMs available at: [https://www.mamedev.org/roms/]

Street Fighter Random Agent Demo

The toolkit has so far been applied to Street Fighter III Third Strike: Fight for the Future (Japan 990608, NO CD), but it can be modified for any game available on MAME. The following demonstrates how a random agent can be written for a Street Fighter environment.

import random
from MAMEToolkit.sf_environment import Environment

roms_path = "roms/"  # Replace this with the path to your ROMs
env = Environment("env1", roms_path)
env.start()
while True:
    move_action = random.randint(0, 8)
    attack_action = random.randint(0, 9)
    frames, reward, round_done, stage_done, game_done = env.step(move_action, attack_action)
    if game_done:
        env.new_game()
    elif stage_done:
        env.next_stage()
    elif round_done:
        env.next_round()

The toolkit also supports hogwild training:

from multiprocessing import Process
import random
from MAMEToolkit.sf_environment import Environment


def run_env(worker_id, roms_path):
    env = Environment(f"env{worker_id}", roms_path)
    env.start()
    while True:
        move_action = random.randint(0, 8)
        attack_action = random.randint(0, 9)
        frames, reward, round_done, stage_done, game_done = env.step(move_action, attack_action)
        if game_done:
            env.new_game()
        elif stage_done:
            env.next_stage()
        elif round_done:
            env.next_round()


workers = 8
# Each worker process creates its own environment inside run_env
roms_path = "roms/"  # Replace this with the path to your ROMs
processes = [Process(target=run_env, args=(i, roms_path)) for i in range(workers)]
for process in processes:
    process.start()

Setting Up Your Own Game Environment

Game IDs
To create an emulation of a game, you must first have the ROM for that game and know the game ID used by MAME. For example, the ID for this version of Street Fighter is 'sfiii3n'. The ID of your game can be found by running:

from MAMEToolkit.emulator import see_games
see_games()

This will bring up the MAME emulator. You can search through the list of games to find the one you want. The ID of the game is always in brackets at the end of the game title.

Memory Addresses
Interacting with the emulator through the toolkit takes little code. The challenge is finding the memory addresses that hold the internal state you care about, and tracking that state in your environment class. A game's internal memory can be inspected with the MAME Cheat Debugger, which lets you track how the values at memory addresses change over time.

The cheat debugger can be run using the following:

from MAMEToolkit.emulator import run_cheat_debugger
roms_path = "roms/" # Replace this with the path to your ROMs
game_id = "sfiii3n"
run_cheat_debugger(roms_path, game_id)

For information about using the debugger, see the Memory dump section of this tutorial [https://www.dorkbotpdx.org/blog/skinny/use_mames_debugger_to_reverse_engineer_and_extend_old_games]

Once you have determined the memory addresses you wish to track you can start the emulation using:

from MAMEToolkit.emulator import Emulator
from MAMEToolkit.emulator import Address

roms_path = "roms/"  # Replace this with the path to your ROMs
game_id = "sfiii3n"
memory_addresses = {
    "fighting": Address('0x0200EE44', 'u8'),
    "winsP1": Address('0x02011383', 'u8'),
    "winsP2": Address('0x02011385', 'u8'),
    "healthP1": Address('0x02068D0B', 's8'),
    "healthP2": Address('0x020691A3', 's8')
}

emulator = Emulator("env1", roms_path, game_id, memory_addresses)

This will immediately start the emulation and halt it when the toolkit has linked to the emulator process.
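
The type codes passed to Address are not explained in this README. The sketch below assumes that 'u8' means an unsigned 8-bit value and 's8' a signed 8-bit value, and shows why the choice matters for a quantity such as health that may drop below zero. It runs without the toolkit; decode_byte is a hypothetical helper written for illustration only.

```python
import struct


def decode_byte(raw: int, type_code: str) -> int:
    """Interpret one raw byte (0..255) as 'u8' (unsigned) or 's8' (signed).

    Assumption: these codes mean unsigned/signed 8-bit integers.
    """
    fmt = {"u8": "B", "s8": "b"}[type_code]
    return struct.unpack(fmt, bytes([raw]))[0]


# The same byte in memory yields different integers depending on the type code.
print(decode_byte(0xFF, "u8"))  # 255
print(decode_byte(0xFF, "s8"))  # -1
```

If a tracked value looks implausibly large, reading it with the signed variant is one thing to try.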

Stepping the emulator
Once the toolkit is linked, you can step the emulator along using the step function:

data = emulator.step([])

frame = data["frame"]
is_fighting = data["fighting"]
player1_wins = data["winsP1"]
player2_wins = data["winsP2"]
player1_health = data["healthP1"]
player2_health = data["healthP2"]

The step function returns the frame data as a NumPy matrix, along with all of the memory address integer values from that timestep.
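
As a rough illustration of how the returned memory values might feed a learning signal, the sketch below derives a per-step reward from two consecutive step results. The reward definition (damage dealt minus damage taken, with the agent as player 1) is an assumption made for this example, not necessarily what the bundled environment uses, and health_reward is a hypothetical helper. Plain dictionaries stand in for the data returned by emulator.step.

```python
def health_reward(prev, curr):
    """Reward for one step: damage dealt to player 2 minus damage taken by player 1.

    `prev` and `curr` are dicts shaped like the step results shown above.
    """
    damage_dealt = prev["healthP2"] - curr["healthP2"]
    damage_taken = prev["healthP1"] - curr["healthP1"]
    return damage_dealt - damage_taken


# Stand-in values for two consecutive timesteps (illustrative only).
prev = {"healthP1": 100, "healthP2": 100}
curr = {"healthP1": 90, "healthP2": 75}
print(health_reward(prev, curr))  # 15
```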

Sending inputs
To send actions to the emulator, you also need to determine which input ports and fields the game supports. For example, inserting a coin in Street Fighter requires the following code:

from MAMEToolkit.emulator import Action

insert_coin = Action(':INPUTS', 'Coin 1')
data = emulator.step([insert_coin])

To identify which ports are available, use the list_actions function:

from MAMEToolkit.emulator import list_actions

roms_path = "roms/"  # Replace this with the path to your ROMs
game_id = "sfiii3n"
print(list_actions(roms_path, game_id))

For Street Fighter, this returns a list of all the ports and fields available for sending actions to the step function:

[
    {'port': ':scsi:1:cdrom:SCSI_ID', 'field': 'SCSI ID'}, 
    {'port': ':INPUTS', 'field': 'P2 Jab Punch'}, 
    {'port': ':INPUTS', 'field': 'P1 Left'}, 
    {'port': ':INPUTS', 'field': 'P2 Fierce Punch'}, 
    {'port': ':INPUTS', 'field': 'P1 Down'}, 
    {'port': ':INPUTS', 'field': 'P2 Down'}, 
    {'port': ':INPUTS', 'field': 'P2 Roundhouse Kick'}, 
    {'port': ':INPUTS', 'field': 'P2 Strong Punch'}, 
    {'port': ':INPUTS', 'field': 'P1 Strong Punch'}, 
    {'port': ':INPUTS', 'field': '2 Players Start'}, 
    {'port': ':INPUTS', 'field': 'Coin 1'}, 
    {'port': ':INPUTS', 'field': '1 Player Start'}, 
    {'port': ':INPUTS', 'field': 'P2 Right'}, 
    {'port': ':INPUTS', 'field': 'Service 1'}, 
    {'port': ':INPUTS', 'field': 'Coin 2'}, 
    {'port': ':INPUTS', 'field': 'P1 Jab Punch'}, 
    {'port': ':INPUTS', 'field': 'P2 Up'}, 
    {'port': ':INPUTS', 'field': 'P1 Up'}, 
    {'port': ':INPUTS', 'field': 'P1 Right'}, 
    {'port': ':INPUTS', 'field': 'Service Mode'}, 
    {'port': ':INPUTS', 'field': 'P1 Fierce Punch'}, 
    {'port': ':INPUTS', 'field': 'P2 Left'}, 
    {'port': ':EXTRA', 'field': 'P2 Short Kick'}, 
    {'port': ':EXTRA', 'field': 'P2 Forward Kick'}, 
    {'port': ':EXTRA', 'field': 'P1 Forward Kick'}, 
    {'port': ':EXTRA', 'field': 'P1 Roundhouse Kick'}, 
    {'port': ':EXTRA', 'field': 'P1 Short Kick'}
]

We advise you to create an enum of all the possible actions, then send their action values to the emulator; see the example Actions Enum.
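
One possible shape for such an enum is sketched here. It is not the project's own Actions Enum: the member names and the to_emulator_actions helper are hypothetical, and only a few of the port and field pairs from the list above are included. A namedtuple stands in for the Action class so that the snippet runs without the toolkit; in real use you would pass MAMEToolkit.emulator.Action instead.

```python
from collections import namedtuple
from enum import Enum


class Actions(Enum):
    # Each value is a (port, field) pair copied from the list_actions output.
    COIN_P1 = (":INPUTS", "Coin 1")
    P1_START = (":INPUTS", "1 Player Start")
    P1_UP = (":INPUTS", "P1 Up")
    P1_DOWN = (":INPUTS", "P1 Down")
    P1_LEFT = (":INPUTS", "P1 Left")
    P1_RIGHT = (":INPUTS", "P1 Right")
    P1_JAB_PUNCH = (":INPUTS", "P1 Jab Punch")
    P1_SHORT_KICK = (":EXTRA", "P1 Short Kick")


def to_emulator_actions(members, action_cls):
    """Build one action object per enum member using the given Action class."""
    return [action_cls(*member.value) for member in members]


# Stand-in for MAMEToolkit.emulator.Action so the sketch is self-contained.
StandInAction = namedtuple("StandInAction", ["port", "field"])

jump_punch = to_emulator_actions([Actions.P1_UP, Actions.P1_JAB_PUNCH], StandInAction)
print(jump_punch)
```

The resulting list is what would be handed to emulator.step, so each agent decision can be expressed as a set of enum members rather than raw port and field strings.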

There is also the problem of moving the game through non-learnable screens such as the title screen and character select. To see how this can be implemented, please look at the provided Steps script and the Example Street Fighter III Third Strike: Fight for the Future Environment Implementation.
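
The general idea can be sketched as a loop that keeps stepping the emulator until a tracked memory value signals that gameplay has begun. This is a simplified illustration and not the provided Steps script: it assumes the "fighting" value is non-zero once a fight is in progress, step_until_fighting is a hypothetical helper, and FakeEmulator is a stand-in so the snippet runs without MAME. A real game usually needs timed inputs on each intermediate screen.

```python
def step_until_fighting(emulator, start_actions, max_steps=1000):
    """Send `start_actions` once, then step with no input until data["fighting"] is truthy.

    Returns the step data of the first fighting frame, or None if max_steps elapse.
    """
    data = emulator.step(start_actions)
    for _ in range(max_steps):
        if data["fighting"]:
            return data
        data = emulator.step([])
    return None


class FakeEmulator:
    """Hypothetical stand-in that reports fighting after a fixed number of steps."""

    def __init__(self, steps_until_fight):
        self.remaining = steps_until_fight

    def step(self, actions):
        self.remaining -= 1
        return {"frame": None, "fighting": int(self.remaining <= 0)}


result = step_until_fighting(FakeEmulator(5), start_actions=["coin"])
print(result["fighting"])  # 1
```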

The Emulator class also has a frame_ratio argument, which can be used to adjust the frame rate seen by your algorithm. By default, MAME generates frames at 60 frames per second, which may be too many for your algorithm. The toolkit uses a default frame_ratio of 3, meaning that 1 in every 3 frames is sent through the toolkit, which reduces the frame rate to 20 frames per second. Using a higher frame_ratio also increases the performance of the toolkit.

from MAMEToolkit.emulator import Emulator

# roms_path, game_id and memory_addresses are defined as in the emulator example above
emulator = Emulator("env1", roms_path, game_id, memory_addresses, frame_ratio=3)
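
The relationship between frame_ratio and the frame rate seen by the algorithm is simple division. The helper below is a hypothetical illustration of that arithmetic, assuming the 60 frames per second default stated above; it is not part of the toolkit.

```python
MAME_FPS = 60  # default MAME frame rate stated above


def effective_fps(frame_ratio, native_fps=MAME_FPS):
    """Frames per second seen by the algorithm when 1 in `frame_ratio` frames is forwarded."""
    if frame_ratio < 1:
        raise ValueError("frame_ratio must be at least 1")
    return native_fps / frame_ratio


for ratio in (1, 2, 3, 6):
    print(ratio, effective_fps(ratio))
# 1 60.0
# 2 30.0
# 3 20.0
# 6 10.0
```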

Library Performance Benchmarks with PC Specs

The toolkit was developed and tested on an 8-core AMD FX-8300 3.3GHz CPU with a 3GB GeForce GTX 1060 GPU. With a single random agent, the Street Fighter environment can be run at 600%+ of normal gameplay speed. For hogwild training with 8 random agents, the environment can be run at 300%+ of normal gameplay speed.

Simple ConvNet Agent

To ensure that the toolkit is able to train algorithms, a simple 5-layer ConvNet was set up with minimal tuning. The algorithm was able to successfully learn some simple mechanics of Street Fighter, such as combos and blocking. In Street Fighter, the player fights different opponents across 10 stages of increasing difficulty. Initially, the algorithm would reach stage 2 on average, but after 2200 episodes of training it could reach stage 5 on average. Learning progress was tracked using the net damage done versus damage taken over a single playthrough for each episode.
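
A per-episode metric of this kind could be computed from the health values recorded at each step. The sketch below is one assumed way of doing so, not the code used for the experiment: net_damage is a hypothetical helper, the trace values are illustrative, and only decreases in health are counted so that health resets between rounds do not register as negative damage.

```python
def net_damage(health_trace):
    """Net damage over one playthrough: damage dealt minus damage taken.

    `health_trace` is a list of (healthP1, healthP2) pairs, one per step.
    Only decreases count as damage, so health resets between rounds are ignored.
    """
    dealt = taken = 0
    for (p1_prev, p2_prev), (p1_curr, p2_curr) in zip(health_trace, health_trace[1:]):
        taken += max(p1_prev - p1_curr, 0)
        dealt += max(p2_prev - p2_curr, 0)
    return dealt - taken


# Illustrative trace: player 1 loses 10, player 2 loses 30, a reset, then player 2 loses 40.
trace = [(100, 100), (90, 100), (90, 70), (100, 100), (100, 60)]
print(net_damage(trace))  # 60
```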

MAME Changes

The library works by acting as a wrapper around a modified MAME implementation. The following changes were made:

  • Updated the Lua console to allow for the retrieval of the format of frame data
  • Updated the Lua console to allow for the retrieval of the current frame's data
  • Disabled game start warnings

The following files are affected:

  • src/emu/machine.cpp
  • src/emu/video.cpp
  • src/emu/video.h
  • src/frontend/mame/luaengine.cpp
  • src/frontend/mame/ui/ui.cpp
  • src/osd/sdl/window.cpp

The modified MAME implementation can be found at [https://github.com/M-J-Murray/mame]
