Reproduce GA3C with PARL

Based on PARL, we reproduce the GA3C deep reinforcement learning algorithm, matching the performance reported in the paper on Atari benchmarks.

Original paper: GA3C: GPU-based A3C for Deep Reinforcement Learning

A hybrid CPU/GPU version of the Asynchronous Advantage Actor-Critic (A3C) algorithm.
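For intuition, the core idea of GA3C is to funnel the work of many CPU simulators through batched GPU calls: observations are pushed onto a shared prediction queue and drained into a single batched forward pass, while experiences are pushed onto a training queue and drained into batched gradient updates. The following is a minimal sketch of that data flow, not the PARL implementation; model.predict and model.learn are hypothetical placeholders for the batched GPU calls.

import queue

# Shared queues between simulator processes and the GPU-side threads.
prediction_queue = queue.Queue()   # items: (simulator_id, observation)
training_queue = queue.Queue()     # items: (state, action, return)

BATCH_SIZE = 32  # illustrative batch size

def predictor_loop(model, result_queues):
    # Drain pending observations into one batch, run a single GPU
    # forward pass, then route each result back to its simulator.
    while True:
        sim_id, ob = prediction_queue.get()  # block for the first item
        ids, obs = [sim_id], [ob]
        while len(obs) < BATCH_SIZE and not prediction_queue.empty():
            sim_id, ob = prediction_queue.get_nowait()
            ids.append(sim_id)
            obs.append(ob)
        policies, values = model.predict(obs)  # hypothetical batched call
        for sim_id, p, v in zip(ids, policies, values):
            result_queues[sim_id].put((p, v))

def trainer_loop(model):
    # Accumulate a batch of experiences and apply one GPU update.
    while True:
        batch = [training_queue.get() for _ in range(BATCH_SIZE)]
        model.learn(batch)  # hypothetical update call

Batching is what lets a single GPU keep up with dozens of CPU simulators, which is the "hybrid CPU/GPU" design the paper describes.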

Atari games introduction

Please see here to learn more about Atari games.
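As a quick illustration (assuming gym with Atari support is installed; the game name and the classic 4-tuple step API are only an example of that era's interface), an Atari environment can be created and stepped like this:

import gym

# Build an Atari environment; the game name is only an example.
env = gym.make('PongNoFrameskip-v4')
obs = env.reset()
done, episode_reward = False, 0.0
while not done:
    action = env.action_space.sample()          # random policy
    obs, reward, done, info = env.step(action)  # classic gym API
    episode_reward += reward
print('episode reward:', episode_reward)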

Benchmark result

Results with one learner (on a P40 GPU) and 24 simulators (on 12 CPUs) over 10 million sample steps.

(Learning curves: GA3C_Pong, GA3C_Breakout, GA3C_BeamRider, GA3C_Qbert, GA3C_SpaceInvaders.)

How to use

Dependencies
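As an assumption (the exact packages and pinned versions belong to the original repository, not this text), a typical environment for PARL Atari examples can be set up along these lines:

pip install paddlepaddle parl gym atari-py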

Distributed Training

First, we can start a local cluster with 24 CPUs:

xparl start --port 8010 --cpu_num 24

Note that if you have already started a master, you do not need to run the above command again. For more information about the cluster, please refer to our documentation.

Then we can start the distributed training by running:

python train.py
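Under the hood, PARL's distributed mode relies on decorating a class with parl.remote_class: once parl.connect() attaches to the cluster started above, each instance of the class runs as an actor in a remote CPU worker. A minimal sketch follows; the Simulator class and the environment name are illustrative, not the actual train.py code.

import parl

@parl.remote_class
class Simulator(object):
    # Runs in a remote CPU worker of the xparl cluster; method calls
    # made by the learner are dispatched over the network.
    def __init__(self, env_name):
        import gym
        self.env = gym.make(env_name)

    def reset(self):
        return self.env.reset()

    def step(self, action):
        return self.env.step(action)

# Connect to the cluster started with `xparl start --port 8010`.
parl.connect('localhost:8010')
simulators = [Simulator('PongNoFrameskip-v4') for _ in range(24)]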

[Tips] Performance can degrade dramatically in a slower computational environment, especially when training with low-speed CPUs. This is likely caused by the policy-lag problem: samples are generated by a policy that is several updates older than the one being trained on them.

Reference

Babaeizadeh, M., Frosio, I., Tyree, S., Clemons, J., Kautz, J. GA3C: GPU-based A3C for Deep Reinforcement Learning.