Reinforcement-learning

This is a PyTorch implementation of Deep Q-Learning.

Example (1) - A result from Atari Breakout.

Experience memory capacity: 80000 sets
Random action ratio: decreased from 1.0 to 0.1 over 1000000 frames
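The random action ratio above is the epsilon of an epsilon-greedy policy. The sketch below shows one way such a schedule could look, assuming the decrease is linear in the frame count and that epsilon stays at its final value afterwards. The function names `linear_epsilon` and `select_action` are hypothetical and are not taken from this repository.

```python
import random


def linear_epsilon(step, start=1.0, end=0.1, anneal_steps=1_000_000):
    """Interpolate epsilon linearly from `start` to `end`, then hold at `end`.

    Assumption: the schedule is linear; the README only gives the endpoints.
    """
    fraction = min(max(step, 0) / anneal_steps, 1.0)
    return start + fraction * (end - start)


def select_action(q_values, epsilon, rng=random):
    """Epsilon-greedy choice over a list of per-action Q-values."""
    if rng.random() < epsilon:
        # Explore: pick any action uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: pick the action with the highest estimated value.
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

With these defaults, `linear_epsilon(500_000)` is roughly 0.55, so about half of the actions at that point would still be random.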

54_8520.mp4

Blue line: Score of training episode.
Magenta line: Mean score of last 100 training episodes.
Red star: Score of target agent.
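The magenta line can be reproduced from the list of per-episode scores. This is a minimal sketch, assuming that episodes before the 100th are averaged over however many scores exist so far; the helper name `running_mean` is hypothetical.

```python
from collections import deque


def running_mean(scores, window=100):
    """Return, for each episode, the mean of the last `window` scores so far."""
    recent = deque(maxlen=window)  # old scores drop out automatically
    means = []
    for score in scores:
        recent.append(score)
        means.append(sum(recent) / len(recent))
    return means
```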

Experience memory capacity: 10000 sets
Random action ratio: decreased from 0.9 to 0.05 over 200 episodes
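The experience memory capacity is the maximum number of stored experiences; once it is full, the oldest entries are discarded. The sketch below illustrates that behaviour, assuming each "set" is one (state, action, reward, next state, done) transition and that minibatches are sampled uniformly. `Transition` and `ReplayMemory` are illustrative names, not necessarily those used in this repository.

```python
import random
from collections import deque, namedtuple

# Assumed layout of one stored experience.
Transition = namedtuple(
    "Transition", ["state", "action", "reward", "next_state", "done"]
)


class ReplayMemory:
    """Fixed-capacity buffer that overwrites the oldest transitions first."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append(Transition(state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement from the stored transitions.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)
```

A memory for the setting above would be created with `ReplayMemory(10000)`.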

221_172.mp4

Blue line: Score of training episode.
Magenta line: Mean score of last 100 training episodes.