This project responds to the requirements of the Udacity Deep Reinforcement Learning Project 1: Navigation.
The banana environment is set up for the student and looks like the image above when rendered.
A reward of +1 is provided for collecting a yellow banana, and a reward of -1 is provided for collecting a blue banana. Thus, the goal of your agent is to collect as many yellow bananas as possible while avoiding blue bananas.
The state space has 37 dimensions and contains the agent's velocity, along with ray-based perception of objects around the agent's forward direction. Given this information, the agent has to learn how to best select actions. Four discrete actions are available, corresponding to:

- `0` - move forward.
- `1` - move backward.
- `2` - turn left.
- `3` - turn right.
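For reference, the snippet below sketches how the environment is typically instantiated and stepped with the `unityagents` package installed in the setup steps below. The `file_name` path is an assumption and depends on where you placed the downloaded Banana build.

```python
from unityagents import UnityEnvironment
import numpy as np

# The file_name path is an assumption -- point it at your downloaded Banana build.
env = UnityEnvironment(file_name="Banana_Linux/Banana.x86_64")

# The Banana app exposes a single "brain" that the agent controls.
brain_name = env.brain_names[0]
brain = env.brains[brain_name]

# Reset and inspect the 37-dimensional state and the 4-action space.
env_info = env.reset(train_mode=True)[brain_name]
state = env_info.vector_observations[0]
print("State size:", len(state))                        # 37
print("Action size:", brain.vector_action_space_size)   # 4

# Take one random action and observe reward / next state / episode-done flag.
action = np.random.randint(brain.vector_action_space_size)
env_info = env.step(action)[brain_name]
next_state = env_info.vector_observations[0]
reward = env_info.rewards[0]
done = env_info.local_done[0]

env.close()
```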
The task is episodic, and in order to solve the environment, your agent must get an average score of +13 over 100 consecutive episodes.
- Create (and activate) a new environment with Python 3.6.

  - Linux or Mac:

        conda create --name drlnd python=3.6
        source activate drlnd

  - Windows:

        conda create --name drlnd python=3.6
        activate drlnd
- Follow the instructions in the [OpenAI gym repository](https://github.com/openai/gym) to perform a minimal install of OpenAI gym.

- Clone the repository (if you haven't already!), and navigate to the `python/` folder. Then, install several dependencies.

      git clone https://github.com/udacity/deep-reinforcement-learning.git
      cd deep-reinforcement-learning/python
      pip install .
- Create an IPython kernel for the `drlnd` environment.

      python -m ipykernel install --user --name drlnd --display-name "drlnd"
- Before running code in a notebook, change the kernel to match the `drlnd` environment by using the drop-down `Kernel` menu.

- Refer to `Navigation.ipynb`.
If you are running on a machine without a display, you will need to set the `no_graphics=True` flag when the environment is instantiated, but this will make the output pretty boring.
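For example (the `file_name` path is an assumption about where you placed the Banana build):

```python
from unityagents import UnityEnvironment

# no_graphics=True skips rendering, which is needed on a headless machine.
# The file_name path is an assumption -- adjust it to your Banana build.
env = UnityEnvironment(file_name="Banana_Linux/Banana.x86_64", no_graphics=True)
```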
If you'd like to see the untrained agent blunder around the environment, un-comment the code under section 3.
The first cell of code in section 4.1 is where we train the agent with hyperparameters previously determined by manually blundering around with them or by running the loop in 4.2. This setup meets the assignment criterion of a +13 reward averaged over 100 episodes.
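For orientation, that cell follows the usual DQN training pattern. Below is a minimal sketch, assuming an agent object exposing `act(state, eps)` and `step(state, action, reward, next_state, done)` methods; the method and attribute names and the `checkpoint.pth` filename are assumptions, not the notebook's exact code.

```python
from collections import deque

import numpy as np
import torch

def dqn(env, agent, brain_name, n_episodes=2000, max_t=1000,
        eps_start=1.0, eps_end=0.01, eps_decay=0.995):
    """Epsilon-greedy DQN loop that stops once the 100-episode average reaches +13."""
    scores = []                          # all episode scores, e.g. for plotting
    scores_window = deque(maxlen=100)    # most recent 100 scores
    eps = eps_start
    for i_episode in range(1, n_episodes + 1):
        env_info = env.reset(train_mode=True)[brain_name]
        state = env_info.vector_observations[0]
        score = 0
        for _ in range(max_t):
            action = agent.act(state, eps)                        # epsilon-greedy action
            env_info = env.step(action)[brain_name]
            next_state = env_info.vector_observations[0]
            reward = env_info.rewards[0]
            done = env_info.local_done[0]
            agent.step(state, action, reward, next_state, done)   # store experience and learn
            state = next_state
            score += reward
            if done:
                break
        scores.append(score)
        scores_window.append(score)
        eps = max(eps_end, eps_decay * eps)                        # decay exploration
        if len(scores_window) == 100 and np.mean(scores_window) >= 13.0:
            print(f"Solved in {i_episode} episodes; average score "
                  f"{np.mean(scores_window):.2f}")
            # Attribute name below is an assumption about the agent implementation.
            torch.save(agent.qnetwork_local.state_dict(), "checkpoint.pth")
            break
    return scores
```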
Finally, you can watch the trained agent run with the parameters from 4.1.
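A rough sketch of that step, reusing the `env`, `brain_name`, and `agent` objects from the training sketch above; the checkpoint filename, the `qnetwork_local` attribute, and the `eps` keyword are assumptions:

```python
import torch

# Filename and the qnetwork_local attribute are assumptions, not the notebook's exact names.
agent.qnetwork_local.load_state_dict(torch.load("checkpoint.pth"))

env_info = env.reset(train_mode=False)[brain_name]   # train_mode=False renders at normal speed
state = env_info.vector_observations[0]
score = 0
done = False
while not done:
    action = agent.act(state, eps=0.0)               # act greedily, no exploration
    env_info = env.step(action)[brain_name]
    state = env_info.vector_observations[0]
    score += env_info.rewards[0]
    done = env_info.local_done[0]
print("Score:", score)
```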