
Commit

Merge pull request #39 from beyretb/submission
v 1.0.3 - Submission merge
beyretb authored Jul 8, 2019
2 parents 86d5214 + 2d62178 commit 8f20b38
Showing 32 changed files with 473 additions and 171 deletions.
4 changes: 3 additions & 1 deletion .gitignore
@@ -1,5 +1,7 @@
env/*
!env/README.md
examples/submission/test_submission/env/*
!examples/submission/test_submission/env/README.md
models/
summaries/
/.idea
@@ -11,4 +13,4 @@ venv/
build/
dist/
*.egg-info*
*.eggs*
*.eggs*
54 changes: 33 additions & 21 deletions README.md
@@ -4,7 +4,7 @@
<img height="300" src="documentation/PrefabsPictures/steampunkFOURcrop.png">
</p>

**July 1st - November 1st:** Entries will be available on EvalAI from July 8th (please be patient while we make sure everything is running smoothly).
**July 1st - November 1st**

The Animal-AI Olympics is an AI competition with tests inspired by animal cognition. Participants are given a small environment with just seven different classes of objects that can be placed inside. In each test, the agent needs to retrieve the food in the environment, but to do so there are obstacles to overcome, ramps to climb, boxes to push, and areas that must be avoided. The real challenge is that we don't provide the tests in advance. It's up to you to explore the possibilities with the environment and build interesting configurations that can help create an agent that understands how the environment's physics work and the affordances that it has. The final submission should be an agent capable of robust food retrieval behaviour similar to that of many kinds of animals. We know the animals can pass these tests, it's time to see if AI can too.

@@ -20,22 +20,29 @@ The Animal-AI Olympics is an AI competition with tests inspired by animal cognit

See [competition launch page](https://mdcrosby.com/blog/animalailaunch.html) and official rules for further details.

**Important** Please check the competition rules [here](http://animalaiolympics.com/rules.html). Entry to the competition (via EvalAI) constitutes agreement with all competition rules.
**Important** Please check the competition rules [here](http://animalaiolympics.com/rules.html). **To submit to the competition and be considered for prizes you must also fill in [this form](https://forms.gle/PKCgp2JAWvjf4c9i6)**. Entry to the competition (via EvalAI) constitutes agreement with all competition rules.

## Overview

Here you will find all the code needed to compete in this new challenge. This repo contains **the training environment** (v1.0) that will be used for the competition. Please check back during the competition for minor bug-fixes and updates, but as of v1.0 the major features and contents are set in place. **Information for entering** will be added by July 8th when the submission will be available via the EvalAI website for the competition.
Here you will find all the code needed to compete in this new challenge. This repo contains **the training environment** (v1.0) that will be used for the competition. Information for entering can be found in the [submission documentation](documentation/submission.md). Please check back during the competition for minor bug-fixes and updates, but as of v1.0 the major features and contents are set in place.

For more information on the competition itself and to stay updated with any developments, head to the
[Competition Website](http://www.animalaiolympics.com/) and follow [@MacroPhilosophy](https://twitter.com/MacroPhilosophy)
and [@BenBeyret](https://twitter.com/BenBeyret) on twitter.
[Competition Website](http://www.animalaiolympics.com/) and follow [@MacroPhilosophy](https://twitter.com/MacroPhilosophy) and [@BenBeyret](https://twitter.com/BenBeyret) on twitter.

The environment contains an agent enclosed in a fixed sized arena. Objects can spawn in this arena, including positive
and negative rewards (green, yellow and red spheres) that the agent must obtain (or avoid). All of the hidden tests that will appear in the competition are made using the objects in the training environment. We have provided some sample environment configurations that should be useful for training, but part of the challenge will be experimenting and designing new configurations.
and negative rewards (green, yellow and red spheres) that the agent must obtain (or avoid). All of the hidden tests that will appear in the competition are made using the objects in the training environment. We have provided some sample environment configurations that should be useful for training (see examples/configs), but part of the challenge is to experiment and design new configurations.

To get started install the requirements below, and then follow the [Quick Start Guide](documentation/quickstart.md).
More in depth documentation can be found on the [Documentation Page](documentation/README.md).

## Evaluation

The competition has 300 tests, split over ten categories. The categories range from the very simple (e.g. **food retrieval**, **preferences**, and **basic obstacles**) to the more complex (e.g. **spatial reasoning**, **internal models**, **object permanence**, and **causal reasoning**). We have included example config files for the first seven categories. Note that the example config files are just simple examples to be used as a guide. An agent that solves even all of these perfectly may still not be able to solve all the tests in the category, but it would be off to a good start.

The submission website allows you to submit an agent that will be run on all 300 tests; it returns the overall score (number of tests passed) and the score per category. We cannot offer infinite compute, so instances will be timed out after ~90 minutes, and only tests completed up to that point will be counted (all others will be considered failed). See the [submission documentation](documentation/submission.md) for more information.

For the mid-way and final evaluation we will (resources permitting) run more extensive testing with 3 variations per test (so 900 tests total). The variations will include minor perturbations to the configurations. The agent will have to pass all 3 variations to pass each individual test, giving a total score out of 300. This means that **your final test score might be lower than the score achieved during the competition** and that **the competition leaderboard on EvalAI may not exactly match the final results**.
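The pass-all-variations scoring rule can be sketched in a few lines (a hypothetical illustration only; the actual evaluation harness is not public):

```python
# Hypothetical sketch of the final-evaluation scoring rule described above:
# each of the 300 tests has 3 variations, and a test only counts as passed
# when all 3 of its variations are passed.

def final_score(variation_results):
    """variation_results: iterable of 3-tuples of booleans, one tuple per test."""
    return sum(all(variations) for variations in variation_results)

# An agent that passes only 2/3 variations of a test gets no credit for it.
print(final_score([(True, True, True), (True, True, False)]))  # 1
```

This is why a leaderboard score out of 300 during the competition can exceed the final score out of 300 computed over 900 variation runs.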

## Development Blog

You can read the launch post - with information about prizes and the categories in the competition here:
@@ -53,12 +60,14 @@ well as part of the development process.

## Requirements

The Animal-AI package works on Linux, Mac and Windows, as well as most Cloud providers.
The Animal-AI package works on Linux, Mac and Windows, as well as most Cloud providers. Note that for submission to the competition we only support linux-based Docker files.
<!--, for cloud engines check out [this cloud documentation](documentation/cloud.md).-->

First of all you will need `python3.6` installed (we currently only support **python3.6**). We recommend using a virtual environment specifically for the competition. Clone this repository to run the examples we provide you with. We offer two packages for this competition:
We recommend using a virtual environment specifically for the competition. You will need `python3.6` installed (we currently only support **python3.6**). Clone this repository to run the examples we provide.

We offer two packages for this competition:

- The main one is an API for interfacing with the Unity environment. It contains both a
- The main package is an API for interfacing with the Unity environment. It contains both a
[gym environment](https://github.com/openai/gym) as well as an extension of Unity's
[ml-agents environments](https://github.com/Unity-Technologies/ml-agents/tree/master/ml-agents-envs). You can install it
via pip:
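The install command itself is collapsed in this diff view; based on the package name declared in `animalai/setup.py` later in this commit, it would presumably be:

```shell
# Package name taken from animalai/setup.py in this same commit (version 1.0.3).
pip install animalai
```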
@@ -105,19 +114,9 @@ mode. Here you can control the agent with the following:
| C | switch camera |
| R | reset environment |

## Competition Tests

We will be releasing further details about the tests in the competition over the coming weeks. The tests will be split
into multiple categories from the very simple (e.g. **food retrieval**, **preferences**, and **basic obstacles**) to
the more complex (e.g. **working memory**, **spatial memory**, **object permanence**, and **object manipulation**). For
now we have included multiple example config files that each relate to a different category. As we release further
details we will also specify the rules for the type of tests that can appear in each category. Note that the example
config files are just simple examples to be used as a guide. An agent that solves even all of these perfectly may still
not be able to solve all the tests in the categories but it would be off to a very good start.

## Citing

For now please cite the [Nature: Machine Intelligence piece](https://rdcu.be/bBCQt) for any work involving the competition environment. Official Animal-AI Papers to follow:
**Official Animal-AI Papers Coming Soon**. In the meantime please cite the [Nature: Machine Intelligence piece](https://rdcu.be/bBCQt) for any work involving the competition environment.

Crosby, M., Beyret, B., Halina M. [The Animal-AI Olympics](https://www.nature.com/articles/s42256-019-0050-3) Nature
Machine Intelligence 1 (5) p257 2019.
@@ -134,6 +133,12 @@ possibility to change the configuration of arenas between episodes. The document
Juliani, A., Berges, V., Vckay, E., Gao, Y., Henry, H., Mattar, M., Lange, D. (2018). [Unity: A General Platform for
Intelligent Agents.](https://arxiv.org/abs/1809.02627) *arXiv preprint arXiv:1809.02627*

## EvalAI

The competition is kindly hosted on [EvalAI](https://github.com/Cloud-CV/EvalAI), an open source web application for AI competitions. Special thanks to Rishabh Jain for his help in setting this up.

Deshraj Yadav, Rishabh Jain, Harsh Agrawal, Prithvijit Chattopadhyay, Taranjeet Singh, Akash Jain, Shiv Baran Singh, Stefan Lee and Dhruv Batra (2019) [EvalAI: Towards Better Evaluation Systems for AI Agents](https://arxiv.org/abs/1902.03570)

## Known Issues

In play mode pressing `R` or `C` does nothing sometimes. This is due to the fact that we have synchronized these
@@ -154,9 +159,16 @@ v0.6.1)

## Version History

- v1.0.3
- Adds inference mode to Gym environment
- Adds seed to Gym Environment
- Submission example folder containing a trained agent
- Provide submission details for the competition
- Documentation for training on AWS

- v1.0.2
- Adds custom resolution for docker training as well
- fix version checker
- Fix version checker

- v1.0.0
- Adds custom resolution to both Unity and Gym environments
29 changes: 16 additions & 13 deletions agent.py
@@ -1,11 +1,9 @@
from animalai.envs.brain import BrainInfo


class Agent(object):

def __init__(self):
"""
Load your agent here and initialize anything needed
WARNING: any path to files you wish to access on the docker should be ABSOLUTE PATHS
"""
pass

@@ -16,16 +14,21 @@ def reset(self, t=250):
:param t the number of timesteps in the episode
"""

def step(self, brain_info: BrainInfo) -> list[float]:
def step(self, obs, reward, done, info):
"""
A single step the agent should take based on the current
:param brain_info: a single BrainInfo containing the observations and reward for a single step for one agent
:return: a list of actions to execute (of size 2)
A single step the agent should take based on the current state of the environment
We will run the Gym environment (AnimalAIEnv) and pass the arguments returned by env.step() to
the agent.
Note that if you prefer using the BrainInfo object that is usually returned by the Unity
environment, it can be accessed from info['brain_info'].
:param obs: agent's observation of the current environment
:param reward: amount of reward returned after previous action
:param done: whether the episode has ended.
:param info: contains auxiliary diagnostic information, including BrainInfo.
:return: the action to take, a list of size 2
"""
action = [0, 0]

self.action = [0, 0]

return self.action

def destroy(self):
pass
return action
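As a quick sanity check, the new Gym-style interface above can be exercised with a self-contained stub. The rollout loop below is hypothetical (in the real evaluation, `obs`, `reward`, `done`, and `info` come from `env.step()` of the Gym environment); only the `step()` signature mirrors the template above:

```python
# Self-contained stub exercising the new Gym-style Agent interface.
# The fake rollout loop is for illustration only.

class ForwardAgent:
    def reset(self, t=250):
        """Called at the start of each episode; t is the episode length."""
        pass

    def step(self, obs, reward, done, info):
        # The action is a list of two values, matching the template's
        # "list of size 2" return contract.
        return [1, 0]

agent = ForwardAgent()
agent.reset(t=250)
obs, reward, done, info = None, 0.0, False, {}
for _ in range(3):
    action = agent.step(obs, reward, done, info)
    assert isinstance(action, list) and len(action) == 2
print("interface OK")
```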
5 changes: 5 additions & 0 deletions animalai/animalai/envs/gym/environment.py
@@ -30,9 +30,11 @@ def __init__(self,
worker_id=0,
docker_training=False,
n_arenas=1,
seed=0,
arenas_configurations=None,
greyscale=False,
retro=True,
inference=False,
resolution=None):
"""
Environment initialization
@@ -48,12 +50,15 @@
"""
self._env = UnityEnvironment(file_name=environment_filename,
worker_id=worker_id,
seed=seed,
docker_training=docker_training,
n_arenas=n_arenas,
arenas_configurations=arenas_configurations,
inference=inference,
resolution=resolution)
# self.name = self._env.academy_name
self.vector_obs = None
self.inference = inference
self.resolution = resolution
self._current_state = None
self._n_agents = None
2 changes: 1 addition & 1 deletion animalai/setup.py
@@ -2,7 +2,7 @@

setup(
name='animalai',
version='1.0.2',
version='1.0.3',
description='Animal AI competition interface',
url='https://github.com/beyretb/AnimalAI-Olympics',
author='Benjamin Beyret',
Binary file added documentation/AWS/EC2.png
Binary file added documentation/AWS/launch.png
Binary file added documentation/AWS/limits.png
Binary file added documentation/AWS/marketplace.png
Binary file added documentation/AWS/p2.png
2 changes: 2 additions & 0 deletions documentation/README.md
@@ -6,6 +6,8 @@ You can find here the following documentation:
- [How to design configuration files](configFile.md)
- [How training works](training.md)
- [All the objects you can include in the arenas as well as their specifications](definitionsOfObjects.md)
- [How to submit your agent](submission.md)
- [A guide to train on AWS](cloudTraining.md)


More will come before the competition launches.
98 changes: 98 additions & 0 deletions documentation/cloudTraining.md
@@ -0,0 +1,98 @@
# Training on AWS

Training an agent requires rendering the environment on a screen, which means that you may have to follow a few steps (detailed below) before you can use standard cloud compute instances. We detail two possibilities. Both methods were tested on [AWS p2.xlarge](https://aws.amazon.com/ec2/instance-types/p2/) using a standard [Deep Learning Base AMI](https://aws.amazon.com/marketplace/pp/B077GCZ4GR).

We leave participants the task of adapting the information found here to different cloud providers and/or instance types or for their specific use-case. We do not have the resources to fully support this capability. We are providing the following purely in the hopes it serves as a useful guide for some.

**WARNING: using cloud services will incur costs, carefully read your provider terms of service**

## Pre-requisite: setup an AWS p2.xlarge instance

Start by creating an account on [AWS](https://aws.amazon.com/), and then open the [console](https://console.aws.amazon.com/console/home?).
Compute engines on AWS are called `EC2` and offer a vast range of configurations in terms of number and type of CPUs, GPUs,
memory and storage. You can find more details about the different types and prices [here](https://aws.amazon.com/ec2/pricing/on-demand/).
In our case, we will use a `p2.xlarge instance`, in the console select `EC2`:

![EC2](AWS/EC2.png)

By default you will have a limit on the number of instances you can create. Check your limits by selecting `Limits` in the top
left menu:

![EC2](AWS/limits.png)

Request an increase for `p2.xlarge` if needed. Once you have at least a limit of 1, go back to the EC2 console and select launch instance:

![EC2](AWS/launch.png)

You can then select from various images. Type in `Deep learning` to see what is on offer; for now, we recommend selecting `AWS Marketplace` on the left panel:

![EC2](AWS/marketplace.png)

and select `Deep Learning Base AMI (Ubuntu)` for a basic Ubuntu install with CUDA capabilities. On the next page select `p2.xlarge` (this will not be selected by default):

![EC2](AWS/p2.png)

Click `Next` twice (first `Next: Configure Instance Details`, then `Next: Add Storage`) and add at least 15 GB of storage to the current size (so at least 65 GB total, with a default of 50). Click `Review and Launch`, then `Launch`. You will then be asked to create or select existing key pairs, which will be used to SSH into your instance.

Once your instance has started, it will appear in the EC2 console. To SSH into your instance, right-click its line, select `Connect`, and follow the instructions.
We can now configure our instance for training. **Don't forget to shut down your instance once you're done using it, as you are charged for as long as it runs**.

## Simulating a screen

As cloud engines do not have screens attached, rendering the environment window is impossible. We use a virtual screen instead, in the form of [xvfb](https://en.wikipedia.org/wiki/Xvfb).
You can follow either of the two methods below. In both cases, **remember** to set `docker_training=True` in your environment configuration.


## Method 1: train using docker

Basic Deep Learning Ubuntu images provide [NVIDIA docker](https://devblogs.nvidia.com/nvidia-docker-gpu-server-application-deployment-made-easy/)
pre-installed, which allows the use of CUDA within a container. SSH into your AWS instance, clone this repo and follow the instructions below.

In the [submission guide](submission.md) we describe how to build a docker container for submission. The same process
can be used to create a docker for training an agent. The [dockerfile provided](../examples/submission/Dockerfile) can
be adapted to include all the libraries and code needed for training.

For example, should you wish to train the standard Dopamine agent provided in `animalai-train` out of the box using GPU compute, add the following
lines to your Dockerfile in the `YOUR COMMANDS GO HERE` section, below the line installing `animalai-train`:

```
RUN git clone https://github.com/beyretb/AnimalAI-Olympics.git
RUN pip uninstall --yes tensorflow
RUN pip install tensorflow-gpu==1.12.2
RUN apt-get update && apt-get install -y unzip wget
RUN wget https://www.doc.ic.ac.uk/~bb1010/animalAI/env_linux_v1.0.0.zip
RUN mv env_linux_v1.0.0.zip AnimalAI-Olympics/env/
RUN unzip AnimalAI-Olympics/env/env_linux_v1.0.0.zip -d AnimalAI-Olympics/env/
WORKDIR /aaio/AnimalAI-Olympics/examples
RUN sed -i 's/docker_training=False/docker_training=True/g' trainDopamine.py
```

Build your docker, from the `examples/submission` folder run:

```
docker build --tag=test-training .
```

Once built, you can start training straight away by running:

```
docker run --runtime=nvidia test-training python trainDopamine.py
```

Notice the use of `--runtime=nvidia`, which enables CUDA inside the container. You should see the following TensorFlow line in the output,
confirming that you are training on the GPU:

```
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties:
name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.823
```

You're now ready to start training on AWS using docker!

## Method 2: install xvfb on the instance

An alternative to docker is to install `xvfb` directly on your AWS instance and use it in the same way you would when training on your home computer. For this you will want to install an Ubuntu image with some deep learning libraries installed. From the AWS Marketplace page you can for example install `Deep Learning AMI (Ubuntu)` which contains tensorflow and pytorch.

To do so, you can follow the original ML Agents description for `p2.xlarge` found [here](https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Training-on-Amazon-Web-Service.md#setting-up-x-server-optional). From our
experience, these steps do not work as well on other types of instances.
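Once xvfb is set up on the instance, a training run might be launched behind the virtual display roughly like this (the script name and screen settings are illustrative assumptions; adapt them to your setup):

```shell
# Install xvfb on the instance (Ubuntu), then run a training script behind
# a virtual display so the environment has a screen to render to.
sudo apt-get update && sudo apt-get install -y xvfb
xvfb-run --auto-servernum --server-args='-screen 0 640x480x24' \
    python trainDopamine.py
```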
