
Commit

merge Dev v1.1.0 - curriculum with yaml files
Dev v1.1.0
beyretb authored Sep 16, 2019
2 parents 6ebfa72 + 5fe2439 commit 2d5155d
Showing 25 changed files with 398 additions and 57 deletions.
3 changes: 3 additions & 0 deletions README.md
@@ -172,6 +172,9 @@ features with the agent's frames in order to have frames in line with the config

## Version History

- v1.1.0
  - Add curriculum learning to `animalai-train`, using YAML configuration files

- v1.0.5
- ~~Adds customisable resolution during evaluation~~ (removed, evaluation is only `84x84`)
- Update `animalai-train` to tf 1.14 to fix `gin` broken dependency
2 changes: 1 addition & 1 deletion animalai/setup.py
@@ -2,7 +2,7 @@

setup(
name='animalai',
version='1.0.5',
version='1.1.0',
description='Animal AI competition interface',
url='https://github.com/beyretb/AnimalAI-Olympics',
author='Benjamin Beyret',
Binary file added documentation/Curriculum/0.png
Binary file added documentation/Curriculum/1.png
Binary file added documentation/Curriculum/2.png
Binary file added documentation/Curriculum/3.png
Binary file added documentation/Curriculum/4.png
Binary file added documentation/Curriculum/5.png
Binary file added documentation/Curriculum/learning.png
Binary file added documentation/Curriculum/lessons.png
1 change: 1 addition & 0 deletions documentation/README.md
@@ -5,6 +5,7 @@ You can find here the following documentation:
- [The quickstart guide](quickstart.md)
- [How to design configuration files](configFile.md)
- [How training works](training.md)
- [Add a curriculum to your training using animalai-train](curriculum.md)
- [All the objects you can include in the arenas as well as their specifications](definitionsOfObjects.md)
- [How to submit your agent](submission.md)
- [A guide to train on AWS](cloudTraining.md)
95 changes: 95 additions & 0 deletions documentation/curriculum.md
@@ -0,0 +1,95 @@
# Curriculum Learning

The `animalai-train` package contains a curriculum learning feature that lets you specify a set of configuration files,
each of which constitutes a lesson in the curriculum. See the
[ml-agents documentation](https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Training-Curriculum-Learning.md)
on curriculum learning for an overview of the technique. Our implementation is adapted from the ml-agents one to use
configuration files rather than environment parameters (which don't exist in `animalai`).

## Meta Curriculum

To define a curriculum you need to provide the following:

- a set of lessons (or levels), generally of increasing difficulty, that your agent will train on in turn, from easiest to hardest
- a metric to monitor in order to decide when to switch from one lesson to the next
- the threshold values of that metric at which each switch occurs

In practice, you place these parameters in a JSON file named after the brain in the environment (`Learner.json` in
our case), and put this file in a folder together with all the configuration files you wish to use. This folder
constitutes what we call a meta-curriculum.
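
For reference, the example meta-curriculum folder discussed below is laid out as follows (a sketch of the expected structure):

```
curriculum/
    Learner.json    # meta-curriculum definition, named after the brain
    0.yaml          # lesson 0 (easiest)
    1.yaml
    2.yaml
    3.yaml
    4.yaml
    5.yaml          # lesson 5 (hardest)
```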

## Example

An example is provided in [the example folder](../examples/configs/curriculum). The idea of this curriculum is to train
an agent to navigate a maze by creating maze-like structures of perpendicular walls, starting with a single wall and a
food item, and adding one more wall at each level. Below are samples from the 6 different levels.



![](Curriculum/0.png) |![](Curriculum/1.png)|![](Curriculum/2.png)|
:--------------------:|:-------------------:|:-------------------:
![](Curriculum/3.png) |![](Curriculum/4.png)|![](Curriculum/5.png)|

To produce such a curriculum, we define the meta-curriculum in the following `json` format:

```json
{
    "measure": "reward",
    "thresholds": [
        1.5,
        1.4,
        1.3,
        1.2,
        1.1
    ],
    "min_lesson_length": 100,
    "signal_smoothing": true,
    "configuration_files": [
        "0.yaml",
        "1.yaml",
        "2.yaml",
        "3.yaml",
        "4.yaml",
        "5.yaml"
    ]
}
```

All parameters are the same as in [ml-agents](https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Training-Curriculum-Learning.md),
except for `configuration_files`. From the ml-agents documentation:

* `measure` - What to measure learning progress, and advancement in lessons, by.
    * `reward` - Uses a measure of the received reward.
    * `progress` - Uses the ratio of steps/max_steps.
* `thresholds` (float array) - Points in value of `measure` where the lesson should
  be increased.
* `min_lesson_length` (int) - The minimum number of episodes that should be
  completed before the lesson can change. If `measure` is set to `reward`, the
  average cumulative reward of the last `min_lesson_length` episodes will be
  used to determine if the lesson should change. Must be nonnegative.

  __Important__: the average reward that is compared to the thresholds is
  different from the mean reward that is logged to the console. For example,
  if `min_lesson_length` is `100`, the lesson will increment after the average
  cumulative reward of the last `100` episodes exceeds the current threshold.
  The mean reward logged to the console is dictated by the `summary_freq`
  parameter in the
  [trainer configuration file](../examples/configs/trainer_config.yaml).
* `signal_smoothing` (true/false) - Whether to weight the current progress
  measure by previous values.
    * If `true`, weighting will be 0.75 (new) 0.25 (old).

The `configuration_files` parameter is simply a list of file names defining the lessons, in the order in which they should be loaded.
Note that if you have `n` lessons, you need to define `n-1` thresholds; the example above therefore has 6 configuration files and 5 thresholds.
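
As a hedged illustration of how these fields are consumed, here is a minimal sketch using the `Curriculum` class added in this commit (the paths are assumptions based on the example folder above):

```python
from animalai_train.trainers.curriculum import Curriculum

# Learner.json above defines 6 lessons (0.yaml ... 5.yaml), hence 5 thresholds.
lesson_files = ['0.yaml', '1.yaml', '2.yaml', '3.yaml', '4.yaml', '5.yaml']
curriculum = Curriculum('examples/configs/curriculum/Learner.json', lesson_files)

# increment_lesson compares the (optionally smoothed) measure against the
# threshold of the current lesson and returns True when the lesson advances.
if curriculum.increment_lesson(measure_val=1.6):
    print('Now in lesson', curriculum.lesson_num)
```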


## Training

Once the folder is created, training works in the same way as before, except that we now pass a `MetaCurriculum` object to the
`meta_curriculum` argument of the `TrainerController`.
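
For reference, a minimal sketch of the wiring (the folder path is an assumption; the remaining `TrainerController` arguments are the same as in the standard training scripts and are omitted here):

```python
from animalai_train.trainers.meta_curriculum import MetaCurriculum

# The folder contains Learner.json plus the yaml lesson files.
meta_curriculum = MetaCurriculum('examples/configs/curriculum/')

# get_config() returns the ArenaConfig of the current lesson; the
# TrainerController receives this object via its `meta_curriculum` argument,
# passes the config to env.reset(arenas_configurations=...) and advances
# lessons through increment_lessons(...).
current_lesson_config = meta_curriculum.get_config()
```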

We provide an example using the above curriculum in [examples/trainCurriculum.py](../examples/trainCurriculum.py).
When training this agent, you can watch the lessons switch in TensorBoard:

![](Curriculum/learning.png)
![](Curriculum/lessons.png)
59 changes: 33 additions & 26 deletions examples/animalai_train/animalai_train/trainers/curriculum.py
@@ -3,17 +3,19 @@
import math

from .exception import CurriculumError
from animalai.envs.arena_config import ArenaConfig

import logging

logger = logging.getLogger('mlagents.trainers')


class Curriculum(object):
def __init__(self, location):
def __init__(self, location, yaml_files):
"""
Initializes a Curriculum object.
:param location: Path to JSON defining curriculum.
:param yaml_files: A list of configuration files for each lesson
"""
self.max_lesson_num = 0
self.measure = None
@@ -32,7 +34,7 @@ def __init__(self, location):
raise CurriculumError('There was an error decoding {}'
.format(location))
self.smoothing_value = 0
for key in ['parameters', 'measure', 'thresholds',
for key in ['configuration_files', 'measure', 'thresholds',
'min_lesson_length', 'signal_smoothing']:
if key not in self.data:
raise CurriculumError("{0} does not contain a "
@@ -43,18 +45,25 @@ def __init__(self, location):
self.min_lesson_length = self.data['min_lesson_length']
self.max_lesson_num = len(self.data['thresholds'])

parameters = self.data['parameters']
for key in parameters:
# if key not in default_reset_parameters:
# raise CurriculumError(
# 'The parameter {0} in Curriculum {1} is not present in '
# 'the Environment'.format(key, location))
if len(parameters[key]) != self.max_lesson_num + 1:
raise CurriculumError(
'The parameter {0} in Curriculum {1} must have {2} values '
'but {3} were found'.format(key, location,
self.max_lesson_num + 1,
len(parameters[key])))
configuration_files = self.data['configuration_files']
# for key in configuration_files:
# if key not in default_reset_parameters:
# raise CurriculumError(
# 'The parameter {0} in Curriculum {1} is not present in '
# 'the Environment'.format(key, location))
if len(configuration_files) != self.max_lesson_num + 1:
raise CurriculumError(
'The parameter {0} in Curriculum {1} must have {2} values '
'but {3} were found'.format('configuration_files', location,
self.max_lesson_num + 1,
len(configuration_files)))
folder = os.path.dirname(location)
folder_yaml_files = os.listdir(folder)
if not all([file in folder_yaml_files for file in configuration_files]):
raise CurriculumError(
'One or more configuration file(s) in curriculum {0} could not be found'.format(location)
)
self.configurations = [ArenaConfig(os.path.join(folder, file)) for file in yaml_files]

@property
def lesson_num(self):
@@ -79,15 +88,13 @@ def increment_lesson(self, measure_val):
if self.lesson_num < self.max_lesson_num:
if measure_val > self.data['thresholds'][self.lesson_num]:
self.lesson_num += 1
config = {}
parameters = self.data['parameters']
for key in parameters:
config[key] = parameters[key][self.lesson_num]
logger.info('{0} lesson changed. Now in lesson {1}: {2}'
# config = {}
# parameters = self.data['parameters']
# for key in parameters:
# config[key] = parameters[key][self.lesson_num]
logger.info('{0} lesson changed. Now in lesson {1}'
.format(self._brain_name,
self.lesson_num,
', '.join([str(x) + ' -> ' + str(config[x])
for x in config])))
self.lesson_num))
return True
return False

@@ -103,8 +110,8 @@ def get_config(self, lesson=None):
if lesson is None:
lesson = self.lesson_num
lesson = max(0, min(lesson, self.max_lesson_num))
config = {}
parameters = self.data['parameters']
for key in parameters:
config[key] = parameters[key][lesson]
config = self.configurations[lesson]
# parameters = self.data['parameters']
# for key in parameters:
# config[key] = parameters[key][lesson]
return config
53 changes: 29 additions & 24 deletions examples/animalai_train/animalai_train/trainers/meta_curriculum.py
@@ -20,34 +20,41 @@ def __init__(self, curriculum_folder):
Args:
curriculum_folder (str): The relative or absolute path of the
folder which holds the curriculums for this environment.
The folder should contain JSON files whose names are the
brains that the curriculums belong to.
The folder should contain one JSON file whose name is the
same as the brain in the academy (e.g. Learner) and which contains
the parameters for the curriculum, as well as all the YAML
files for each curriculum lesson.
"""
used_reset_parameters = set()
# used_reset_parameters = set()
self._brains_to_curriculums = {}
self._configuration_files = []

try:
for curriculum_filename in os.listdir(curriculum_folder):
json_files = [file for file in os.listdir(curriculum_folder) if '.json' in file.lower()]
yaml_files = [file for file in os.listdir(curriculum_folder) if
('.yaml' in file.lower() or '.yml' in file.lower())]
for curriculum_filename in json_files:
brain_name = curriculum_filename.split('.')[0]
curriculum_filepath = \
os.path.join(curriculum_folder, curriculum_filename)
curriculum = Curriculum(curriculum_filepath)
curriculum = Curriculum(curriculum_filepath, yaml_files)

# ===== TO REMOVE ??? ===========
# Check if any two curriculums use the same reset params.
if any([(parameter in curriculum.get_config().keys())
for parameter in used_reset_parameters]):
logger.warning('Two or more curriculums will '
'attempt to change the same reset '
'parameter. The result will be '
'non-deterministic.')

used_reset_parameters.update(curriculum.get_config().keys())
# if any([(parameter in curriculum.get_config().keys())
# for parameter in used_reset_parameters]):
# logger.warning('Two or more curriculums will '
# 'attempt to change the same reset '
# 'parameter. The result will be '
# 'non-deterministic.')
#
# used_reset_parameters.update(curriculum.get_config().keys())
# ===== end of to remove =========
self._brains_to_curriculums[brain_name] = curriculum
except NotADirectoryError:
raise MetaCurriculumError(curriculum_folder + ' is not a '
'directory. Refer to the ML-Agents '
'curriculum learning docs.')

'directory. Refer to the ML-Agents '
'curriculum learning docs.')

@property
def brains_to_curriculums(self):
@@ -83,7 +90,7 @@ def _lesson_ready_to_increment(self, brain_name, reward_buff_size):
increment its lesson.
"""
return reward_buff_size >= (self.brains_to_curriculums[brain_name]
.min_lesson_length)
.min_lesson_length)

def increment_lessons(self, measure_vals, reward_buff_sizes=None):
"""Attempts to increments all the lessons of all the curriculums in this
@@ -108,14 +115,13 @@ def increment_lessons(self, measure_vals, reward_buff_sizes=None):
if self._lesson_ready_to_increment(brain_name, buff_size):
measure_val = measure_vals[brain_name]
ret[brain_name] = (self.brains_to_curriculums[brain_name]
.increment_lesson(measure_val))
.increment_lesson(measure_val))
else:
for brain_name, measure_val in measure_vals.items():
ret[brain_name] = (self.brains_to_curriculums[brain_name]
.increment_lesson(measure_val))
.increment_lesson(measure_val))
return ret


def set_all_curriculums_to_lesson_num(self, lesson_num):
"""Sets all the curriculums in this meta curriculum to a specified
lesson number.
@@ -127,18 +133,17 @@ def set_all_curriculums_to_lesson_num(self, lesson_num):
for _, curriculum in self.brains_to_curriculums.items():
curriculum.lesson_num = lesson_num


def get_config(self):
"""Get the combined configuration of all curriculums in this
MetaCurriculum.
Returns:
A dict from parameter to value.
"""
config = {}
# config = {}

for _, curriculum in self.brains_to_curriculums.items():
curr_config = curriculum.get_config()
config.update(curr_config)
# config.update(curr_config)

return config
return curr_config
Original file line number Diff line number Diff line change
@@ -180,11 +180,11 @@ def _reset_env(self, env):
environment.
"""
if self.meta_curriculum is not None:
return env.reset(config=self.meta_curriculum.get_config())
return env.reset(arenas_configurations=self.meta_curriculum.get_config())
else:
if self.update_config:
return env.reset(arenas_configurations=self.config)
self.update_config = False
return env.reset(arenas_configurations=self.config)
else:
return env.reset()

2 changes: 1 addition & 1 deletion examples/animalai_train/setup.py
@@ -2,7 +2,7 @@

setup(
name='animalai_train',
version='1.0.5',
version='1.1.0',
description='Animal AI competition training library',
url='https://github.com/beyretb/AnimalAI-Olympics',
author='Benjamin Beyret',
23 changes: 23 additions & 0 deletions examples/configs/curriculum/0.yaml
@@ -0,0 +1,23 @@
!ArenaConfig
arenas:
  0: !Arena
    t: 250
    items:
    - !Item
      name: Wall
      positions:
      - !Vector3 {x: -1, y: 0, z: 10}
      colors:
      rotations: [90]
      sizes:
      - !Vector3 {x: 1, y: 5, z: 9}
    - !Item
      name: GoodGoal
      positions:
      - !Vector3 {x: -1, y: 0, z: 35}
      sizes:
      - !Vector3 {x: 2, y: 2, z: 2}
    - !Item
      name: Agent
      positions:
      - !Vector3 {x: -1, y: 1, z: 5}
