Inverse reinforcement learning (IRL) infers a reward function from demonstrations, allowing for policy improvement and generalization. However, despite much recent interest in IRL, little work has been done to understand the minimum set of demonstrations needed to teach a specific sequential decision-making task. We formalize the problem of finding maximally informative demonstrations for IRL as a machine teaching problem where the goal is to find the minimum number of demonstrations needed to specify the reward equivalence class of the demonstrator. We extend previous work on algorithmic teaching for sequential decision-making tasks by showing a reduction to the set cover problem which enables an efficient approximation algorithm for determining the set of maximally-informative demonstrations. We apply our proposed machine teaching algorithm to two novel applications: providing a lower bound on the number of queries needed to learn a policy using active IRL and developing a novel IRL algorithm that can learn more efficiently from informative demonstrations than a standard IRL approach.
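For a quick sense of what the teaching algorithm computes before diving into the reproduction steps: the paper's reduction to set cover means each candidate demonstration covers a set of halfspace constraints on the reward weights, and demonstrations are selected greedily until all constraints defining the demonstrator's reward equivalence class are covered. The Python sketch below illustrates only that greedy selection step; it is a conceptual illustration, not the C++ implementation in this repository, and the names `greedy_demo_selection` and `candidate_constraints` are hypothetical placeholders for the constraint sets computed from the MDP.

```python
# Conceptual sketch of greedy set-cover selection of demonstrations
# (illustrative only; not the repository's C++ implementation).
# Assumes each candidate demonstration has already been mapped to the set of
# hashable halfspace constraints it induces on the reward weights.

def greedy_demo_selection(candidate_constraints):
    """Greedily pick demonstrations until every constraint is covered.

    candidate_constraints: dict mapping demo id -> set of constraints it induces.
    Returns a list of demo ids forming an approximately minimal teaching set.
    """
    uncovered = set().union(*candidate_constraints.values())  # constraints still to cover
    selected = []
    while uncovered:
        # pick the demonstration that covers the most still-uncovered constraints
        best = max(candidate_constraints,
                   key=lambda d: len(candidate_constraints[d] & uncovered))
        gain = candidate_constraints[best] & uncovered
        if not gain:
            break  # no candidate covers anything new
        selected.append(best)
        uncovered -= gain
    return selected


if __name__ == "__main__":
    # toy example: constraints labeled a-e; demos 2 and 0 together cover them all
    demos = {0: {"a", "b"}, 1: {"b", "c"}, 2: {"c", "d", "e"}}
    print(greedy_demo_selection(demos))  # e.g. [2, 0]
```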
Follow the instructions below to reproduce the results in our AAAI 2019 paper on Machine Teaching for Inverse Reinforcement Learning. If you find this code useful, please cite:
    @inproceedings{brown2019machine,
      title={Machine teaching for inverse reinforcement learning: Algorithms and applications},
      author={Brown, Daniel S and Niekum, Scott},
      booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
      volume={33},
      pages={7749--7758},
      year={2019}
    }
- Matplotlib (for generating figures): https://matplotlib.org/users/installing.html
- lpsolve
- Eigen: http://eigen.tuxfamily.org
- Download lp_solve_5.5.2.5_dev_ux64.tar.gz from https://sourceforge.net/projects/lpsolve/files/lpsolve/5.5.2.5/
- Create a directory for the files: `mkdir -p ~/Libraries/lpsolve55`
- Extract the files from the tarball into the directory `~/Libraries/lpsolve55`
- Make a build directory: `mkdir build`
- Make a data directory to hold results: `mkdir data`
Comparison of Machine Teaching Algorithms (Table 1 in our paper)
- Create a directory for data: `mkdir -p data/algorithm_comparison`
- Use `make algo_comp` to build the experiment.
- Execute `./algo_comp scot random 1` to run the SCOT algorithm. Data will be output to `./data/algorithm_comparison`.
- Execute `./algo_comp cakmak random 100000`
- Execute `./algo_comp cakmak random 1000000`
- Execute `./algo_comp cakmak random 10000000`
- Experiments will take some time to run since 20 replicates are run for each number of demonstrations. SCOT should be fairly fast; the main bottleneck is running IRL after SCOT has finished.
- Experiment parameters can be set in `src/machineTeachingAlgorithmComparison.cpp`.
- Once the experiment has finished, run `python scripts/makeTableComparingMachineTeachingAlgorithms.py` to generate the table.
- You should get something similar to the following table. Note that, due to changes to the seed and other parameters, the results differ slightly from the table in the paper.
| Algorithm | Ave. number of (s,a) pairs | Ave. % incorrect actions | Ave. time (s) |
|---|---|---|---|
| SCOT | 17.250 | 0.185 | 1.274 |
| UVM (100000) | 3.450 | 37.531 | 951.479 |
| UVM (1000000) | 4.200 | 33.457 | 2096.638 |
| UVM (10000000) | 4.895 | 37.427 | 12068.098 |
- The main takeaway is that the UVM algorithm performs poorly because it relies on Monte Carlo volume estimation, whereas SCOT is fast and finds highly informative demonstration sets that allow IRL to recover a good policy (a conceptual sketch of this volume estimation is shown below).
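For intuition on why UVM scales poorly, the sketch below shows the kind of Monte Carlo volume estimation it depends on: sample random reward weight vectors and count the fraction that satisfy a set of halfspace constraints. The function name, the constraint format (`H w >= 0`), the unit-sphere sampling, and the NumPy dependency are illustrative assumptions, not the repository's implementation; the point is simply that the estimate improves slowly with the number of samples, consistent with the large sample counts (1e5 to 1e7) and long runtimes in the table above.

```python
# Illustrative Monte Carlo estimate of the volume (fraction of the unit sphere)
# of reward weights w satisfying halfspace constraints H @ w >= 0.
# Conceptual sketch only; not the code used in this repository.
import numpy as np

def mc_volume_estimate(H, num_samples, seed=0):
    """Estimate the fraction of unit-norm weight vectors w with H @ w >= 0."""
    rng = np.random.default_rng(seed)
    d = H.shape[1]
    # sample directions approximately uniformly on the unit sphere
    w = rng.normal(size=(num_samples, d))
    w /= np.linalg.norm(w, axis=1, keepdims=True)
    satisfied = np.all(w @ H.T >= 0, axis=1)
    return satisfied.mean()

if __name__ == "__main__":
    # toy 2-D example: w1 >= 0 and w2 >= 0 covers a quarter of the circle
    H = np.array([[1.0, 0.0], [0.0, 1.0]])
    for n in (10_000, 1_000_000):
        print(n, mc_volume_estimate(H, n))  # slowly approaches 0.25 as n grows
```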