Milestones

More Policy Gradients
- Read more about policy gradients, e.g. the TRPO paper.
- Perhaps do the Berkeley homework on implementing basic policy gradients.
- Implement more advanced versions of policy gradients?
Overdue by 5 year(s) • Due by August 5, 2020 • 6/8 issues closed
75% complete • 2 open • 6 closed

Delve into Deep Q-Learning and Inverse RL
Learn the SOTA methods for DQN:
- Dueling DQN
- Double DQN
- DQfD (Deep Q-Learning from Demonstrations)

Learn about Inverse RL and how we can use it as a replacement for the manual subrewards from ForgER.
Overdue by 5 year(s) • Due by August 19, 2020 • 3/7 issues closed
42% complete • 4 open • 3 closed