Milestones

  • - Read more about policy gradients, e.g. the TRPO paper.
    - Perhaps do the Berkeley homework on implementing basic policy gradients.
    - Implement more advanced versions of policy gradients?

    Overdue by 5 years
    Due by August 5, 2020
    6/8 issues closed
  • Learn the SOTA methods for DQN:
    - Dueling DQN
    - Double DQN
    - DQfD (Deep Q-Learning from Demonstrations)

    Learn about inverse RL and how we can use it as a replacement for the manual subrewards from ForgER.

    Overdue by 5 years
    Due by August 19, 2020
    3/7 issues closed