CleanRL 0.2.1 with SAC added and video recording feature.

@vwxyzjn vwxyzjn released this 09 Jan 22:00
· 720 commits to master since this release

We've made the SAC algorithm work for both continuous and discrete action spaces, with primary references from the following papers:

https://arxiv.org/abs/1801.01290
https://arxiv.org/abs/1812.05905
https://arxiv.org/abs/1910.07207

My personal thanks to everyone who participated in the monthly dev cycle and, in particular, to @dosssman, who implemented SAC for discrete action spaces.

Additional improvements include:
Support gym.wrappers.Monitor to automatically record the agent's performance at certain episodes (by default 1, 2, 9, 28, 65, ... 1000, 2000, 3000) and integrate the recordings with wandb (so cool, see screenshot below). #4
Use the same replay buffer from minimalRL for DQN and SAC. #5
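
The default recording episodes listed above look like a capped cubic schedule: perfect cubes for the first thousand episodes, then every thousandth episode. Below is a minimal standalone sketch of such a schedule, assuming 0-based episode ids (so the cubes 0, 1, 8, 27, 64 correspond to the 1st, 2nd, 9th, 28th and 65th episodes). The function name `capped_cubic_schedule` is hypothetical and this is not CleanRL's own code; it only illustrates the kind of rule a Monitor-style wrapper could use to decide when to record.

```python
def capped_cubic_schedule(episode_id: int) -> bool:
    """Return True if the episode with this 0-based id should be recorded.

    Assumption: mirrors a "perfect cubes, then every 1000th" rule;
    the name and exact form are illustrative, not taken from CleanRL.
    """
    if episode_id < 1000:
        # Record when episode_id is a perfect cube: 0, 1, 8, 27, 64, ...
        return round(episode_id ** (1.0 / 3)) ** 3 == episode_id
    # After the first 1000 episodes, record every 1000th one.
    return episode_id % 1000 == 0


# Collect the 0-based ids that would be recorded over 3001 episodes.
recorded = [i for i in range(3001) if capped_cubic_schedule(i)]
print(recorded)
```

Early episodes are sampled densely, when the agent's behavior changes fastest, while long runs produce only a handful of videos.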

https://app.wandb.ai/cleanrl/cleanrl.benchmark

image