Hi, I am struggling to get a good training result on NL Texas Hold'em with DQN. I followed your paper to choose the hyperparameters: the memory size is selected from {2000, 100000}, the discount factor is set to 0.99, the Adam optimizer is used with a learning rate of 0.00005, and the network is an MLP with hidden layers 10-10, 128-128, 512-512, or 512-1024-2048-1024-512 (I have tried them all). But I can only get poor results like:
The result in your paper looks like this (with the same number of training timesteps, but much higher rewards):
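For reference, the search described above can be enumerated as a simple grid. This is only a sketch of the question's settings; the parameter names (`replay_memory_size`, `mlp_layers`, etc.) are assumptions mirroring RLcard's DQN agent and may differ across versions:

```python
from itertools import product

# Hyperparameter grid as described in the question (names are assumed,
# not taken verbatim from any specific RLcard release).
memory_sizes = [2000, 100000]
mlp_layers = [[10, 10], [128, 128], [512, 512], [512, 1024, 2048, 1024, 512]]
fixed = {"discount_factor": 0.99, "learning_rate": 0.00005}

configs = [
    {"replay_memory_size": m, "mlp_layers": layers, **fixed}
    for m, layers in product(memory_sizes, mlp_layers)
]
print(len(configs))  # 8 configurations in total
```

Each entry in `configs` would correspond to one training run in the sweep.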
@jiahui-x Hi, thanks for the feedback. The result in the paper is out of date. The discrepancy is likely due to multiple factors. First, the environment codebase has had some major updates since the first release; in particular, the reward has been divided by 2. Second, the current implementation is based on PyTorch instead of TensorFlow. Thus, your result seems reasonable to me.
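The reward rescaling alone accounts for roughly half of the gap: a curve from the current environment should be multiplied by 2 before comparing it with the paper's figures. A tiny illustration (the reward values below are made up, purely for demonstration):

```python
# Hypothetical reward curve from the updated environment (values are
# illustrative, not real training results).
new_env_rewards = [0.1, 0.25, 0.375]

# The current codebase divides the reward by 2, so multiply by 2 to put the
# curve on the same scale as the figures in the paper (old environment).
comparable = [2 * r for r in new_env_rewards]
print(comparable)  # [0.2, 0.5, 0.75]
```

This only corrects for the reward scale; the TensorFlow-to-PyTorch port can still cause smaller differences.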
The codebase is expected to remain stable in the near future to ensure reproducibility.