Dear altruists, I am new to stable-baselines and RL. I am trying to retrain my previously trained PPO1 model so that it continues learning from where the previous training left off. What I am trying to do is:
load my previously trained model from disk and then re-train it from the point where the last training ended. For that, I load my previously saved model inside policy_fn() and pass policy_fn as a parameter to the pposgd_simple.learn() method. This fails with the error "ValueError: At least two variables have the same name: pi/obfilter/count".
Also, when training does run correctly (in a different setting), I am unsure whether it resumes from the previous ending point or starts again from the very beginning. Can anyone please point me to a way to verify this? One option may be printing the model parameters, but I am not sure how.
I am also trying to use TensorBoard to monitor my training, but when I run the training, the program raises "TypeError: learn() got an unexpected keyword argument 'tensorboard_log'" at the line tensorboard_log=logger_path. My stable-baselines version is 2.10.2. I am attaching my entire training code below. I would appreciate any suggestions. Thanks in advance.
import os

import tensorflow as tf
from mpi4py import MPI

from baselines import logger
from baselines.bench import Monitor
from baselines.common import set_global_seeds, tf_util as U


def make_env(seed=None):
    reward_scale = 1.0
    rank = MPI.COMM_WORLD.Get_rank()
    myseed = seed + 1000 * rank if seed is not None else None
    set_global_seeds(myseed)
    env = Env()  # custom environment defined elsewhere
    env = Monitor(env, logger_path, allow_early_resets=True)
    env.seed(seed)
    if reward_scale != 1.0:
        from baselines.common.retro_wrappers import RewardScaler
        env = RewardScaler(env, reward_scale)
    return env
def train(num_timesteps, path=None):
    from baselines.ppo1 import mlp_policy, pposgd_simple
    sess = U.make_session(num_cpu=1)
    sess.__enter__()

    def policy_fn(name, ob_space, ac_space):
        policy = mlp_policy.MlpPolicy(name=name, ob_space=ob_space, ac_space=ac_space,
                                      hid_size=64, num_hid_layers=3)
        saver = tf.train.Saver()
        if path is not None:
            # restoring the saved model here is where the
            # "ValueError: At least two variables have the same name: pi/obfilter/count" shows up
            print("Tried to restore from ", path)
            U.initialize()
            saver.restore(tf.get_default_session(), path)
            saver2 = tf.train.import_meta_graph('/srcs/src/models/model1.meta')
            model = saver.restore(sess, tf.train.latest_checkpoint('/srcs/src/models/'))
        # return policy
        return saver2

    env = make_env()
    pi = pposgd_simple.learn(env, policy_fn,
                             max_timesteps=num_timesteps,
                             timesteps_per_actorbatch=1024,
                             clip_param=0.2, entcoeff=0.0,
                             optim_epochs=10,
                             optim_stepsize=5e-5,
                             optim_batchsize=64,
                             gamma=0.99,
                             lam=0.95,
                             schedule='linear',
                             tensorboard_log=logger_path,  # raises "TypeError: learn() got an unexpected keyword argument 'tensorboard_log'"
                             # tensorboard_log="./ppo1_tensorboard/",
                             )
    env.env.plotSave()
    saver = tf.train.Saver(tf.all_variables())
    saver.save(sess, '/models/model1')
    return pi
def main():
    logger.configure()
    path_ = "/models/model1"
    train(num_timesteps=409600, path=path_)


if __name__ == '__main__':
    rank = MPI.COMM_WORLD.Get_rank()
    logger_path = None if logger.get_dir() is None else os.path.join(logger.get_dir(), str(rank))
    main()
It seems you are confusing OpenAI baselines with stable-baselines: the code above imports from baselines.ppo1, which is OpenAI baselines. In stable-baselines, you can save and restore models with the simple model.save() and PPO1.load() calls. Stable-baselines does not support loading OpenAI baselines agents with a single call.
Also, we recommend using stable-baselines3 as it is more actively supported.
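A minimal sketch of that save/load workflow in stable-baselines 2 could look like the following (CartPole-v1 is only a placeholder for your custom Env, and the file paths are hypothetical). Note that in stable-baselines, tensorboard_log is a constructor argument of the model, not a keyword of learn():

    import gym
    import numpy as np
    from stable_baselines import PPO1

    # First run: tensorboard_log goes to the model constructor,
    # not to learn() as attempted with pposgd_simple above.
    model = PPO1('MlpPolicy', 'CartPole-v1', tensorboard_log='./ppo1_tensorboard/')
    model.learn(total_timesteps=100000)
    model.save('ppo1_model')  # hypothetical save path

    # Later run: restore the saved agent and continue training from its current weights.
    env = gym.make('CartPole-v1')
    model = PPO1.load('ppo1_model', env=env, tensorboard_log='./ppo1_tensorboard/')

    # One way to check that training resumes from the restored weights rather than
    # from scratch: snapshot a parameter before and after the extra training.
    params_before = model.get_parameters()
    model.learn(total_timesteps=100000, reset_num_timesteps=False)
    params_after = model.get_parameters()
    name = next(iter(params_before))
    print(name, np.abs(params_after[name] - params_before[name]).max())

Passing reset_num_timesteps=False keeps the TensorBoard step counter continuous across the two learn() calls; PPO1 also requires mpi4py to be installed.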