Update reward to avoid stomping when no movement is required #25

henri123lemoine · 2024-11-11T22:45:13Z

We need to move away from "one policy per task" and towards "one policy for all tasks". A first step would be fixing the current problem where the walking policy, when given all zeros as its cmd, still moves a lot without advancing (it stomps in place without moving in any direction). One idea would be to scale up the reward that encourages low dof lin vel when the command is close enough to 0.
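A minimal sketch of that idea, assuming the penalty is a weighted sum of squared velocities and that "close enough to 0" means the command norm falls below a threshold. The function names, `threshold`, `boost`, and `base_weight` are all hypothetical and are not taken from the existing reward code:

```python
import math
from typing import Sequence


def stand_still_scale(cmd: Sequence[float], threshold: float = 0.1, boost: float = 5.0) -> float:
    """Return the multiplier applied to the velocity penalty.

    Hypothetical rule: use `boost` when the command norm is below
    `threshold` (the robot is asked to stand still), otherwise 1.0.
    """
    cmd_norm = math.sqrt(sum(c * c for c in cmd))
    return boost if cmd_norm < threshold else 1.0


def velocity_penalty(
    dof_vel: Sequence[float],
    cmd: Sequence[float],
    base_weight: float = 1e-3,
    threshold: float = 0.1,
    boost: float = 5.0,
) -> float:
    """Negative reward proportional to the sum of squared velocities.

    The penalty is scaled up when the command is near zero, so stomping
    in place costs more than the same motion while actually walking.
    """
    raw = sum(v * v for v in dof_vel)
    return -base_weight * stand_still_scale(cmd, threshold, boost) * raw


# Same joint motion, different commands: the zero command is penalized more.
standing = velocity_penalty([1.0, 2.0], cmd=[0.0, 0.0, 0.0])
walking = velocity_penalty([1.0, 2.0], cmd=[1.0, 0.0, 0.0])
```

A hard threshold is the simplest choice here; a smooth ramp on the command norm would avoid a discontinuity in the reward, at the cost of one more shape parameter to tune.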

jingxiangmo · 2024-11-11T22:48:31Z

@budzianowski

jingxiangmo · 2024-11-11T22:49:12Z

Yeah, I think this is a really good idea.

jingxiangmo · 2024-11-11T22:49:46Z

There's a lot of refactoring to be done. @budzianowski, do you know if Wesley is working on the min PPO implementation from scratch?

henri123lemoine self-assigned this Nov 11, 2024

