Update reward to avoid stomping when no movement is required #25

Open
henri123lemoine opened this issue Nov 11, 2024 · 3 comments

@henri123lemoine
Collaborator

We need to move away from "one policy per task" and towards "one policy for all tasks". A first step would be fixing the current problem where the walking policy, when given all zeros as its cmd, still moves a lot without advancing (it stomps in place without moving in any direction). One idea is to scale up the reward that encourages low dof lin vel when the command is close enough to 0.
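
A rough sketch of that idea, to make the proposal concrete. Everything here is an assumption for illustration, not code from this repo: the function names, the `threshold` and `boost` values, the `base_weight`, and reading "low dof lin vel" as a penalty on squared joint velocities.

```python
import numpy as np


def stand_still_scale(cmd, threshold=0.1, boost=5.0):
    """Per-environment multiplier for the low-velocity term.

    Hypothetical helper: returns `boost` when the command norm is below
    `threshold` (robot is asked to stand still), otherwise 1.0.
    """
    cmd = np.atleast_2d(np.asarray(cmd, dtype=float))
    cmd_norm = np.linalg.norm(cmd, axis=-1)
    return np.where(cmd_norm < threshold, boost, 1.0)


def low_velocity_reward(dof_vel, cmd, base_weight=1e-3, threshold=0.1, boost=5.0):
    """Negative reward on squared joint velocities, boosted near zero command.

    Assumes dof_vel has shape (num_envs, num_dofs) and cmd has shape
    (num_envs, cmd_dim); 1-D inputs are treated as a single environment.
    """
    dof_vel = np.atleast_2d(np.asarray(dof_vel, dtype=float))
    penalty = np.sum(np.square(dof_vel), axis=-1)
    scale = stand_still_scale(cmd, threshold, boost)
    return -base_weight * scale * penalty
```

With these placeholder values, the same joint motion costs five times more when the command is zero than when the robot is asked to walk. A hard threshold is the simplest choice; a smooth ramp on the command norm might train more stably, but that is untested here.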

@henri123lemoine henri123lemoine self-assigned this Nov 11, 2024
@jingxiangmo
Collaborator

@budzianowski

@jingxiangmo
Collaborator

yeah I think this is a really good idea.

@jingxiangmo
Collaborator

There's a lot of refactoring to be done. @budzianowski, do you know if Wesley is working on the min PPO implementation from scratch?
