
Set up task system #28

Open
henri123lemoine opened this issue Nov 20, 2024 · 1 comment

Comments

@henri123lemoine (Collaborator)

The vision is essentially to have "tasks" that are defined by (1) their reward scales, (2) their starting states/starting env, and (3) their termination states. The environment then combines the tasks during training (presenting the robot with a random task, or something more complicated) in order to get one policy that can follow any of these tasks based on their commands.

The main point of this sort of system would be to make adding new tasks easy, with reasonable defaults that make new tasks relatively likely to work out of the box. Generally, the goal is to roughly democratize "training a robot for a task". Another possible benefit: a single policy that works on all these tasks might be faster to "finetune" for a new task. It's unclear how well this would work at these network sizes.
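To make the idea concrete, here is a minimal sketch of what the task interface could look like. Everything here is hypothetical (the `Task` dataclass, `make_task`, `sample_task`, and all field names are illustrative, not an existing API); the point is just that a task bundles reward scales, a reset function, and a termination condition, with shared defaults so new tasks are likely to work.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class Task:
    """A task = reward scales + starting state + termination condition."""
    name: str
    reward_scales: Dict[str, float]                 # e.g. {"base_height": 1.0}
    reset_fn: Callable[[], dict]                    # returns an initial env state
    termination_fn: Callable[[dict, float], bool]   # (state, time) -> done?

# Reasonable shared defaults so a new task works without much tuning.
# (Placeholder terms and values, not tuned numbers.)
DEFAULT_REWARD_SCALES = {"alive": 1.0, "energy": -0.005, "action_rate": -0.01}

def make_task(name, reward_scales=None, reset_fn=None, termination_fn=None):
    # New tasks override only what they care about; the rest falls back
    # to defaults (default reset: nominal standing pose; default
    # termination: a fixed time limit).
    scales = dict(DEFAULT_REWARD_SCALES)
    scales.update(reward_scales or {})
    return Task(
        name=name,
        reward_scales=scales,
        reset_fn=reset_fn or (lambda: {"base_height": 0.3}),
        termination_fn=termination_fn or (lambda state, t: t > 10.0),
    )

def sample_task(tasks):
    # Simplest curriculum: present the robot with a uniformly random task
    # each episode (something more complicated could go here later).
    return random.choice(list(tasks.values()))
```

Training would then sample a task per episode and condition the policy on the task's command, so one network covers all registered tasks.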

Examples of tasks:

  • standing
  • walking
  • running
  • jumping (reward: height of the feet at the jump's apex*; termination: landing? time-based?)
  • standing back up (reward: similar to standing but with a stronger base-height term. The robot gets pushed very strongly every few seconds, and possibly starts on the ground, tbd. Termination is time-based only, with no collision/height termination)
  • matching a dataset of human movements/dances (reward: position distances to the "ground truth"*; termination: the dance ends)
  • kicking a ball into a goal, or playing soccer more generally (reward: ball closer to the goal*? Worried about the robot breaking apart under its own strength if it kicks the ball too hard though, tbd)
  • fighting another robot (possibly a terrible idea, it just comes to mind. The reward would be mainly standing and hitting the other robot on the head or torso, plus an adversarial reward where you want the other robot to fall down)

*among other things
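As a rough illustration of how two of the tasks above might look under such a system, here are placeholder configs (all keys, names, and values are made up for the sketch, not tuned or part of any existing codebase):

```python
# Hypothetical task configs. "standing back up" overrides the base-height
# scale and opts out of collision/height termination, matching the
# descriptions in the list above; everything else would fall back to defaults.
TASKS = {
    "standing": {
        "reward_scales": {"base_height": 1.0},
        "termination": "fall",            # terminate on collision / low height
    },
    "standing_back_up": {
        "reward_scales": {"base_height": 4.0},  # stronger base-height term
        "push_interval_s": 3.0,           # strong push every few seconds
        "start_on_ground": True,          # tbd
        "termination": "time_only",       # no collision/height termination
    },
}
```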

@henri123lemoine (Collaborator, Author)

(One policy per task may be better, for outcome interpretability reasons)
