
Set up task system #28

Open
henri123lemoine opened this issue Nov 20, 2024 · 1 comment

Comments

@henri123lemoine (Collaborator)

The vision is essentially to have "tasks" that are defined by (1) their reward scales, (2) their starting states/starting env, and (3) their termination states. The environment then combines the tasks during training (presenting the robot with a random task, or something more complicated) in order to get one policy that can follow any of these tasks based on their commands.

The main point of this sort of system would be to make adding new tasks easy, with reasonable defaults that make new tasks relatively likely to work out of the box. Generally, the goal is to roughly democratize "training a robot for a task". Another possible benefit: a single policy that works on all these tasks might be faster to "finetune" for a new task. It's unclear how well this would work at these network sizes.
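To make the idea concrete, here is a minimal sketch of what the task interface could look like. Everything here is hypothetical (the `Task` dataclass, `make_task`, `sample_task`, and all field names are illustrative, not an existing API); the point is just that a task bundles reward scales, a reset function, and a termination condition, with shared defaults so new tasks are likely to work.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class Task:
    """A task = reward scales + starting state + termination condition."""
    name: str
    reward_scales: Dict[str, float]                 # e.g. {"base_height": 1.0}
    reset_fn: Callable[[], dict]                    # returns an initial env state
    termination_fn: Callable[[dict, float], bool]   # (state, time) -> done?

# Reasonable shared defaults so a new task works without much tuning.
# (Placeholder terms and values, not tuned numbers.)
DEFAULT_REWARD_SCALES = {"alive": 1.0, "energy": -0.005, "action_rate": -0.01}

def make_task(name, reward_scales=None, reset_fn=None, termination_fn=None):
    # New tasks override only what they care about; the rest falls back
    # to defaults (default reset: nominal standing pose; default
    # termination: a fixed time limit).
    scales = dict(DEFAULT_REWARD_SCALES)
    scales.update(reward_scales or {})
    return Task(
        name=name,
        reward_scales=scales,
        reset_fn=reset_fn or (lambda: {"base_height": 0.3}),
        termination_fn=termination_fn or (lambda state, t: t > 10.0),
    )

def sample_task(tasks):
    # Simplest curriculum: present the robot with a uniformly random task
    # each episode (something more complicated could go here later).
    return random.choice(list(tasks.values()))
```

Training would then sample a task per episode and condition the policy on the task's command, so one network covers all registered tasks.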

Examples of tasks:

  • standing
  • walking
  • running
  • jumping (reward: height of the feet at the jump's apex*; termination: landing? time-based?)
  • standing back up (reward: similar to standing but with a stronger base-height term. The robot gets pushed very strongly every few seconds, and possibly starts on the ground, tbd. Termination is time-based only, with no collision/height termination)
  • matching a dataset of human movements/dances (reward: position distances to the "ground truth"*; termination: the dance ends)
  • kicking a ball into a goal, or playing soccer more generally (reward: ball closer to the goal*? Worried about the robot breaking apart under its own strength if it kicks the ball too hard though, tbd)
  • fighting another robot (possibly a terrible idea, it just comes to mind. The reward would be mainly standing and hitting the other robot on the head or torso, plus an adversarial reward where you want the other robot to fall down)

*among other things
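As a rough illustration of how two of the tasks above might look under such a system, here are placeholder configs (all keys, names, and values are made up for the sketch, not tuned or part of any existing codebase):

```python
# Hypothetical task configs. "standing back up" overrides the base-height
# scale and opts out of collision/height termination, matching the
# descriptions in the list above; everything else would fall back to defaults.
TASKS = {
    "standing": {
        "reward_scales": {"base_height": 1.0},
        "termination": "fall",            # terminate on collision / low height
    },
    "standing_back_up": {
        "reward_scales": {"base_height": 4.0},  # stronger base-height term
        "push_interval_s": 3.0,           # strong push every few seconds
        "start_on_ground": True,          # tbd
        "termination": "time_only",       # no collision/height termination
    },
}
```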

@henri123lemoine (Collaborator, Author)

(One policy per task may be better, for outcome interpretability reasons)
