Go2 support #110
Conversation
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
So I've worked on the full collision mesh and examples, and I have successfully trained Joystick, Handstand, Footstand, and Getup. The policies need some reward tuning, but training works. Let me know if I need to do anything else.
Note: the actuator order in the MJX model for Go2 does not follow the Unitree leg ordering, which is FR/FL/RR/RL; in the MJX model it's FL/FR/RL/RR. Just a note, since naively forwarding the actions in the default order to LowCmd mixes up the joints (see the sketch below). Should this be fixed in the MJX model, or is it an implementation detail left to the driver?
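To illustrate, here is a minimal sketch of the reordering, assuming the MJX actuator layout FL/FR/RL/RR and the Unitree LowCmd layout FR/FL/RR/RL described above, with three joints (hip, thigh, calf) per leg; the array and function names are illustrative, not from this repo:

```python
import numpy as np

# LowCmd slot i is filled from MJX actuator index MJX_TO_LOWCMD[i].
# Assumed layouts: MJX = FL/FR/RL/RR, LowCmd = FR/FL/RR/RL, 3 joints per leg.
MJX_TO_LOWCMD = np.array([
    3, 4, 5,    # LowCmd FR  <- MJX FR block
    0, 1, 2,    # LowCmd FL  <- MJX FL block
    9, 10, 11,  # LowCmd RR  <- MJX RR block
    6, 7, 8,    # LowCmd RL  <- MJX RL block
])

def mjx_action_to_lowcmd(action: np.ndarray) -> np.ndarray:
    """Reorder a 12-dim policy action from MJX order into LowCmd order."""
    return np.asarray(action)[MJX_TO_LOWCMD]
```

Since this permutation only swaps the front and rear leg pairs, it is its own inverse, so the same array also maps LowState readings back into MJX order.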
Hello! Have you successfully trained Go2Getup and transferred it to a real robot (sim-to-real)? I found that a single training run like this does not work: python train_jax_ppo.py --env_name=Go2Getup
Yes, I trained the Joystick policy and transferred it to a real Go2 successfully. The Getup and Handstand tasks are copied straight from Go1, but from quick tests they did produce successful policies in sim; I didn't transfer these to the real Go2.
For Getup there's an existing issue about it for Go1, so it might be worth a look: #65
That's the key point! It's mentioned in #65 that 50M timesteps is not enough for training Go1Getup. But should I train 750M timesteps at once, or train 50M timesteps at a time and repeatedly load checkpoints? More importantly, how should the two tricks mentioned in the paper be added to the training process?
That's a bit off-topic for this PR; I'd suggest asking directly in that issue, since it's pretty much the same problem. The Go1/Go2 architectures are very similar.
Hi @aatb-ch thanks for the PR! I'll try to get to this after the CoRL supplemental deadline (probs end of week). |
Thank you! I have reproduced it after 750M timesteps training. |
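For reference, a single long run can be launched by overriding the timestep budget on the command line, assuming train_jax_ppo.py exposes a num_timesteps flag (verify with --help):

```bash
python train_jax_ppo.py --env_name=Go2Getup --num_timesteps=750000000
```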
@kevinzakka super, yeah no stress, just let me know once you've got time whether I need to change anything.
Hi, here is my sim-to-real code: https://github.com/DerSimi/unitree_go2_sim2real. But note, when the Go2 is in low-level state mode, which is necessary for low-level control, SportModeState is not published by the robot. This means that the linear velocity used as an observation here is not available. In my code, you can see that I circumvented this by setting the 'linvel' in the observation to the current command. It's a wonder it worked at all.
Even more: try zeroing out the linvel, gyro, and gravity observations, and it still works! But yeah, you don't get any state estimation from the internal SportModeState; you have to estimate it some other way. I initially did the same as you, but realized the command is already passed as an observation anyway, so it doesn't really matter to pass it again in place of linvel. (A sketch of this workaround follows below.)
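For readers hitting the same limitation, here is a minimal sketch of the workaround both comments describe, assuming a joystick-style observation layout; all field names here are illustrative, not the exact ones used in this repo or in the linked driver:

```python
import numpy as np

def build_observation(command, gyro, gravity, joint_pos, joint_vel, last_action):
    """Assemble the policy observation on the real robot (illustrative layout).

    In low-level mode SportModeState is never published, so no body linear
    velocity estimate is available. Following the workaround above, the
    commanded velocity is reused in place of the linvel slot (zeroing it
    out reportedly also works).
    """
    linvel = np.asarray(command)[:3]  # substitute the command for the missing linvel
    return np.concatenate(
        [linvel, gyro, gravity, command, joint_pos, joint_vel, last_action]
    )
```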
@DerSimi May I know if you still have sim-to-real code available? |
This PR adds Unitree Go2 support, based on the existing Go1 support. It uses the Menagerie Go2 MJX model, adjusted accordingly to add the correct sensors, collisions, etc.
TODO: adjust the full-collision MJX model. I'm not 100% sure, but it seems some things are missing; I have to go through the Go1 mesh and compare, then test Getup/Handstand before adding those tasks.