Update README.md #29

Open. Wants to merge 1 commit into base: main.
20 changes: 10 additions & 10 deletions README.md
@@ -193,7 +193,11 @@ There are many commands to run. For each command, assume that:

Behavioral cloning baselines rely on infrastructure in our custom version of ManiSkill-Learn. Note that this fork is based on a very early version of ManiSkill-Learn, so it may differ substantially from the current official codebase.

When the custom ManiSkill-Learn repo is installed correctly, we can then collect expert data for our behavioral cloning agents. Specifically, as indicated in the paper, we use an oracle ground-truth-flow-based policy to collect expert trajectories for BC. We run the following script to collect trajectories:
When the custom ManiSkill-Learn repo is installed correctly, we can then collect expert data for our behavioral cloning agents.

### BC

Specifically, as indicated in the paper, we use an oracle ground-truth-flow-based policy to collect expert trajectories for BC. We run the following script to collect trajectories:

```
python flowbot3d/grasping/agents/bc/bc_datagen_gt_flow_grasping.py
```

@@ -211,29 +215,25 @@ This would automatically generate point clouds data that the PointNet/Transforme
This script loads the demo data, trains the agent using the data, evaluates the agent and logs the numbers to a text file.


### BC

TODO

### DAgger E2E

TODO
Running the BC baselines logs the rollout trajectory data. Using the same procedure as above, we can augment the original BC dataset by running the oracle policy on the new rollout trajectories. We can then train a DAgger E2E policy on the augmented dataset by following the same training steps.
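
The aggregation step described above can be sketched as follows. This is only an illustration of the general DAgger idea, not the code in this repo: `dagger_aggregate`, `toy_oracle`, and the list-based data layout are hypothetical names and assumptions made for the example.

```python
from typing import Callable, List, Sequence, Tuple

Observation = Sequence[float]
Action = Sequence[float]
Sample = Tuple[Observation, Action]


def dagger_aggregate(
    dataset: List[Sample],
    rollout_observations: Sequence[Observation],
    oracle_policy: Callable[[Observation], Action],
) -> List[Sample]:
    """Return a new dataset extended with oracle-labeled rollout observations."""
    # Observations come from the learner's own rollouts; labels come from the oracle.
    relabeled = [(obs, oracle_policy(obs)) for obs in rollout_observations]
    return list(dataset) + relabeled


# Hypothetical stand-in for the GT-flow oracle: it simply negates the observation.
def toy_oracle(obs: Observation) -> Action:
    return [-x for x in obs]


bc_dataset = [([1.0, 0.0], [-1.0, 0.0])]   # original expert demonstrations
rollouts = [[0.5, 0.5], [0.0, 2.0]]        # observations logged while running BC
aggregated = dagger_aggregate(bc_dataset, rollouts, toy_oracle)
print(len(aggregated))
```

The original dataset is left untouched, so the same BC training script can be pointed at either the original or the aggregated data.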

### DAgger Oracle

TODO
The only difference between DAgger E2E and DAgger Oracle is when the BC-based policy starts running. DAgger Oracle uses the GT-flow-based oracle policy to select the contact point for the rollout. Once the contact point is selected, we start running the BC/DAgger policy. The switch that enables the oracle policy is in the BC policy eval script.
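
A minimal sketch of this hand-off, assuming a boolean switch and a per-step flag that says whether the contact point has been selected. The names `choose_controller`, `use_oracle_for_contact`, and `contact_selected` are hypothetical and are not taken from the eval script.

```python
from typing import List


def choose_controller(contact_selected: bool, use_oracle_for_contact: bool) -> str:
    """Pick which policy acts at this step of a rollout."""
    # DAgger Oracle: the oracle acts until the contact point has been selected.
    if use_oracle_for_contact and not contact_selected:
        return "oracle"
    # DAgger E2E, or any step after contact: the learned BC/DAgger policy acts.
    return "learned"


def controllers_for_rollout(contact_flags: List[bool], use_oracle_for_contact: bool) -> List[str]:
    """Map a sequence of per-step contact flags to the acting controller."""
    return [choose_controller(flag, use_oracle_for_contact) for flag in contact_flags]


# Toy rollout: contact is established at the third step.
flags = [False, False, True, True]
oracle_variant = controllers_for_rollout(flags, use_oracle_for_contact=True)
e2e_variant = controllers_for_rollout(flags, use_oracle_for_contact=False)
print(oracle_variant)
print(e2e_variant)
```

With the switch off, the learned policy controls every step, which corresponds to the E2E variant.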

### BC + F

TODO
This is again very similar to BC, except that we add 3 extra input channels to the PointNet-Transformer architecture provided in ManiSkill-Learn to feed in the GT flow. The data-generation script above also generates and logs ground-truth flows, so we only need to switch the network input when training and evaluating this policy.
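
As a rough illustration of the extra input channels, the sketch below appends a 3D flow vector to each point's existing features. It uses plain Python lists rather than the tensors used in ManiSkill-Learn, and `append_flow_channels` is a hypothetical helper, not a function from this repo.

```python
from typing import List, Sequence


def append_flow_channels(
    points: Sequence[Sequence[float]],
    flows: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Concatenate a 3D flow vector onto each point's existing feature channels."""
    if len(points) != len(flows):
        raise ValueError("need exactly one flow vector per point")
    augmented = []
    for point, flow in zip(points, flows):
        if len(flow) != 3:
            raise ValueError("each flow vector must have 3 components")
        augmented.append(list(point) + list(flow))
    return augmented


# Two toy points with xyz channels only; real inputs may carry more channels.
cloud = [[0.0, 0.0, 1.0], [1.0, 2.0, 3.0]]
gt_flow = [[0.0, 0.1, 0.0], [0.0, 0.0, 0.0]]
with_flow = append_flow_channels(cloud, gt_flow)
print(len(with_flow[0]))
```

The network's first layer then needs to accept 3 more input channels than in the plain BC setup.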

### DAgger E2E + F

TODO
Same as above, but with the DAgger E2E policy.

### DAgger Oracle + F

TODO
Same as above, but with the DAgger Oracle policy.

## Run ManiSkill Evals
