MSAF-ACT: Multi-Scale Observation Encoding for Bimanual Manipulation

This repo contains the implementation of MSAF-ACT, together with two simulated environments: Transfer Cube and Bimanual Insertion. You can train and evaluate MSAF-ACT in simulation.

Repo Structure

imitate_episodes.py: Train and evaluate ACT
policy.py: An adaptor for the ACT policy
detr: Model definitions of ACT, modified from DETR
sim_env.py: MuJoCo + DM_Control environments with joint space control
ee_sim_env.py: MuJoCo + DM_Control environments with EE space control
scripted_policy.py: Scripted policies for sim environments
constants.py: Constants shared across files
utils.py: Utils such as data loading and helper functions
visualize_episodes.py: Save videos from a .hdf5 dataset

Installation

conda create -n aloha python=3.8.10
conda activate aloha
pip install torchvision
pip install torch  # we use PyTorch 2.1.0 with CUDA 11.8
pip install pyquaternion
pip install pyyaml
pip install rospkg
pip install pexpect
pip install mujoco==2.3.7
pip install dm_control==1.0.14
pip install opencv-python
pip install matplotlib
pip install einops
pip install packaging
pip install h5py
pip install ipython
cd act/detr && pip install -e .
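
After installing, it can help to confirm that the pinned simulator packages resolved to the versions listed above. The following is a minimal sketch using only the standard library; the helper name find_mismatches and the EXPECTED table are illustrative and not part of this repo, and the distribution names are assumed to match the pip commands above.

```python
from importlib import metadata

# Pins copied from the install commands above (assumption: the installed
# distribution names are the same as the names passed to pip).
EXPECTED = {"mujoco": "2.3.7", "dm_control": "1.0.14"}


def find_mismatches(expected):
    """Return {package: (expected_version, installed_version_or_None)}.

    Only packages that are missing or at a different version are included.
    """
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, installed) in find_mismatches(EXPECTED).items():
        print(f"{name}: expected {wanted}, found {installed or 'not installed'}")
```

Running the script inside the aloha environment prints one line per mismatch and nothing when both pins are satisfied.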

Example Usages

To set up a new terminal, run:

conda activate aloha
cd <path to act repo>

Simulated experiments

We use the sim_transfer_cube_scripted task in the examples below. Another option is sim_insertion_scripted. To generate 50 episodes of scripted data, run:

python3 record_sim_episodes.py \
--task_name sim_transfer_cube_scripted \
--dataset_dir <data save dir> \
--num_episodes 50

You can add the flag --onscreen_render to see real-time rendering. To visualize an episode after it is collected, run:

python3 visualize_episodes.py --dataset_dir <data save dir> --episode_idx 0

To train ACT:

# Transfer Cube task
python3 imitate_episodes.py \
--task_name sim_transfer_cube_scripted \
--ckpt_dir <ckpt dir> \
--policy_class ACT --kl_weight 10 --chunk_size 100 --hidden_dim 512 --batch_size 8 --dim_feedforward 3200 \
--num_epochs 2000 --lr 1e-5 \
--seed 0 \
--fusion_layer_num 1

The --fusion_layer_num flag selects which feature maps are fused:

fusion_layer_num=0  # original algorithm (without multi-scale features)
fusion_layer_num=1  # multi-scale fusion with 1/32 and 1/16 feature maps
fusion_layer_num=2  # multi-scale fusion with 1/32, 1/16, and 1/8 feature maps
fusion_layer_num=3  # multi-scale fusion with 1/32, 1/16, 1/8, and 1/4 feature maps
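
To make the setting concrete, the sketch below maps each fusion_layer_num value to the downsampling factors listed above and to the resulting feature-map sizes. It is only an illustration: the function names are hypothetical, the 480x640 input size is an assumed example, and treating fusion_layer_num=0 as "1/32 map only" is our reading of the list rather than something taken from the model code.

```python
# Downsampling factors in the order they are added as fusion_layer_num grows.
ALL_STRIDES = [32, 16, 8, 4]


def fused_strides(fusion_layer_num):
    """Return the downsampling factors of the feature maps that are used."""
    if fusion_layer_num not in (0, 1, 2, 3):
        raise ValueError("fusion_layer_num must be 0, 1, 2, or 3")
    return ALL_STRIDES[: fusion_layer_num + 1]


def feature_map_sizes(height, width, fusion_layer_num):
    """Return (rows, cols) of each used feature map for a given image size.

    Sizes are rounded up (assumption; exact rounding depends on the backbone).
    """
    return [(-(-height // s), -(-width // s)) for s in fused_strides(fusion_layer_num)]


if __name__ == "__main__":
    # 480x640 is an assumed example resolution, not a value from this repo.
    for n in range(4):
        print(n, fused_strides(n), feature_map_sizes(480, 640, n))
```

Each extra fusion level adds a map with four times as many spatial positions as the previous one, so higher settings are likely to cost more memory and compute.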

To evaluate the policy, run the same command but add --eval. This loads the best validation checkpoint. To enable temporal ensembling, add the flag --temporal_agg. Videos will be saved to <ckpt dir> for each rollout. You can also add --onscreen_render to see real-time rendering during evaluation.
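
For readers unfamiliar with temporal ensembling, the idea as we understand it from the original ACT work is that several overlapping action chunks each predict an action for the current timestep, and these predictions are averaged with exponential weights exp(-k * i), where i = 0 is the oldest prediction. The sketch below is a standalone illustration, not the code path that --temporal_agg triggers in this repo; the function name and the default k = 0.01 are assumptions.

```python
import math


def temporal_ensemble(predictions, k=0.01):
    """Blend several predictions of the same timestep's action.

    predictions: list of equal-length action vectors, ordered oldest first.
    k: decay rate (assumed value); larger k favors older predictions more.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    weights = [math.exp(-k * i) for i in range(len(predictions))]
    total = sum(weights)
    dim = len(predictions[0])
    # Weighted average, computed independently for each action dimension.
    return [
        sum(w * p[d] for w, p in zip(weights, predictions)) / total
        for d in range(dim)
    ]


if __name__ == "__main__":
    # Three chunks predicted slightly different actions for the same timestep.
    print(temporal_ensemble([[0.10, -0.20], [0.12, -0.18], [0.11, -0.21]]))
```

With a small k the weights are nearly uniform, so the result is close to a plain average; the ensembling mainly smooths out jitter between consecutive chunks.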

References

Our code is based on the ACT code. Many thanks to its authors.
