# rl_training
We've released the following tutorials for training and deploying a reinforcement learning policy. Please check them out!
- [Quadruped Motion Control, From Beginner to Pro] Episode 1 | Train, Test, and Deploy a Reinforcement Learning Policy from Scratch (Bilibili)
- Quadruped Robot Control | From Beginner to Pro, Episode 1 (YouTube)
rl_training is an RL training library for Deep Robotics robots, built on Isaac Lab. The table below lists all available environments:
| Robot Model | Environment Name (ID) | Screenshot |
| --- | --- | --- |
| Deeprobotics Lite3 | Rough-Deeprobotics-Lite3-v0 | ![]() |
| Deeprobotics M20 | Rough-Deeprobotics-M20-v0 | ![]() |
> **Note**
> If you want to deploy policies in MuJoCo or on real robots, please use the corresponding deployment repo in the Deep Robotics GitHub Center.
- Install Isaac Lab by following the installation guide. We recommend using the conda installation, as it simplifies calling Python scripts from the terminal.

- Clone this repository separately from the Isaac Lab installation (i.e. outside the `IsaacLab` directory):

  ```bash
  git clone https://github.com/DeepRoboticsLab/rl_training.git
  ```

- Using a Python interpreter that has Isaac Lab installed, install the library:

  ```bash
  python -m pip install -e source/rl_training
  ```

- Verify that the extension is correctly installed by running the following command, which prints all the environments available in the extension:

  ```bash
  python scripts/tools/list_envs.py
  ```
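For reference, the verification step essentially enumerates the Gymnasium registry. Below is a minimal sketch of the same idea, run with the Isaac Lab Python interpreter; the module name `rl_training.tasks` is an assumption about the repo layout, not a confirmed import path:

```python
# Minimal sketch: list this extension's environments from the Gymnasium registry.
# Assumption: importing the tasks module runs the gym.register() calls; the exact
# module name (rl_training.tasks) is a guess based on the repo layout.
import gymnasium as gym

import rl_training.tasks  # noqa: F401

# Iterate over all registered task IDs and keep the ones from this extension.
for task_id in gym.registry:
    if "Deeprobotics" in task_id:
        print(task_id)
```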
## Setup as Omniverse Extension (Optional)
We provide an example UI extension that loads when your extension is enabled; it is defined in `source/rl_training/rl_training/ui_extension_example.py`.
To enable your extension, follow these steps:

1. Add the search path of your repository to the extension manager:
   - Navigate to the extension manager using `Window` -> `Extensions`.
   - Click on the Hamburger Icon (☰), then go to `Settings`.
   - In `Extension Search Paths`, enter the absolute path to `rl_training/source`.
   - If not already present, also enter the path to Isaac Lab's extension directory (`IsaacLab/source`).
   - Click on the Hamburger Icon (☰), then click `Refresh`.

2. Search for and enable your extension:
   - Find your extension under the `Third Party` category.
   - Toggle it to enable your extension.
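For orientation, a Kit UI extension follows the standard `omni.ext.IExt` pattern; the sketch below shows its general shape. It is illustrative only (the class name and window contents are made up), not the contents of the shipped `ui_extension_example.py`:

```python
# Illustrative skeleton of an Omniverse UI extension.
import omni.ext
import omni.ui as ui


class ExampleUIExtension(omni.ext.IExt):
    """Extension entry point; Omniverse calls these hooks automatically."""

    def on_startup(self, ext_id: str):
        # Build a small window when the extension is enabled.
        self._window = ui.Window("rl_training Example", width=300, height=100)
        with self._window.frame:
            ui.Label("rl_training extension is enabled")

    def on_shutdown(self):
        # Release UI resources when the extension is disabled.
        self._window = None
```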
Deeprobotics Lite3:

```bash
# Train
python scripts/reinforcement_learning/rsl_rl/train.py --task=Rough-Deeprobotics-Lite3-v0 --headless
# Play
python scripts/reinforcement_learning/rsl_rl/play.py --task=Rough-Deeprobotics-Lite3-v0 --num_envs=10
```

Deeprobotics M20:

```bash
# Train
python scripts/reinforcement_learning/rsl_rl/train.py --task=Rough-Deeprobotics-M20-v0 --headless
# Play
python scripts/reinforcement_learning/rsl_rl/play.py --task=Rough-Deeprobotics-M20-v0 --num_envs=10
```
> **Note**
> If you want to control a SINGLE ROBOT with the keyboard during playback, add `--keyboard` at the end of the play script.

Key bindings:

| Command | Key (+ve axis) | Key (-ve axis) |
| --- | --- | --- |
| Move along x-axis | Numpad 8 / Arrow Up | Numpad 2 / Arrow Down |
| Move along y-axis | Numpad 4 / Arrow Right | Numpad 6 / Arrow Left |
| Rotate along z-axis | Numpad 7 / Z | Numpad 9 / X |
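Conceptually, each key maps to an SE(2) base-velocity command (forward velocity, lateral velocity, yaw rate). The dict below restates the table in code form; the key names and unit magnitudes are illustrative, not the actual constants used by the keyboard device:

```python
# Illustrative key-to-command mapping for SE(2) teleoperation.
# Each value is a (vx, vy, wz) command delta; magnitudes are placeholders.
KEY_BINDINGS = {
    "NUMPAD_8": (1.0, 0.0, 0.0),   # move forward  (+x)
    "NUMPAD_2": (-1.0, 0.0, 0.0),  # move backward (-x)
    "NUMPAD_4": (0.0, 1.0, 0.0),   # move along +y
    "NUMPAD_6": (0.0, -1.0, 0.0),  # move along -y
    "NUMPAD_7": (0.0, 0.0, 1.0),   # rotate about +z
    "NUMPAD_9": (0.0, 0.0, -1.0),  # rotate about -z
}
```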
- You can change `Rough` to `Flat` in the above configs.
- To record a video of a trained agent (requires installing `ffmpeg`), add `--video --video_length 200`.
- To play/train with 32 environments, add `--num_envs 32`.
- To play a specific run folder or checkpoint, add `--load_run run_folder_name --checkpoint model.pt` (a quick way to inspect a checkpoint file is sketched after this list).
- To resume training from a run folder or checkpoint, add `--resume --load_run run_folder_name --checkpoint model.pt`.
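If you are unsure which file to pass to `--checkpoint`, you can peek inside it first. A minimal sketch, assuming the checkpoint is an ordinary PyTorch file as saved by rsl_rl; the log path below is a placeholder:

```python
# Inspect an rsl_rl checkpoint before passing it to --checkpoint.
# The path is a placeholder; point it at a file in your logs directory.
import torch

ckpt = torch.load("logs/rsl_rl/run_folder_name/model.pt",
                  map_location="cpu", weights_only=False)
print(type(ckpt))
if isinstance(ckpt, dict):
    # Typically includes entries such as the model state dict and iteration count.
    print(list(ckpt.keys()))
```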
To train with multiple GPUs, use the following command, where `--nproc_per_node` represents the number of available GPUs:

```bash
python -m torch.distributed.run --nnodes=1 --nproc_per_node=2 scripts/reinforcement_learning/rsl_rl/train.py --task=<ENV_NAME> --headless --distributed
```

For example, for Deeprobotics Lite3:

```bash
python -m torch.distributed.run --nnodes=1 --nproc_per_node=2 scripts/reinforcement_learning/rsl_rl/train.py --task=Rough-Deeprobotics-Lite3-v0 --headless --distributed --num_envs=2048
```

Note: each GPU runs the number of environments specified in the config. To keep the previous total number of environments, divide it by the number of GPUs (see the snippet below).
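As a worked example of that division (plain Python, nothing repo-specific):

```python
def envs_per_gpu(total_envs: int, num_gpus: int) -> int:
    """Per-GPU env count that keeps the same total across all GPUs."""
    assert total_envs % num_gpus == 0, "total env count should divide evenly"
    return total_envs // num_gpus

# E.g. a single-GPU run with 4096 envs becomes --num_envs=2048 on 2 GPUs.
print(envs_per_gpu(4096, 2))  # 2048
```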
To scale up training beyond multiple GPUs on a single machine, you can also train across multiple nodes. Doing so requires launching an individual process on each node.

For the master node, use the following command, where `--nproc_per_node` represents the number of available GPUs and `--nnodes` represents the number of nodes:

```bash
python -m torch.distributed.run --nproc_per_node=2 --nnodes=2 --node_rank=0 --rdzv_id=123 --rdzv_backend=c10d --rdzv_endpoint=localhost:5555 scripts/reinforcement_learning/rsl_rl/train.py --task=<ENV_NAME> --headless --distributed
```

Note that the port (`5555`) can be replaced with any other available port.

For non-master nodes, use the following command, replacing `--node_rank` with the index of each machine:

```bash
python -m torch.distributed.run --nproc_per_node=2 --nnodes=2 --node_rank=1 --rdzv_id=123 --rdzv_backend=c10d --rdzv_endpoint=ip_of_master_machine:5555 scripts/reinforcement_learning/rsl_rl/train.py --task=<ENV_NAME> --headless --distributed
```
To view TensorBoard logs, run:

```bash
tensorboard --logdir=logs
```
In some VS Code versions, indexing of part of the extensions is missing. In this case, add the path to your extension in `.vscode/settings.json` under the key `"python.analysis.extraPaths"`.

Note: replace `<path-to-isaac-lab>` with your own Isaac Lab path.
```json
{
    "python.languageServer": "Pylance",
    "python.analysis.extraPaths": [
        "${workspaceFolder}/source/rl_training",
        "/<path-to-isaac-lab>/source/isaaclab",
        "/<path-to-isaac-lab>/source/isaaclab_assets",
        "/<path-to-isaac-lab>/source/isaaclab_mimic",
        "/<path-to-isaac-lab>/source/isaaclab_rl",
        "/<path-to-isaac-lab>/source/isaaclab_tasks"
    ]
}
```
Temporary USD files are generated in `/tmp/IsaacLab/usd_{date}_{time}_{random}` during simulation runs. These files can consume significant disk space and can be cleaned up with:

```bash
rm -rf /tmp/IsaacLab/usd_*
```
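If you prefer doing this from Python, an equivalent sketch:

```python
# Remove Isaac Lab's temporary USD directories; equivalent to the rm one-liner above.
import glob
import shutil

for path in glob.glob("/tmp/IsaacLab/usd_*"):
    shutil.rmtree(path, ignore_errors=True)
```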
This project uses some code from the following open-source repositories: