RuntimeError: Could not load libtorchcodec during lerobot/scripts/train.py script #964

Closed
2 tasks
shrutichakraborty opened this issue Apr 9, 2025 · 7 comments
Labels
question Requests for clarification or additional information

Comments

@shrutichakraborty

shrutichakraborty commented Apr 9, 2025

System Info

- `lerobot` version: 0.1.0
- Platform: Linux-6.8.0-57-generic-x86_64-with-glibc2.35
- Python version: 3.10.13
- Huggingface_hub version: 0.29.3
- Dataset version: 3.4.1
- Numpy version: 1.26.4
- PyTorch version (GPU?): 2.5.1+cu124 (True)
- Cuda version: 12040


Additionally: 

ffmpeg version : 7.1.1
TorchCodec version : 0.2.1

Information

  • One of the scripts in the examples/ folder of LeRobot
  • My own task or dataset (give details below)

Reproduction

Install LeRobot following the main documentation:

conda create -n lerobot python=3.10 -y
conda activate lerobot
git clone https://github.com/huggingface/lerobot.git ~/lerobot
cd ~/lerobot
pip install --no-binary=av -e .
pip install torchvision==0.20.1
conda install -c conda-forge 'ffmpeg>=7.0' -y

After collecting a dataset, run the lerobot/scripts/train.py script.

Expected behavior

Hello all!

I am getting started with the lerobot so100 arm and have had a few issues.

The first was the same as issue #883, which occurred when running the control_robot.py script. I solved (or bypassed) it by following Remi Cadene's response: running `pip install torchvision==0.20.1` and `conda install -c conda-forge 'ffmpeg>=7.0' -y` after running `pip install --no-binary=av -e .` This allowed me to run the control_robot.py script successfully. However, when I then collected a dataset and started a training run with the lerobot/scripts/train.py script, I encountered the following error:

from torchcodec.decoders._core.video_decoder_ops import (
  File "/home/moonshot/miniconda3/envs/lerobot/lib/python3.10/site-packages/torchcodec/decoders/_core/video_decoder_ops.py", line 59, in <module>
    load_torchcodec_extension()
  File "/home/moonshot/miniconda3/envs/lerobot/lib/python3.10/site-packages/torchcodec/decoders/_core/video_decoder_ops.py", line 44, in load_torchcodec_extension
    raise RuntimeError(
RuntimeError: Could not load libtorchcodec. Likely causes:
          1. FFmpeg is not properly installed in your environment. We support
             versions 4, 5, 6 and 7.
          2. The PyTorch version (2.5.1+cu124) is not compatible with
             this version of TorchCodec. Refer to the version compatibility
             table:
             https://github.com/pytorch/torchcodec?tab=readme-ov-file#installing-torchcodec.
          3. Another runtime dependency; see exceptions below.
        The following exceptions were raised as we tried to load libtorchcodec:
        
[start of libtorchcodec loading traceback]
/home/moonshot/miniconda3/envs/lerobot/lib/python3.10/site-packages/torchcodec/libtorchcodec7.so: undefined symbol: _ZNK3c1011StorageImpl27throw_data_ptr_access_errorEv
libavutil.so.58: cannot open shared object file: No such file or directory
libavutil.so.57: cannot open shared object file: No such file or directory
/home/moonshot/miniconda3/envs/lerobot/lib/python3.10/site-packages/torchcodec/libtorchcodec4.so: undefined symbol: _ZNK3c1011StorageImpl27throw_data_ptr_access_errorEv
[end of libtorchcodec loading traceback].

It seems that my torchcodec and ffmpeg versions are not compatible. Checking their versions gives me:

ffmpeg version 7.1.1 Copyright (c) 2000-2025 the FFmpeg developers
built with gcc 13.3.0 (conda-forge gcc 13.3.0-2)
configuration: --prefix=/home/moonshot/miniconda3/envs/lerobot --cc=/home/conda/feedstock_root/build_artifacts/ffmpeg_1741820412024/_build_env/bin/x86_64-conda-linux-gnu-cc --cxx=/home/conda/feedstock_root/build_artifacts/ffmpeg_1741820412024/_build_env/bin/x86_64-conda-linux-gnu-c++ --nm=/home/conda/feedstock_root/build_artifacts/ffmpeg_1741820412024/_build_env/bin/x86_64-conda-linux-gnu-nm --ar=/home/conda/feedstock_root/build_artifacts/ffmpeg_1741820412024/_build_env/bin/x86_64-conda-linux-gnu-ar --disable-doc --enable-openssl --enable-demuxer=dash --enable-hardcoded-tables --enable-libfreetype --enable-libharfbuzz --enable-libfontconfig --enable-libopenh264 --enable-libdav1d --disable-gnutls --enable-libmp3lame --enable-libvpx --enable-libass --enable-pthreads --enable-alsa --enable-libpulse --enable-vaapi --enable-libopenvino --enable-gpl --enable-libx264 --enable-libx265 --enable-libaom --enable-libsvtav1 --enable-libxml2 --enable-pic --enable-shared --disable-static --enable-version3 --enable-zlib --enable-libvorbis --enable-libopus --enable-librsvg --enable-ffplay --pkg-config=/home/conda/feedstock_root/build_artifacts/ffmpeg_1741820412024/_build_env/bin/pkg-config
libavutil      59. 39.100 / 59. 39.100
libavcodec     61. 19.101 / 61. 19.101
libavformat    61.  7.100 / 61.  7.100
libavdevice    61.  3.100 / 61.  3.100
libavfilter    10.  4.100 / 10.  4.100
libswscale      8.  3.100 /  8.  3.100
libswresample   5.  3.100 /  5.  3.100
libpostproc    58.  3.100 / 58.  3.100

And TorchCodec version 0.2.1.

Could anyone suggest the right versions to install, and whether I should downgrade ffmpeg?
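
The loading traceback above mixes two different failure modes, and separating them can help decide what to change. The following is a minimal sketch, not part of TorchCodec: the function name `classify_loader_line` and its category labels are hypothetical, and the `libavutil`-to-FFmpeg mapping is an assumption based on the `libavutil 59` reported by ffmpeg 7.1.1 above and on FFmpeg bumping the `libavutil` major version once per major release. It treats an undefined symbol from the `c10` namespace as a hint of a PyTorch/TorchCodec mismatch, since `c10` symbols come from PyTorch rather than from FFmpeg.

```python
import re

# Assumed mapping: libavutil major version -> FFmpeg major release.
# (ffmpeg 7.1.1 above reports libavutil 59.)
LIBAVUTIL_TO_FFMPEG = {56: 4, 57: 5, 58: 6, 59: 7}


def classify_loader_line(line: str) -> str:
    """Return a hypothetical category label for one traceback line."""
    # Case 1: the FFmpeg shared library for a given major version is absent.
    missing = re.search(
        r"libavutil\.so\.(\d+): cannot open shared object file", line
    )
    if missing:
        ffmpeg_major = LIBAVUTIL_TO_FFMPEG.get(int(missing.group(1)))
        if ffmpeg_major is None:
            return "missing-ffmpeg-unknown"
        return f"missing-ffmpeg-{ffmpeg_major}"

    # Case 2: the library loaded far enough to need a symbol that the
    # installed PyTorch does not export ("3c10" is the mangled c10 namespace).
    symbol = re.search(r"undefined symbol: (\S+)", line)
    if symbol and "3c10" in symbol.group(1):
        return "pytorch-symbol-mismatch"

    return "other"


# Lines taken from the traceback above (install path shortened).
traceback_lines = [
    "libtorchcodec7.so: undefined symbol: "
    "_ZNK3c1011StorageImpl27throw_data_ptr_access_errorEv",
    "libavutil.so.58: cannot open shared object file: No such file or directory",
    "libavutil.so.57: cannot open shared object file: No such file or directory",
]

for line in traceback_lines:
    print(classify_loader_line(line))
```

Read this way, the two `libavutil.so` lines would only say that FFmpeg 6 and FFmpeg 5 are absent, which is expected when only FFmpeg 7 is installed, while the `libtorchcodec7.so` line points at the PyTorch side of the pairing.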

@imstevenpmwork
Collaborator

Try with the latest main and the new installation instructions (:

@imstevenpmwork added the "question" label (Requests for clarification or additional information) on Apr 11, 2025
@gpthimble

gpthimble commented Apr 13, 2025

I have the exact same issue, but I think it is due to a problem with torchcodec itself.
To debug it, I started from a completely new environment and installed torch, then ffmpeg, and finally torchcodec, without installing lerobot.
When I ran "import torchcodec" in the Python interpreter, the same error still occurred. I have tried different ffmpeg versions, with no fix yet.

@CarolinePascal
Contributor

Hi @gpthimble !

Can you confirm that you installed ffmpeg using conda during your second attempt? This is actually a requirement for torchcodec to work correctly.
If so, could you provide us with the output of the following commands (which have to be run in your LeRobot conda environment):

which ffmpeg
ffmpeg -version
ffmpeg -encoders | grep libsvtav1

Thanks,

Caroline.
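
The first of these commands checks which `ffmpeg` binary is picked up. A rough Python equivalent is sketched below, under a few assumptions: the helper name `ffmpeg_from_env` is hypothetical, the comparison is purely lexical (symlinks are not followed), and the active environment is read from the `CONDA_PREFIX` variable that `conda activate` sets.

```python
import os
import shutil
from pathlib import Path
from typing import Optional


def ffmpeg_from_env(ffmpeg_path: Optional[str], env_prefix: Optional[str]) -> bool:
    """Return True if ffmpeg_path lies under env_prefix (lexical check only)."""
    if not ffmpeg_path or not env_prefix:
        return False
    try:
        # relative_to raises ValueError when the path is outside the prefix.
        Path(ffmpeg_path).relative_to(Path(env_prefix))
    except ValueError:
        return False
    return True


# Same lookup as `which ffmpeg`, then compare against the active conda env.
found = shutil.which("ffmpeg")
prefix = os.environ.get("CONDA_PREFIX")
print(f"ffmpeg found at: {found}")
print(f"inside active conda env: {ffmpeg_from_env(found, prefix)}")
```

If the second line prints `False`, the `ffmpeg` on `PATH` is probably a system-wide install rather than the conda-forge one the comment above asks for.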

@gpthimble
I tried the new instructions in the README: I built a new environment and installed ffmpeg 7.1.1.
Everything works now! Thank you!

@gpthimble
I think you should try a different version of TorchCodec: version 0.2.1 is for torch 2.6, not 2.5.
Check this link: https://github.com/pytorch/torchcodec?tab=readme-ov-file#installing-torchcodec
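
The pairing described in this comment can be checked before importing anything. The sketch below is only illustrative: the helper names are hypothetical, and the small table is an assumption that holds the pair stated above (0.2 with torch 2.6) plus one more pair (0.1 with torch 2.5) recalled from the TorchCodec README, so it should be verified against the linked compatibility table before relying on it.

```python
from typing import Optional

# ASSUMPTION: partial copy of the TorchCodec compatibility table.
# Verify against the README linked above; it changes with each release.
TORCHCODEC_TO_TORCH = {"0.2": "2.6", "0.1": "2.5"}


def major_minor(version: str) -> str:
    """Reduce a version such as '2.5.1+cu124' to 'major.minor' ('2.5')."""
    public = version.split("+", 1)[0]  # drop local build tag like '+cu124'
    return ".".join(public.split(".")[:2])


def is_compatible(torchcodec_version: str, torch_version: str) -> Optional[bool]:
    """Return True/False for known pairs, None if torchcodec is not in the table."""
    expected = TORCHCODEC_TO_TORCH.get(major_minor(torchcodec_version))
    if expected is None:
        return None
    return expected == major_minor(torch_version)


# Versions reported in the original post.
print(is_compatible("0.2.1", "2.5.1+cu124"))
```

For the versions in the original post this prints `False`, which matches the diagnosis above. In a real environment the two version strings could be read with `importlib.metadata.version("torchcodec")` and `importlib.metadata.version("torch")` instead of being hard-coded.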

@CarolinePascal
Contributor

Thanks for pointing that out !

We're going to solve all ffmpeg and torchcodec related versioning issues very soon, and I think we'll update our pyproject.toml by then ;)

I'm closing the issue as it is solved now.

Best,

Caroline.

@gpthimble

gpthimble commented Apr 15, 2025

I have also tried the lerobot project on an AMD GPU with ROCm, with no luck. I think torch itself should work without problems, but it seems that the torchcodec package is not compatible with the ROCm version of torch. I will open an issue on the torchcodec repository about that. I really hope this can work on an AMD GPU!


4 participants