
RuntimeError: CUDA error: an illegal memory access was encountered #77

Open
BL-CX opened this issue Oct 22, 2024 · 3 comments
Labels: bug (Something isn't working)

BL-CX commented Oct 22, 2024

Importing module 'gym_38' (/home/blcx/Downloads/isaacgym/python/isaacgym/_bindings/linux-x86_64/gym_38.so)
Setting GYM_USD_PLUG_INFO_PATH to /home/blcx/Downloads/isaacgym/python/isaacgym/_bindings/linux-x86_64/usd/plugInfo.json
PyTorch version 1.10.0+cu113
Device count 1
/home/blcx/Downloads/isaacgym/python/isaacgym/_bindings/src/gymtorch
Using /home/blcx/.cache/torch_extensions/py38_cu113 as PyTorch extensions root...
Emitting ninja build file /home/blcx/.cache/torch_extensions/py38_cu113/gymtorch/build.ninja...
Building extension module gymtorch...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
ninja: no work to do.
Loading extension module gymtorch...
Setting seed: 1
Not connected to PVD
+++ Using GPU PhysX
Physics Engine: PhysX
Physics Device: cuda:0
GPU Pipeline: enabled
/home/blcx/anaconda3/envs/robotics_env/lib/python3.8/site-packages/torch/functional.py:445: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:2157.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
'train_cfg' provided -> Ignoring 'name=anymal_c_flat'
Actor MLP: Sequential(
(0): Linear(in_features=48, out_features=128, bias=True)
(1): ELU(alpha=1.0)
(2): Linear(in_features=128, out_features=64, bias=True)
(3): ELU(alpha=1.0)
(4): Linear(in_features=64, out_features=32, bias=True)
(5): ELU(alpha=1.0)
(6): Linear(in_features=32, out_features=12, bias=True)
)
Critic MLP: Sequential(
(0): Linear(in_features=48, out_features=128, bias=True)
(1): ELU(alpha=1.0)
(2): Linear(in_features=128, out_features=64, bias=True)
(3): ELU(alpha=1.0)
(4): Linear(in_features=64, out_features=32, bias=True)
(5): ELU(alpha=1.0)
(6): Linear(in_features=32, out_features=1, bias=True)
)
/home/blcx/anaconda3/envs/robotics_env/lib/python3.8/site-packages/torch/nn/modules/module.py:1102: UserWarning: RNN module weights are not part of single contiguous chunk of memory. This means they need to be compacted at every call, possibly greatly increasing memory usage. To compact weights again call flatten_parameters(). (Triggered internally at ../aten/src/ATen/native/cudnn/RNN.cpp:925.)
return forward_call(*input, **kwargs)
PxgCudaDeviceMemoryAllocator fail to allocate memory 339738624 bytes!! Result = 2
/buildAgent/work/99bede84aa0a52c2/source/gpunarrowphase/src/PxgNarrowphaseCore.cpp (11310) : internal error : GPU compressContactStage1 fail to launch kernel stage 1!!

/buildAgent/work/99bede84aa0a52c2/source/gpunarrowphase/src/PxgNarrowphaseCore.cpp (11347) : internal error : GPU compressContactStage2 fail to launch kernel stage 1!!

[Error] [carb.gym.plugin] Gym cuda error: an illegal memory access was encountered: ../../../source/plugins/carb/gym/impl/Gym/GymPhysX.cpp: 4202
[Error] [carb.gym.plugin] Gym cuda error: an illegal memory access was encountered: ../../../source/plugins/carb/gym/impl/Gym/GymPhysX.cpp: 4210
[Error] [carb.gym.plugin] Gym cuda error: an illegal memory access was encountered: ../../../source/plugins/carb/gym/impl/Gym/GymPhysX.cpp: 3480
[Error] [carb.gym.plugin] Gym cuda error: an illegal memory access was encountered: ../../../source/plugins/carb/gym/impl/Gym/GymPhysX.cpp: 3535
[Error] [carb.gym.plugin] Gym cuda error: an illegal memory access was encountered: ../../../source/plugins/carb/gym/impl/Gym/GymPhysX.cpp: 6137
[Error] [carb.gym.plugin] Gym cuda error: an illegal memory access was encountered: ../../../source/plugins/carb/gym/impl/Gym/GymPhysXCuda.cu: 991
Traceback (most recent call last):
File "legged_gym/scripts/play.py", line 121, in
play(args)
File "legged_gym/scripts/play.py", line 58, in play
ppo_runner, train_cfg = task_registry.make_alg_runner(env=env, name=args.task, args=args, train_cfg=train_cfg)
File "/home/blcx/Downloads/legged_gym/legged_gym/utils/task_registry.py", line 147, in make_alg_runner
runner = OnPolicyRunner(env, train_cfg_dict, log_dir, device=args.rl_device)
File "/home/blcx/Downloads/rsl_rl-v1.0.2/rsl_rl/runners/on_policy_runner.py", line 81, in init
_, _ = self.env.reset()
File "/home/blcx/Downloads/legged_gym/legged_gym/envs/base/base_task.py", line 114, in reset
obs, privileged_obs, _, _, _ = self.step(torch.zeros(self.num_envs, self.num_actions, device=self.device, requires_grad=False))
File "/home/blcx/Downloads/legged_gym/legged_gym/envs/base/legged_robot.py", line 90, in step
self.torques = self._compute_torques(self.actions).view(self.torques.shape)
File "/home/blcx/Downloads/legged_gym/legged_gym/envs/anymal_c/anymal.py", line 75, in _compute_torques
self.sea_input[:, 0, 0] = (actions * self.cfg.control.action_scale + self.default_dof_pos - self.dof_pos).flatten()
RuntimeError: CUDA error: an illegal memory access was encountered

BL-CX added the "bug" label Oct 22, 2024
@Black0Moonlight

I got this problem before. My solution was to reduce the number of envs; I don't know the real reason, but using 4096 instead of 8192 makes it work without the CUDA error.
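
For reference, the environment count is normally set in the task's config class. A minimal sketch, assuming the standard legged_gym layout (the file path and class names below are assumptions for the anymal_c_flat task):

# e.g. legged_gym/envs/anymal_c/flat/anymal_c_flat_config.py (path assumed)
from legged_gym.envs.base.legged_robot_config import LeggedRobotCfg

class AnymalCFlatCfg(LeggedRobotCfg):
    class env(LeggedRobotCfg.env):
        # Fewer parallel envs means smaller PhysX GPU contact buffers.
        num_envs = 4096  # was 8192; lower further if allocation still fails

If your legged_gym version exposes a --num_envs command-line argument, the same value can usually be overridden without editing the config, e.g. python legged_gym/scripts/play.py --task=anymal_c_flat --num_envs=4096.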


BL-CX commented Oct 30, 2024

OK! Thank you very much! I also solved this problem by replacing the graphics card with a better one.

@SURE3187774683

Can you tell me where to change the Env_num or batch_size, please?
