Description
When I load the model, I get the following error:
Traceback (most recent call last):
  File "", line 1, in
  File "test/env/lib/python3.9/site-packages/galai/__init__.py", line 39, in load_model
    model._load_checkpoint(checkpoint_path=get_checkpoint_path(name))
  File "test/env/lib/python3.9/site-packages/galai/model.py", line 63, in _load_checkpoint
    load_checkpoint_and_dispatch(
  File "test/env/lib/python3.9/site-packages/accelerate/big_modeling.py", line 366, in load_checkpoint_and_dispatch
    load_checkpoint_in_model(
  File "test/env/lib/python3.9/site-packages/accelerate/utils/modeling.py", line 701, in load_checkpoint_in_model
    set_module_tensor_to_device(model, param_name, param_device, value=param)
  File "test/env/lib/python3.9/site-packages/accelerate/utils/modeling.py", line 124, in set_module_tensor_to_device
    new_value = value.to(device)
RuntimeError: CUDA error: invalid device ordinal
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
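
"invalid device ordinal" generally means that a tensor was sent to a CUDA device index that does not exist on the machine (for example `cuda:1` when only one GPU is visible). The snippet below is a minimal diagnostic sketch, not part of galai or accelerate: the helper names `visible_gpu_count` and `invalid_ordinals` are hypothetical, and the value 8 is only an example of a requested GPU count. It assumes PyTorch is installed and compares the number of GPUs that PyTorch can see with the number a loader might try to use.

```python
import os


def visible_gpu_count():
    """Return how many CUDA devices PyTorch can see (0 if torch or CUDA is missing)."""
    try:
        import torch
    except ImportError:
        return 0
    return torch.cuda.device_count() if torch.cuda.is_available() else 0


def invalid_ordinals(requested, available):
    """Hypothetical helper: device indices in range(requested) that do not exist."""
    return [i for i in range(requested) if i >= available]


if __name__ == "__main__":
    available = visible_gpu_count()
    # CUDA_VISIBLE_DEVICES, if set, restricts which GPUs the process can see.
    print("CUDA_VISIBLE_DEVICES =", os.environ.get("CUDA_VISIBLE_DEVICES"))
    print("GPUs visible to PyTorch:", available)

    requested = 8  # example value only; use the GPU count your loader is asked for
    bad = invalid_ordinals(requested, available)
    if bad:
        print("These device ordinals would be invalid:", bad)
    else:
        print("All requested device ordinals exist.")
```

If the list of invalid ordinals is not empty, the checkpoint is probably being dispatched across more GPUs than the machine has, and it may help to request no more GPUs than are visible when loading the model.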