
fix: Explicitly load checkpoint to CPU to avoid CUDA error #219

Merged · 1 commit merged into main on Mar 3, 2025

Conversation

gau-nernst (Contributor) commented:

The checkpoint was serialized directly from CUDA tensors, so by default torch.load() will attempt to reload the weights onto CUDA even when CUDA is not available, e.g. on CPU-only machines.

Without this fix, I get this error on my MacBook:

RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.

This PR fixes it.
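A minimal sketch of the change, following the remedy named in the error message itself (the checkpoint path here is a placeholder, not the actual file the PR touches):

```python
import torch

# Before (fails on CPU-only machines): the checkpoint stores CUDA tensors,
# so torch.load tries to restore them onto a CUDA device by default.
#   state_dict = torch.load("checkpoint.pt")

# After: explicitly map every stored tensor to CPU at load time.
# map_location=torch.device("cpu") works equally well.
state_dict = torch.load("checkpoint.pt", map_location="cpu")
```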

@gau-nernst requested a review from tuanlda78202 on March 3, 2025, 03:54

@tuanlda78202 (Contributor) left a comment:


lgtm!

@gau-nernst merged commit 02e2f79 into main on Mar 3, 2025
@gau-nernst deleted the fix/load_model_on_cpu branch on March 3, 2025, 04:11