When trying to deploy the model with vLLM, I get the following error: `OSError: Consistency check failed: file should be of size 4966157056 but has size 4966842368 (model-00014-of-00030.safetensors)`.
I have tried setting `FORCE_DOWNLOAD` and `HF_HUB_ENABLE_HF_TRANSFER` to true, and I have enough space for the model (it only uses about 25% of the available space).
Hi @ppatel-eng , sorry for the inconvenience. This is usually caused by a network issue during the download. We don't use any `FORCE_DOWNLOAD` environment variable on our side, and it seems that vLLM doesn't expose the `force_download` argument of `snapshot_download` either.
Can you try downloading the failing file with:
```
pip install "huggingface-hub[cli]"
huggingface-cli download meta-llama/Llama-3.3-70B-Instruct \
    "model-00014-of-00030.safetensors" \
    --local-dir /data \
    --force-download
```
You can also retry on a different network. Let us know if you get the same error again on the same file. Thank you!
Same issue here. The `huggingface-cli` command works, but how can I deploy this model with vLLM (safetensors format)? Thank you!
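One common workaround (a sketch, not an answer confirmed in this thread; the path assumes the shards were downloaded with `--local-dir /data` as in the command above) is to pass the local directory to vLLM instead of the Hub repo id, so it serves the already-downloaded safetensors without re-downloading:

```shell
# Serve directly from the local directory that huggingface-cli populated.
# /data must contain config.json, the tokenizer files, and every
# model-*.safetensors shard, or vLLM will fail to load the model.
vllm serve /data --tensor-parallel-size 4
```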
Reproduction
`vllm serve meta-llama/Llama-3.3-70B-Instruct`

```yaml
args:
  - --download-dir
  - /data
  - --max-model-len
  - "65536"
  - --max-logprobs
  - "5"
  - --trust-remote-code
  - --disable-log-requests
  - --use-v2-block-manager
  - --enforce-eager
  - --tensor-parallel-size
  - "4"
```
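Before restarting vLLM, you can check whether the re-downloaded shard now has the size the consistency check expects. A minimal sketch (the helper name is ours, not part of `huggingface_hub`; the path and expected size come from the error message above):

```python
import os

def shard_size_ok(path: str, expected_size: int) -> bool:
    """Return True if the shard's on-disk size matches the expected size.

    A size mismatch is exactly what triggers the "Consistency check
    failed" OSError: the local file is truncated or corrupted and
    should be re-downloaded with --force-download.
    """
    return os.path.getsize(path) == expected_size

# Values taken from the error message in this issue; the /data layout
# assumes --local-dir /data as used above:
# shard_size_ok("/data/model-00014-of-00030.safetensors", 4966157056)
```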