
GGML to GGUF FAIL Quantized tensor bytes per row (5120) is not a multiple of Q2_K type size (84) #11976

Open
chokoon123 opened this issue Feb 20, 2025 · 0 comments

chokoon123 commented Feb 20, 2025

I'm trying to convert this GGML model to GGUF, but I got the error below. Thank you.

python convert_llama_ggml_to_gguf.py --input "D:\nectec\model\llama-2-13b-chat.ggmlv3.q2_K.bin" --output "D:\nectec\model\llama-2-13b-chat.gguf"
INFO:ggml-to-gguf:* Using config: Namespace(input=WindowsPath('D:/nectec/model/llama-2-13b-chat.ggmlv3.q2_K.bin'), output=WindowsPath('D:/nectec/model/llama-2-13b-chat.gguf'), name=None, desc=None, gqa=8, eps='0', context_length=2048, model_metadata_dir=None, vocab_dir=None, vocabtype='spm,hfft', verbose=False)
WARNING:ggml-to-gguf:=== WARNING === Be aware that this conversion script is best-effort. Use a native GGUF model if possible. === WARNING ===
INFO:ggml-to-gguf:* Scanning GGML input file
INFO:ggml-to-gguf:* File format: GGJTv3 with ftype MOSTLY_Q2_K
INFO:ggml-to-gguf:* GGML model hyperparameters: <Hyperparameters: n_vocab=32000, n_embd=5120, n_mult=256, n_head=40, n_layer=40, n_rot=128, n_ff=13824, ftype=MOSTLY_Q2_K>
WARNING:ggml-to-gguf:
=== WARNING === Special tokens may not be converted correctly. Use --model-metadata-dir if possible === WARNING ===

INFO:ggml-to-gguf:- Guessed n_kv_head = 5 based on GQA 8
INFO:ggml-to-gguf:* Preparing to save GGUF file
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:ggml-to-gguf:* Adding model parameters and KV items
INFO:ggml-to-gguf:* Adding 32000 vocab item(s)
INFO:ggml-to-gguf:* Adding 363 tensor(s)
Traceback (most recent call last):
  File "D:\nectec\model\New folder\llama.cpp\convert_llama_ggml_to_gguf.py", line 450, in <module>
    main()
  File "D:\nectec\model\New folder\llama.cpp\convert_llama_ggml_to_gguf.py", line 445, in main
    converter.save()
  File "D:\nectec\model\New folder\llama.cpp\convert_llama_ggml_to_gguf.py", line 238, in save
    self.add_tensors(gguf_writer)
  File "D:\nectec\model\New folder\llama.cpp\convert_llama_ggml_to_gguf.py", line 353, in add_tensors
    gguf_writer.add_tensor(
  File "D:\nectec\model\New folder\llama.cpp\gguf-py\gguf\gguf_writer.py", line 381, in add_tensor
    self.add_tensor_info(name, shape, tensor.dtype, tensor.nbytes, raw_dtype=raw_dtype)
  File "D:\nectec\model\New folder\llama.cpp\gguf-py\gguf\gguf_writer.py", line 354, in add_tensor_info
    tensor_shape = quant_shape_from_byte_shape(tensor_shape, raw_dtype)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\nectec\model\New folder\llama.cpp\gguf-py\gguf\quants.py", line 24, in quant_shape_from_byte_shape
    raise ValueError(f"Quantized tensor bytes per row ({shape[-1]}) is not a multiple of {quant_type.name} type size ({type_size})")
ValueError: Quantized tensor bytes per row (5120) is not a multiple of Q2_K type size (84)
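For context on what the check is enforcing: GGUF stores Q2_K tensors as blocks of 256 elements, each block occupying 84 bytes (16 bytes of packed sub-block scales/mins, 64 bytes of 2-bit quants, and two fp16 super-block scales), so quant_shape_from_byte_shape() can only recover an element shape when the row's byte count divides evenly into 84-byte blocks. Below is a minimal sketch of that arithmetic using the numbers from the traceback; the constants are restated here for illustration, not imported from gguf-py.

# Sketch of the divisibility check that raises above, assuming GGUF's
# Q2_K layout: one block covers 256 elements in 84 bytes.
Q2_K_TYPE_SIZE = 84    # bytes per block: 16 (scales/mins) + 64 (2-bit quants) + 2 + 2 (fp16 d, dmin)
Q2_K_BLOCK_SIZE = 256  # elements per block

bytes_per_row = 5120   # value reported in the traceback

blocks, remainder = divmod(bytes_per_row, Q2_K_TYPE_SIZE)
print(blocks, remainder)  # 60 80 -> not a whole number of blocks, hence the ValueError

# A well-formed Q2_K row of n_embd = 5120 elements would occupy
# (5120 // 256) * 84 = 1680 bytes, not 5120.

Note that the reported byte count happens to equal n_embd (5120), as if the row were one byte per element; whether that points at the GGMLv3 file's layout or at how the converter computes tensor sizes isn't clear from the log alone.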

chokoon123 changed the title from "GGML to GGUF Quantized tensor bytes per row (5120) is not a multiple of Q2_K type size (84)" to "GGML to GGUF FAIL Quantized tensor bytes per row (5120) is not a multiple of Q2_K type size (84)" on Feb 21, 2025