
GGML to GGUF FAIL Quantized tensor bytes per row (5120) is not a multiple of Q2_K type size (84) #11976

Open
chokoon123 opened this issue Feb 20, 2025 · 0 comments

chokoon123 commented Feb 20, 2025

I'm trying to convert this GGML model to GGUF, but I got the error below. Thank you.

python convert_llama_ggml_to_gguf.py --input "D:\nectec\model\llama-2-13b-chat.ggmlv3.q2_K.bin" --output "D:\nectec\model\llama-2-13b-chat.gguf"
INFO:ggml-to-gguf:* Using config: Namespace(input=WindowsPath('D:/nectec/model/llama-2-13b-chat.ggmlv3.q2_K.bin'), output=WindowsPath('D:/nectec/model/llama-2-13b-chat.gguf'), name=None, desc=None, gqa=8, eps='0', context_length=2048, model_metadata_dir=None, vocab_dir=None, vocabtype='spm,hfft', verbose=False)
WARNING:ggml-to-gguf:=== WARNING === Be aware that this conversion script is best-effort. Use a native GGUF model if possible. === WARNING ===
INFO:ggml-to-gguf:* Scanning GGML input file
INFO:ggml-to-gguf:* File format: GGJTv3 with ftype MOSTLY_Q2_K
INFO:ggml-to-gguf:* GGML model hyperparameters: <Hyperparameters: n_vocab=32000, n_embd=5120, n_mult=256, n_head=40, n_layer=40, n_rot=128, n_ff=13824, ftype=MOSTLY_Q2_K>
WARNING:ggml-to-gguf:
=== WARNING === Special tokens may not be converted correctly. Use --model-metadata-dir if possible === WARNING ===

INFO:ggml-to-gguf:- Guessed n_kv_head = 5 based on GQA 8
INFO:ggml-to-gguf:* Preparing to save GGUF file
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:ggml-to-gguf:* Adding model parameters and KV items
INFO:ggml-to-gguf:* Adding 32000 vocab item(s)
INFO:ggml-to-gguf:* Adding 363 tensor(s)
Traceback (most recent call last):
  File "D:\nectec\model\New folder\llama.cpp\convert_llama_ggml_to_gguf.py", line 450, in <module>
    main()
  File "D:\nectec\model\New folder\llama.cpp\convert_llama_ggml_to_gguf.py", line 445, in main
    converter.save()
  File "D:\nectec\model\New folder\llama.cpp\convert_llama_ggml_to_gguf.py", line 238, in save
    self.add_tensors(gguf_writer)
  File "D:\nectec\model\New folder\llama.cpp\convert_llama_ggml_to_gguf.py", line 353, in add_tensors
    gguf_writer.add_tensor(
  File "D:\nectec\model\New folder\llama.cpp\gguf-py\gguf\gguf_writer.py", line 381, in add_tensor
    self.add_tensor_info(name, shape, tensor.dtype, tensor.nbytes, raw_dtype=raw_dtype)
  File "D:\nectec\model\New folder\llama.cpp\gguf-py\gguf\gguf_writer.py", line 354, in add_tensor_info
    tensor_shape = quant_shape_from_byte_shape(tensor_shape, raw_dtype)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\nectec\model\New folder\llama.cpp\gguf-py\gguf\quants.py", line 24, in quant_shape_from_byte_shape
    raise ValueError(f"Quantized tensor bytes per row ({shape[-1]}) is not a multiple of {quant_type.name} type size ({type_size})")
ValueError: Quantized tensor bytes per row (5120) is not a multiple of Q2_K type size (84)
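For context on what the check is enforcing: GGUF stores Q2_K tensors as blocks of 256 elements, each block occupying 84 bytes (16 bytes of packed sub-block scales/mins, 64 bytes of 2-bit quants, and two fp16 super-block scales), so quant_shape_from_byte_shape() can only recover an element shape when the row's byte count divides evenly into 84-byte blocks. Below is a minimal sketch of that arithmetic using the numbers from the traceback; the constants are restated here for illustration, not imported from gguf-py.

# Sketch of the divisibility check that raises above, assuming GGUF's
# Q2_K layout: one block covers 256 elements in 84 bytes.
Q2_K_TYPE_SIZE = 84    # bytes per block: 16 (scales/mins) + 64 (2-bit quants) + 2 + 2 (fp16 d, dmin)
Q2_K_BLOCK_SIZE = 256  # elements per block

bytes_per_row = 5120   # value reported in the traceback

blocks, remainder = divmod(bytes_per_row, Q2_K_TYPE_SIZE)
print(blocks, remainder)  # 60 80 -> not a whole number of blocks, hence the ValueError

# A well-formed Q2_K row of n_embd = 5120 elements would occupy
# (5120 // 256) * 84 = 1680 bytes, not 5120.

Note that the reported byte count happens to equal n_embd (5120), as if the row were one byte per element; whether that points at the GGMLv3 file's layout or at how the converter computes tensor sizes isn't clear from the log alone.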

chokoon123 changed the title from "GGML to GGUF Quantized tensor bytes per row (5120) is not a multiple of Q2_K type size (84)" to "GGML to GGUF FAIL Quantized tensor bytes per row (5120) is not a multiple of Q2_K type size (84)" on Feb 21, 2025