Labels
Bug (Something isn't working) · Medium Priority (will be worked on after all high priority issues) · Medium Risk (risk of bugs in transformers and other libraries) · Model Support (related to a specific modeling situation)
Description
System Info
Windows 11 Home
x64
python 3.11.9
env (pip freeze):

```
accelerate==1.3.0
attrdict==2.0.1
bitsandbytes==0.45.1
certifi==2024.12.14
charset-normalizer==3.4.1
colorama==0.4.6
einops==0.8.0
filelock==3.13.1
fsspec==2024.6.1
huggingface-hub==0.28.0
idna==3.10
-e git+https://github.com/deepseek-ai/Janus.git@a74a59f8a9084b78c7760c955037503b9b55b862#egg=janus
Jinja2==3.1.4
MarkupSafe==2.1.5
mpmath==1.3.0
networkx==3.3
numpy==2.1.2
packaging==24.2
pillow==11.0.0
psutil==6.1.1
PyYAML==6.0.2
regex==2024.11.6
requests==2.32.3
safetensors==0.5.2
sentencepiece==0.2.0
six==1.17.0
sympy==1.13.1
timm==1.0.14
tokenizers==0.21.0
torch==2.6.0+cu124
torchaudio==2.6.0+cu124
torchvision==0.21.0+cu124
tqdm==4.67.1
transformers==4.48.1
typing_extensions==4.12.2
urllib3==2.3.0
```
Reproduction
Clone the Janus repo: https://github.com/deepseek-ai/Janus. The rest of the examples are run from the root of that repo.
Download and load the quantized model:
```python
from transformers import AutoModelForCausalLM
from janus.models import MultiModalityCausalLM

model_path = "neilmehta24/janus-pro-7b-4bit"
vl_gpt: MultiModalityCausalLM = AutoModelForCausalLM.from_pretrained(
    model_path, trust_remote_code=True
)
```
Error:
```
  File "C:\Users\windo\neil\Janus\load_janus_pro.py", line 21, in <module>
    vl_gpt: MultiModalityCausalLM = AutoModelForCausalLM.from_pretrained(
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\windo\neil\Janus\.venv\Lib\site-packages\transformers\models\auto\auto_factory.py", line 564, in from_pretrained
    return model_class.from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\windo\neil\Janus\.venv\Lib\site-packages\transformers\modeling_utils.py", line 4224, in from_pretrained
    ) = cls._load_pretrained_model(
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\windo\neil\Janus\.venv\Lib\site-packages\transformers\modeling_utils.py", line 4794, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
                                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\windo\neil\Janus\.venv\Lib\site-packages\transformers\modeling_utils.py", line 875, in _load_state_dict_into_meta_model
    hf_quantizer.create_quantized_param(model, param, param_name, param_device, state_dict, unexpected_keys)
  File "C:\Users\windo\neil\Janus\.venv\Lib\site-packages\transformers\quantizers\quantizer_bnb_4bit.py", line 226, in create_quantized_param
    new_value = bnb.nn.Params4bit.from_prequantized(
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\windo\neil\Janus\.venv\Lib\site-packages\bitsandbytes\nn\modules.py", line 280, in from_prequantized
    self.quant_state = QuantState.from_dict(qs_dict=quantized_stats, device=device)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\windo\neil\Janus\.venv\Lib\site-packages\bitsandbytes\functional.py", line 756, in from_dict
    raise ValueError(
ValueError: There should be exactly one `quant_state` item with ending from ['bitsandbytes__fp4', 'bitsandbytes__nf4'].
Detected ['aligner.layers.0.weight.quant_state.bitsandbytes__fp4', 'gen_aligner.layers.0.weight.quant_state.bitsandbytes__fp4'].
```
Here are the keys in `qs_dict` at the source of the error:

```python
dict_keys(['aligner.layers.0.weight.absmax', 'aligner.layers.0.weight.quant_map', 'aligner.layers.0.weight.quant_state.bitsandbytes__fp4', 'gen_aligner.layers.0.weight.absmax', 'gen_aligner.layers.0.weight.quant_map', 'gen_aligner.layers.0.weight.quant_state.bitsandbytes__fp4'])
```
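In other words, `QuantState.from_dict` expects the stats for a single tensor, so it requires exactly one key ending in `bitsandbytes__fp4` or `bitsandbytes__nf4` — but here two modules' stats landed in one dict. A minimal sketch of that check against the keys above (an approximation for illustration, not the library's actual code):

```python
# Sketch of the condition that raises in
# bitsandbytes.functional.QuantState.from_dict (approximation, not the real code).
VALID_ENDINGS = ("bitsandbytes__fp4", "bitsandbytes__nf4")

def quant_state_keys(qs_dict):
    """Return every key that names a serialized quant_state blob."""
    return [k for k in qs_dict if k.endswith(VALID_ENDINGS)]

# The keys observed at the point of failure (values elided).
qs_dict = dict.fromkeys([
    "aligner.layers.0.weight.absmax",
    "aligner.layers.0.weight.quant_map",
    "aligner.layers.0.weight.quant_state.bitsandbytes__fp4",
    "gen_aligner.layers.0.weight.absmax",
    "gen_aligner.layers.0.weight.quant_map",
    "gen_aligner.layers.0.weight.quant_state.bitsandbytes__fp4",
])

matches = quant_state_keys(qs_dict)
# Two quant_state entries from two different modules, so len(matches) != 1
# and the ValueError above fires.
assert len(matches) != 1
```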
For reference, the original config.json:
https://huggingface.co/deepseek-ai/Janus-Pro-7B/blob/main/config.json

The quantized config.json:
https://huggingface.co/neilmehta24/janus-pro-7b-4bit/blob/main/config.json

The model upload code:
```python
from transformers import AutoModelForCausalLM
from janus.models import MultiModalityCausalLM

model_path = "deepseek-ai/Janus-Pro-7B"
quantization_config = dict(load_in_4bit=True)
vl_gpt: MultiModalityCausalLM = AutoModelForCausalLM.from_pretrained(
    model_path, trust_remote_code=True, quantization_config=quantization_config, torch_dtype="auto"
)
vl_gpt.push_to_hub("janus-pro-7b-4bit")
```
Notes:
- The model runs just fine when it's quantized on the fly, without being saved.
- This looks like a shortcoming either in how the quantized checkpoint/config is written or in how the keys are validated on load, especially since the model runs fine when quantized on the fly.
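If the problem is on the validation side, one plausible fix direction (a sketch under my assumptions, not actual transformers/bitsandbytes code) would be to split the flat stats dict per parameter before each `QuantState.from_dict` call, so every call sees exactly one `quant_state` key:

```python
# Hypothetical helper: group serialized quant stats by the parameter they
# belong to, e.g. 'aligner.layers.0.weight'. The suffix list reflects the
# key shapes seen in the error above.
STAT_SUFFIXES = (
    ".quant_state.bitsandbytes__fp4",
    ".quant_state.bitsandbytes__nf4",
    ".absmax",
    ".quant_map",
)

def split_per_param(qs_dict):
    """Return {param_name: sub_dict} with one sub_dict per quantized weight."""
    groups = {}
    for key, value in qs_dict.items():
        for suffix in STAT_SUFFIXES:
            if key.endswith(suffix):
                param = key[: -len(suffix)]
                groups.setdefault(param, {})[key] = value
                break
    return groups

flat = dict.fromkeys([
    "aligner.layers.0.weight.absmax",
    "aligner.layers.0.weight.quant_map",
    "aligner.layers.0.weight.quant_state.bitsandbytes__fp4",
    "gen_aligner.layers.0.weight.absmax",
    "gen_aligner.layers.0.weight.quant_map",
    "gen_aligner.layers.0.weight.quant_state.bitsandbytes__fp4",
])
per_param = split_per_param(flat)
# Each group now contains exactly one quant_state key, which is what
# QuantState.from_dict expects.
```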
Expected behavior
I would expect the model to load, since it can load and quantize just fine when it has access to the full weights.