
Adapter mismatch when merging #2277

Open
teachsheryl opened this issue Jan 22, 2025 · 3 comments
Labels
bug Something isn't working

Comments

@teachsheryl
Please check that this issue hasn't been reported before.

  • I searched previous Bug Reports and didn't find any similar reports.

Expected Behavior

When merging the adapter with the base model, the token embedding sizes should match.

Current behaviour

Error(s) in loading state_dict for PeftModelForCausalLM:
size mismatch for base_model.model.model.embed_tokens.weight: copying a param with shape torch.Size([151666, 2048]) from checkpoint, the shape in current model is torch.Size([151936, 2048]).
size mismatch for base_model.model.lm_head.weight: copying a param with shape torch.Size([151666, 2048]) from checkpoint, the shape in current model is torch.Size([151936, 2048]).

After a recent update (not sure which), certain adapters no longer merge with the base/instruct model because of this size mismatch. This did not happen previously.
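For reference, the gap between the two shapes in the traceback can be computed directly. A minimal sketch; the reading of the 270-row gap as a vocab-padding artifact (Qwen2.5 pads its vocab to 151936, roughly 270 rows above the tokenizer's actual token count, which would suggest the embeddings were resized to len(tokenizer) during training) is my assumption, not confirmed:

```python
def embedding_row_gap(checkpoint_shape, model_shape):
    """Rows the current model's embedding matrix has beyond the checkpoint's."""
    ckpt_rows, ckpt_dim = checkpoint_shape
    model_rows, model_dim = model_shape
    # A hidden-size mismatch would mean a different base model entirely.
    assert ckpt_dim == model_dim, "hidden size differs: wrong base model?"
    return model_rows - ckpt_rows

# The two shapes from the error above:
print(embedding_row_gap((151666, 2048), (151936, 2048)))  # 270
```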

Steps to reproduce

Config yaml

Possible solution

No response

Which Operating Systems are you using?

  • Linux
  • macOS
  • Windows

Python Version

3.11

axolotl branch-commit

main

Acknowledgements

  • My issue title is concise, descriptive, and in title casing.
  • I have searched the existing issues to make sure this bug has not been reported yet.
  • I am using the latest version of axolotl.
  • I have provided enough information for the maintainers to reproduce and diagnose the issue.
@teachsheryl teachsheryl added the bug Something isn't working label Jan 22, 2025
@NanoCode012 (Collaborator)

Hey, thanks for the report. Could you provide more details, such as which model? How did you run it? A sample config would also help.

copying a param with shape torch.Size([151666, 2048]) from checkpoint, the shape in current model is torch.Size([151936, 2048]).

This seems like the checkpoint has fewer tokens than the current model. Are you pointing to the right adapter?
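One way to verify which adapter is actually being loaded is to inspect the tensor shapes recorded in the adapter's safetensors header, which is an 8-byte length prefix followed by plain JSON, so it can be read without any ML dependencies. A sketch; the fake in-memory blob below stands in for a real `adapter_model.safetensors` read from disk:

```python
import json
import struct

def safetensors_shapes(raw: bytes) -> dict:
    """Map tensor names to shapes from a safetensors blob without
    loading any weights (header = 8-byte LE length + JSON)."""
    (header_len,) = struct.unpack("<Q", raw[:8])
    header = json.loads(raw[8 : 8 + header_len])
    return {k: v["shape"] for k, v in header.items() if k != "__metadata__"}

# Minimal fake blob for illustration; a real adapter file is parsed the same way:
fake_header = json.dumps({
    "base_model.model.model.embed_tokens.weight": {
        "dtype": "F32", "shape": [151666, 2048], "data_offsets": [0, 0]},
}).encode()
blob = struct.pack("<Q", len(fake_header)) + fake_header
print(safetensors_shapes(blob))
```

For a file on disk, the same function works on `open(path, "rb").read()` (only the header bytes are actually needed).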

@laurenhall commented Jan 27, 2025

Hi there, I'm encountering the same problem. It seems that Axolotl is resizing the embedding layer during training for some reason.

/usr/local/lib/python3.11/dist-packages/peft/utils/save_and_load.py:260: UserWarning: Setting `save_embedding_layers` to `True` as the embedding layer has been resized during finetuning.

Model is Qwen/Qwen2.5-14B-Instruct; I am using the tokenizer's default chat template (not adding any new special tokens) and am not targeting the embeddings/lm_head layers.

# LoRA
adapter: qlora
lora_model_dir:
lora_r: 32
lora_alpha: 64
lora_dropout: 0.125
lora_target_linear: 
lora_fan_in_fan_out:
lora_target_modules:
  - gate_proj
  - down_proj
  - up_proj
  - q_proj
  - v_proj
  - k_proj
  - o_proj
lora_modules_to_save:

Using for example:
accelerate launch -m axolotl.cli.train my-qwen-test.yml
python -m axolotl.cli.merge_lora my-qwen-test.yml

(i.e., it's definitely the same adapter that was just trained with the same config.)
This produces a similar size mismatch between the adapter and the model it was just trained on.

Edit: I checked my training history and it looks like I was able to train and merge this same model base successfully around January 9-10th.
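To make the resize warning above concrete, the operation peft detects behaves roughly like this. A simplified sketch using plain lists; the real `resize_token_embeddings` in transformers works on tensors and initializes new rows from the model's init distribution, not a constant fill:

```python
def resize_embedding_rows(weight, new_rows, fill=0.0):
    """Rough mimic of resizing an embedding matrix: keep the leading
    rows, then either truncate or pad with newly initialized rows."""
    dim = len(weight[0])
    out = [row[:] for row in weight[:new_rows]]          # keep/truncate
    out.extend([fill] * dim for _ in range(new_rows - len(out)))  # pad
    return out

w = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # 3-token toy vocab, dim 2
print(len(resize_embedding_rows(w, 2)))  # 2 (shrunk)
print(len(resize_embedding_rows(w, 5)))  # 5 (grown)
```

If training resizes the matrix to len(tokenizer) but merging loads a base model with the padded vocab size, the row counts disagree exactly as in the error above.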

@NanoCode012 (Collaborator)

@laurenhall , thanks for the report. I did a run for both qlora & lora on Qwen/Qwen2.5-7B-Instruct and was able to train+merge successfully. Is there more info you could provide?

For reference, I'm using https://github.com/axolotl-ai-cloud/axolotl/blob/main/examples/qwen/qlora.yml as base, just changing base_model.
