Adapter mismatch when merging #2277
Comments
Hey, thanks for the report. Could you provide more details, such as which model? How did you run it? A sample config would also help.
This seems like the checkpoint has fewer tokens than the current model. Are you pointing to the right adapter?
Hi there, I'm encountering the same problem. It seems that Axolotl is resizing the embedding layer during training for some reason.
The model is Qwen/Qwen2.5-14B-Instruct, I am using the tokenizer's default chat template (not adding any new special tokens), and I am not targeting the embeddings/lm_head layers.
Using, for example: [...] (in other words, it's definitely the same adapter that was just trained with the same config). Edit: I checked my training history, and I was able to train and merge this same base model successfully around January 9-10th.
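To make the "resizing the embedding layer" hypothesis concrete, here is a dependency-free sketch of what such a resize does: the hidden dimension stays fixed, the number of vocab rows changes, and overlapping rows are copied. `resize_rows` is a hypothetical helper for illustration only; in practice transformers' `resize_token_embeddings` does this (and also initializes any newly added rows).

```python
def resize_rows(weight, new_rows, fill=0.0):
    """weight: list of rows (each a list of floats). Returns a copy with
    `new_rows` rows: overlapping rows are copied, extra rows are filled."""
    dim = len(weight[0])
    keep = min(len(weight), new_rows)
    out = [list(row) for row in weight[:keep]]
    out += [[fill] * dim for _ in range(new_rows - keep)]
    return out

# Row counts from the traceback in this issue (hidden size shrunk to 4
# for the demo, instead of the real 2048):
base = [[0.0] * 4 for _ in range(151936)]
shrunk = resize_rows(base, 151666)
assert len(shrunk) == 151666 and len(shrunk[0]) == 4
```

If training silently shrinks the embedding this way, the saved adapter checkpoint will carry the smaller row count and no longer load against the unmodified base model.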
@laurenhall, thanks for the report. I did a run for both qlora & lora on [...]. For reference, I'm using https://github.com/axolotl-ai-cloud/axolotl/blob/main/examples/qwen/qlora.yml as the base, just changing [...].
Please check that this issue hasn't been reported before.
Expected Behavior
When merging an adapter with the base model, the token embedding sizes should match.
Current behaviour
```
Error(s) in loading state_dict for PeftModelForCausalLM:
size mismatch for base_model.model.model.embed_tokens.weight: copying a param with shape torch.Size([151666, 2048]) from checkpoint, the shape in current model is torch.Size([151936, 2048]).
size mismatch for base_model.model.lm_head.weight: copying a param with shape torch.Size([151666, 2048]) from checkpoint, the shape in current model is torch.Size([151936, 2048]).
```
After a recent update (not sure which one), there is a mismatch on certain adapters when merging with the base/instruct model. This did not happen previously.
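A quick sanity check on the two row counts from the error, followed (in comments) by a sketch of a possible merge-time workaround. This is an assumption, not a confirmed fix: whether shrinking the base vocab is safe depends on how the adapter was trained, and `<adapter_dir>` is a placeholder.

```python
def padded_to_multiple(rows: int, multiple: int) -> bool:
    """True if the embedding row count is padded to `multiple`."""
    return rows % multiple == 0

# The base model's 151936 rows are a multiple of 128; the checkpoint's
# 151666 rows are not even a multiple of 32 — consistent with the
# checkpoint having been resized down toward the raw tokenizer length.
print(padded_to_multiple(151936, 128))  # True
print(padded_to_multiple(151666, 32))   # False

# A merge-time workaround with transformers + peft, roughly (untested,
# illustrative only): resize the base model's embeddings to match the
# checkpoint before attaching the adapter.
#   from transformers import AutoModelForCausalLM
#   from peft import PeftModel
#   base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-14B-Instruct")
#   base.resize_token_embeddings(151666)   # match the checkpoint's rows
#   model = PeftModel.from_pretrained(base, "<adapter_dir>")
#   merged = model.merge_and_unload()
```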
Steps to reproduce
Config yaml
Possible solution
No response
Which Operating Systems are you using?
Python Version
3.11
axolotl branch-commit
main
Acknowledgements