
adapter weights fail to be saved #6

Open
ink1 opened this issue May 15, 2023 · 2 comments

ink1 commented May 15, 2023

I'm running fine-tuning on llama-13b with "llmtune finetune --model llama-13b-4bit". Things seem to be working, but at the end I get:
Traceback (most recent call last):
File "/home/xxx/miniconda3/envs/llmtune/bin/llmtune", line 33, in
sys.exit(load_entry_point('llmtune==0.1.0', 'console_scripts', 'llmtune')())
File "/home/xxx/miniconda3/envs/llmtune/lib/python3.10/site-packages/llmtune-0.1.0-py3.10.egg/llmtune/run.py", line 101, in main
File "/home/xxx/miniconda3/envs/llmtune/lib/python3.10/site-packages/llmtune-0.1.0-py3.10.egg/llmtune/run.py", line 147, in finetune
File "/home/xxx/miniconda3/envs/llmtune/lib/python3.10/site-packages/llmtune-0.1.0-py3.10.egg/llmtune/executor.py", line 131, in finetune
AttributeError: 'Finetune4bConfig' object has no attribute 'adapter'

That is probably why I end up with neither adapter_model.bin nor adapter_config.json.
Oddly enough, checkpoints are created, but the adapter weights don't seem to be in them. What's the point of checkpointing then?

I'm on py310_cu118 with nightly torch, but otherwise everything is per requirements.txt.


ink1 commented May 15, 2023

@kuleshov
You probably meant to have

model.save_pretrained(tune_config.lora_out_dir)

instead of
https://github.com/kuleshov-group/llmtune/blob/7d69254eff754b78db1aef9ea8268003a1629333/llmtune/executor.py#L131

Do you need a PR?

Also it would make more sense to set

save_strategy="no"

instead of
https://github.com/kuleshov-group/llmtune/blob/7d69254eff754b78db1aef9ea8268003a1629333/llmtune/executor.py#L98
What is the point of doing such aggressive checkpointing by default, especially since resume is commented out?
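
For concreteness, here is a minimal sketch of what the tail end of finetune() could look like with both suggestions applied. The names model, train_data, and tune_config are assumptions standing in for the repo's actual variables, and only the relevant TrainingArguments are shown.

```python
# Hedged sketch, not the actual llmtune code: assumes `model` is a
# peft-wrapped model and `tune_config.lora_out_dir` is the adapter output
# directory, as referenced above.
from transformers import Trainer, TrainingArguments

def finish_finetune(model, train_data, tune_config):
    training_args = TrainingArguments(
        output_dir=tune_config.lora_out_dir,
        save_strategy="no",  # suggested: skip intermediate Trainer checkpoints
        # ... remaining hyperparameters as in the original config ...
    )
    trainer = Trainer(model=model, args=training_args, train_dataset=train_data)
    trainer.train()

    # Suggested replacement for the failing line: PEFT's save_pretrained
    # writes adapter_config.json and adapter_model.bin to lora_out_dir.
    model.save_pretrained(tune_config.lora_out_dir)
```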


wyklq commented May 19, 2023

I applied the same patch, and the model is now saved successfully.
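
A quick way to confirm that the adapter files were actually written and are loadable (a sketch; the directory name is just an example, not the repo's default):

```python
# Hedged sketch: check that the adapter directory contains the expected
# files and that peft can parse the config.
import os
from peft import PeftConfig

out_dir = "llama-13b-4bit-lora"  # example output directory
print(sorted(os.listdir(out_dir)))         # expect adapter_config.json, adapter_model.bin
cfg = PeftConfig.from_pretrained(out_dir)  # parses adapter_config.json
print(cfg.peft_type, cfg.base_model_name_or_path)
```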
