When fine-tuning the ProSST model (both the 2048 and 4096 versions) and the Prime model with the LoRA and IA3 methods, we run into an error. The traceback ends roughly as follows:
  File "/hpcfs/fhome/puchx/.cache/huggingface/modules/transformers_modules/modeling_prosst.py", line 868, in forward
    inputs_embeds.masked_fill_(
RuntimeError: a leaf Variable that requires grad is being used in an in-place operation.
Do you know what causes this error? Other fine-tuning methods (freeze, full, and ses-adapter) work without any issues.
Thanks!!
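
For reference, here is a minimal sketch in plain PyTorch that raises the same RuntimeError outside of ProSST. The tensor names and shapes are hypothetical stand-ins, not the actual code in modeling_prosst.py. It assumes that under LoRA/IA3 the embedding output reaches `masked_fill_` as a leaf tensor with `requires_grad=True` (for example, if the embedding weights are frozen and something like `enable_input_require_grads()` marks the output as requiring grad); we have not confirmed that this is what happens in our setup.

```python
import torch

# Hypothetical stand-in for the embedding output: a leaf tensor that requires grad.
inputs_embeds = torch.ones(2, 3, 4, requires_grad=True)

# Hypothetical padding mask, broadcast over the hidden dimension: shape (2, 3, 1).
mask = torch.tensor([[True, False, True],
                     [False, True, True]]).unsqueeze(-1)


def try_in_place_fill(x, m):
    """Attempt the in-place fill and return the error text, or None on success."""
    try:
        x.masked_fill_(m, 0.0)
        return None
    except RuntimeError as exc:
        return str(exc)


# In-place fill on a leaf that requires grad: autograd rejects it.
error_message = try_in_place_fill(inputs_embeds, mask)
print(error_message)

# Out-of-place fill returns a new tensor and leaves the leaf untouched.
filled = inputs_embeds.masked_fill(mask, 0.0)

# A non-leaf tensor (one produced by an operation) accepts the in-place fill,
# which may be why full fine-tuning does not hit the error.
non_leaf = inputs_embeds * 1.0
non_leaf.masked_fill_(mask, 0.0)
```

If this is indeed the cause, replacing the in-place `masked_fill_` with the out-of-place `masked_fill` (and assigning the result back) would probably avoid the error, but that is a guess on our side.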