FEAT Add sine-LoRA #2434 #2457
base: main
Hey, thanks for providing an early draft. This is already on the right track, although there are still some bits missing.
I would suggest to focus on making one thing work first (e.g., integration of SineLoraLinearVariant) so that you can write tests for it and then extending it to other layer types (e.g., Embedding).
For testing I'd suggest adding a test case to test_custom_models.py (copying one of the LoRA test cases in TEST_CASES and adding "use_sinelora": True is sufficient for now) so that you have a broad set of test cases to execute. Say you added this line
("Vanilla MLP LoRA + SineLoRA", "MLP", LoraConfig, {"target_modules": ["lin0", "lin1"], "use_sinelora": True}),
you can then invoke the tests using
pytest -k SineLoRA tests
Feel free to ping if you have any questions :)
        
          
src/peft/tuners/lora/layer.py (Outdated)
    from .variants import SineLoraLinearVariant
    return SineLoraLinearVariant()
This should reference the (not yet existing) SineLoraEmbeddingVariant since we're in the Embedding class.
But this code is good for Linear.resolve_lora_variant :) You can use it there!
Effectively every class that overrides modules in the model (Linear, Embedding, Conv2d, ...) needs its own variant implementation and resolve_lora_variant implementation but we can keep it at Linear and Embedding for now if you want.
I corrected it. Please check my new PR.
Hmm can you check again? I don't see changes in this regard :/
I pushed again. I only implemented Linear and Embedding for now; is that OK?
Yep that's totally fine. We can add support for convolutions later once Linear and Embedding work as expected.
        
          
src/peft/tuners/lora/variants.py (Outdated)
class SineLoraLinearVariant(LoraVariant):
    @staticmethod
    def init(module: Linear, adapter_name:str) -> None:
- def init(module: Linear, adapter_name:str) -> None:
+ def init(module: Linear, adapter_name:str, **kwargs) -> None:
With PR #2455 now merged, init() receives all the parameters that update_layer receives.
Hmm, I did not use that. Do you think that is OK?
I don't think so, no.
Currently the tests do not work because of the changes necessary in Linear.__init__ and Embedding.__init__. Once the changes are in place you'll see that calls to init will complain about unexpected arguments passed to init(). That's because all the config args are passed to init and without the wildcard **kwargs you have to define them all (which we don't want, of course).
Also you need a place to set module.sinelora_scaling and module.sinelora_frequency. This is here, from the kwargs, e.g.
module.sinelora_frequency = kwargs['sinelora_frequency']
For sinelora_scaling you need to check if kwargs['sinelora_scaling'] is None.
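A minimal sketch of the init() shape described above, assuming the config values arrive in kwargs under the keys sinelora_frequency and sinelora_scaling, and that a missing scaling falls back to sqrt(in_features) as the docstring in this PR states. The class name SineLoraLinearVariantSketch and the SimpleNamespace stand-in for the layer are hypothetical, not PEFT code.

```python
import math
from types import SimpleNamespace


class SineLoraLinearVariantSketch:
    """Hypothetical stand-in for the variant class; not the PEFT implementation."""

    @staticmethod
    def init(module, adapter_name: str, **kwargs) -> None:
        # **kwargs absorbs all the other config arguments (r, lora_alpha, ...)
        # so they do not have to be listed in the signature.
        module.sinelora_frequency = kwargs["sinelora_frequency"]

        scaling = kwargs["sinelora_scaling"]
        if scaling is None:
            # Assumed default, taken from the docstring under review.
            scaling = math.sqrt(module.in_features)
        module.sinelora_scaling = scaling


# Hypothetical layer object with only the attribute the sketch needs.
fake_linear = SimpleNamespace(in_features=16)
SineLoraLinearVariantSketch.init(
    fake_linear,
    "default",
    r=8,
    lora_alpha=16,
    sinelora_frequency=200.0,
    sinelora_scaling=None,
)
```

Without the **kwargs wildcard, the call above would fail on r and lora_alpha with an unexpected keyword argument error, which is the failure mode described in the comment.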
Great progress! Maybe I'm missing some changes but I didn't see an update regarding the Embedding vs Linear layer variants. Could it be you missed a git push?
        
          
=3 (Outdated)
Collecting oauthlib
Using cached oauthlib-3.2.2-py3-none-any.whl.metadata (7.5 kB)
Using cached oauthlib-3.2.2-py3-none-any.whl (151 kB)
Installing collected packages: oauthlib
Successfully installed oauthlib-3.2.2
Remove :)
        
          
Hey, for Linear and Embedding this looks like it is almost ready.
Please remove the following files that were probably wrongly checked in:
- =3
- posters/.$Untitled Diagram.drawio.dtmp
- posters/Untitled Diagram.drawio
- posters/Untitled Diagram.drawio.pdf
Please run make style to fix any style and code issues.
I think it would be worthwhile to add a test that compares plain LoRA with SineLoRA and makes sure that the output is different (e.g., for extreme scaling values so that we don't have to train for long). You could create tests/test_lora_variant_sinelora.py and add such a test there. I think it would be quite useful for making sure that the code works as intended :)
To test the Embedding layer implementation you could add
("Embedding + transformers Conv1D 2 LoRA + SineLoRA", "EmbConv1D", LoraConfig, {"target_modules": ["emb"], "use_sinelora": True}),
to the custom models testcase you already added.
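The comparison test suggested above could look roughly like the following sketch. It uses plain Python lists instead of torch tensors so it runs anywhere, and it assumes the sine update has the form sin(frequency * B @ A) / sinelora_scaling * lora_scaling, mirroring (up to transposition) the expression quoted elsewhere in this thread. All function names here are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_delta(lora_B, lora_A, lora_scaling):
    # Plain LoRA update: (B @ A) * scaling
    return [[v * lora_scaling for v in row] for row in matmul(lora_B, lora_A)]


def sinelora_delta(lora_B, lora_A, lora_scaling, frequency, sine_scaling):
    # Assumed sine variant: sin(frequency * B @ A) / sine_scaling * scaling
    return [
        [math.sin(frequency * v) / sine_scaling * lora_scaling for v in row]
        for row in matmul(lora_B, lora_A)
    ]


def test_sinelora_differs_from_lora():
    lora_B = [[0.5, -0.25], [0.1, 0.3], [0.0, 0.2]]  # (out_features=3, r=2)
    lora_A = [[0.2, 0.4, -0.1, 0.3], [0.7, -0.5, 0.6, 0.1]]  # (r=2, in_features=4)
    plain = lora_delta(lora_B, lora_A, lora_scaling=2.0)
    sine = sinelora_delta(lora_B, lora_A, 2.0, frequency=200.0, sine_scaling=2.0)
    max_diff = max(
        abs(p - s) for p_row, s_row in zip(plain, sine) for p, s in zip(p_row, s_row)
    )
    # A large frequency makes the two updates clearly different without training.
    assert max_diff > 1e-3
    return max_diff
```

In the real test the two deltas would come from a model wrapped with and without use_sinelora; the sketch only illustrates why extreme values make the difference visible immediately.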
        
          
sinelora_scaling (`float`):
    The scaling factor for the sine activation. If not specified, it will be set to the default value of sqrt(in_features).
If this value is optional, it should be marked as type Optional[float]
Change the type of sinelora_scaling here to Optional[float], as it is defined in the code.
        
          
@githubnemo Hi, I am still a bit confused about how to add module.sinelora_frequency = kwargs['sinelora_frequency'] and sinelora_scaling. I get a KeyError when I put module.sinelora_frequency = kwargs['sinelora_frequency'] in class SineLoraLinearVariant(LoraVariant).
Hi @yipingji, sorry for being unclear. I tried highlighting the issue in #2457 (comment). The  So you have to update the following functions with new parameters the same way you did to add the  
For example, the change for  
diff --git a/src/peft/tuners/lora/layer.py b/src/peft/tuners/lora/layer.py
index cb09608..efd8d6a 100644
--- a/src/peft/tuners/lora/layer.py
+++ b/src/peft/tuners/lora/layer.py
@@ -183,6 +183,8 @@ class LoraLayer(BaseTunerLayer):
         use_rslora,
         use_dora: bool = False,
         use_sinelora: bool = False,
+        sinelora_frequency = 200.0,
+        sinelora_scaling = None,
         lora_bias: bool = False,
     ):
         # collect the kwargs
@@ -574,6 +576,8 @@ class Linear(nn.Module, LoraLayer):
         use_dora: bool = False,
         lora_bias: bool = False,
         use_sinelora: bool = False,
+        sinelora_frequency: float = 200.0,
+        sinelora_scaling: Optional[float] = None,
         **kwargs,
     ) -> None:
         super().__init__()
@@ -591,6 +595,8 @@ class Linear(nn.Module, LoraLayer):
             use_dora=use_dora,
             lora_bias=lora_bias,
             use_sinelora=use_sinelora,
+            sinelora_frequency=sinelora_frequency,
+            sinelora_scaling=sinelora_scaling,
         )
         self.is_target_conv_1d_layer = is_target_conv_1d_layer
I hope that helps! To test the  to the custom models testcase you already added.
Hi @githubnemo, apologies for the previous update and it is my first time to PR in such a large codebase. I was just wondering if my current one is correct.
Hi @githubnemo, apologies for the previous update and it is my first time to PR in such a large codebase. I was just wondering if my current one is correct.
No worries, you're doing fine :) Thanks for your work!
Please make sure to test your changes regularly by running pytest:
# from the root of the repository
pytest -k SineLoRA tests
I left some comments regarding the forward implementations.
I think we need to implement merge and unmerge as well but they should be basically equal to the way LoRA is doing it.
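As a rough illustration of "basically equal to the way LoRA is doing it": merging adds the adapter's delta weight to the base weight and unmerging subtracts the same delta again. The sketch below uses nested lists in place of tensors and a hand-written delta; the function names and the finite-value check are assumptions for illustration, not the PEFT implementation.

```python
import math


def merge(weight, delta):
    """Return weight + delta, refusing to merge non-finite values."""
    merged = [[w + d for w, d in zip(w_row, d_row)] for w_row, d_row in zip(weight, delta)]
    if not all(math.isfinite(v) for row in merged for v in row):
        raise ValueError("Non-finite values detected in merged weights")
    return merged


def unmerge(weight, delta):
    """Return weight - delta, undoing a previous merge with the same delta."""
    return [[w - d for w, d in zip(w_row, d_row)] for w_row, d_row in zip(weight, delta)]


base = [[1.0, 2.0], [3.0, 4.0]]
# Stand-in for a sine-LoRA delta weight: sin(frequency * value) / scaling
delta = [
    [math.sin(200.0 * 0.01) / 2.0, 0.0],
    [0.0, math.sin(200.0 * -0.02) / 2.0],
]

merged = merge(base, delta)
restored = unmerge(merged, delta)
```

Because the delta can be recomputed from the adapter weights at unmerge time, the round trip recovers the base weight up to floating-point error.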
        
          
src/peft/tuners/lora/variants.py (Outdated)
sine_output = (
    module._embed(x)
    @ torch.sin(module.sinelora_frequency * lora_embedding_A.weight.T @ lora_embedding_B.weight.T)
    / module.sinelora_scaling
    * lora_scaling
)
I wonder, is this correct? The way I see it, _embed takes two parameters, x and weight, but we're only supplying x here. weight should probably be lora_embedding_A.T?
result = result + sine_output
We're missing a return here.
@yipingji gentle ping :)
Sorry for the late updates. I have been busy with a conference recently and will update the PR soon ;)
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Hi @githubnemo, how can I reopen the PR?
@yipingji I've reopened the PR and added the WIP tag so stale bot will not bother us.
@githubnemo I just updated the PR, please have a look ;)
gentle ping @githubnemo ;)
Thanks for the update! I think we're getting close on finishing this.
Very important: make sure to run make style and pytest -k SineLoRA tests before pushing changes and asking for a review. Both the tests and the linter are failing right now. You can use these tools to iterate on your side before asking for a review, that way the review will be a lot faster :)
I suggested adding a test case for the sinelora_scaling parameter. If it helps, I encourage you to add more such test cases to remove the remaining bugs even faster!
    from .variants import DoraConv1dVariant
elif use_sinelora:
    from .variants import SineLoraConv1dVariant
return None
- We're missing the return statements here so resolve_lora_variant always returns None.
- There doesn't seem to exist a SineLoraConv1dVariant yet
If you have an implementation for conv*d I'd suggest adding it. If you don't maybe it is worthwhile to skip it for now and undo the changes in the Conv* layers.
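To make the missing-return problem concrete, here is a small self-contained sketch. The classes are empty, hypothetical stand-ins (named after the Linear variants rather than the Conv ones), and the keyword-only signature is an assumption chosen for illustration rather than the exact PEFT signature.

```python
from typing import Optional


class LoraVariant:
    """Hypothetical minimal base class standing in for the real one."""


class DoraLinearVariant(LoraVariant):
    pass


class SineLoraLinearVariant(LoraVariant):
    pass


def resolve_buggy(*, use_dora: bool, use_sinelora: bool) -> Optional[LoraVariant]:
    if use_dora:
        DoraLinearVariant()  # constructed but never returned
    elif use_sinelora:
        SineLoraLinearVariant()  # same problem
    return None  # every code path ends up here


def resolve_fixed(*, use_dora: bool, use_sinelora: bool) -> Optional[LoraVariant]:
    if use_dora:
        return DoraLinearVariant()
    if use_sinelora:
        return SineLoraLinearVariant()
    return None  # plain LoRA needs no variant


buggy_result = resolve_buggy(use_dora=False, use_sinelora=True)
fixed_result = resolve_fixed(use_dora=False, use_sinelora=True)
```

With the buggy version the layer silently falls back to plain LoRA, so tests that only check for "no crash" would still pass; a test comparing outputs against plain LoRA would catch it.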
init_lora_weights,
use_rslora,
use_dora,
use_sinelora,
Make sure to change the update_layer call in ConvNd.__init__ as well (currently misses all sinelora arguments). But since we're skipping convolutions for now I suggest to remove it entirely.
merged_weight = orig_weight + delta_weight
if not torch.isfinite(merged_weight).all():
    raise ValueError(f"NaNs detected in merged weights for adapter {active_adapter}")
module._cache_store(f"{active_adapter}-delta_weight", delta_weight)
I don't think we need to cache the delta weights?
The same goes for merge_unsafe and the Embedding implementation.
def init(module: Embedding, adapter_name: str, **kwargs) -> None:
    module.sinelora_frequency = kwargs["sinelora_frequency"]

    sinelora_scaling = kwargs["sinelora_scaling"]
Should probably be self.sinelora_scaling = kwargs....
Please add a test case for this scenario, e.g.:
    ("Embedding + transformers Conv1D LoRA + SineLoRA 2", "EmbConv1D", LoraConfig, {"target_modules": ["emb"], "use_sinelora": True, "sinelora_scaling": 100}),
class DoraEmbeddingLayer(DoraLinearLayer):
    def forward(self, x, *, lora_A, lora_B, scaling, base_layer, embed_fn):
        """
        For DoRA, calculate the extra output from LoRA with DoRA applied. This should be added on top of the base layer
        output.
        """
        lora_weight = (lora_A @ lora_B).T
        magnitude = self.weight
        weight = base_layer.weight
        weight_norm = self.get_weight_norm(weight, lora_weight.detach(), scaling)
        # see section 4.3 of DoRA (https://arxiv.org/abs/2402.09353)
        # "[...] we suggest treating ||V +∆V ||_c in
        # Eq. (5) as a constant, thereby detaching it from the gradient
        # graph. This means that while ||V + ∆V ||_c dynamically
        # reflects the updates of ∆V , it won’t receive any gradient
        # during backpropagation"
        weight_norm = weight_norm.detach()
        mag_norm_scale = magnitude / weight_norm
        result_dora = mag_norm_scale * (embed_fn(x, lora_A) @ lora_B) * scaling
        return mag_norm_scale, result_dora
Where does this come from?
Hey @yipingji, do you still plan on implementing this further?
Yes, I plan to implement this further, but I'm quite busy with my project recently and I will PR next month :)
Hi,
I just implemented sine LoRA in variants.py and changed "resolve_lora_variant" a bit in layer.py.