FEAT add GraLoRA #2851
Conversation
Thanks for contributing GraLoRA to PEFT. The method looks interesting and the implementation generally looks good.
I have added a bunch of comments, but many of these are just due to your fork being a bit older. We have since simplified PEFT so that you can remove a bunch of code; I have marked the code that can be deleted.
Apart from the comments that I added, to complete this PR, let's work on:
- Extend tests: add tests to `test_custom_models.py`, `test_encoder_decoder_models.py`, `test_feature_extraction_models.py`, and `test_seq_classifier.py`.
- Also, let's add documentation and ideally at least one example.
- Optional, but highly recommended: add an experiment to our PEFT method comparison suite.
src/peft/tuners/gralora/config.py (outdated)

```python
@dataclass
class GraloraConfig(PeftConfig):
    r: int = field(default=8, metadata={"help": "gralora attention dimension"})
```
The description is not helpful IMO, please give a more complete description.
I’ve extended the description for better clarity.
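For illustration, an extended help text might look roughly like this (a sketch only, based on the block structure described for the method; the PR's final wording may differ):

```python
from dataclasses import dataclass, field

@dataclass
class GraloraConfigSketch:  # simplified stand-in, not the actual PEFT config class
    r: int = field(
        default=8,
        metadata={
            "help": (
                "GraLoRA rank. The adapted weight matrix is split into gralora_k x gralora_k "
                "sub-blocks, each receiving its own low-rank update, so r must be divisible "
                "by gralora_k."
            )
        },
    )
```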
src/peft/tuners/gralora/config.py (outdated)

```python
gralora_alpha: int = field(default=8, metadata={"help": "gralora alpha"})
```
Same, let's extend the description.
I’ve extended the description.
src/peft/tuners/gralora/config.py (outdated)

```python
gralora_alpha: int = field(default=8, metadata={"help": "gralora alpha"})
gralora_dropout: float = field(default=0.0, metadata={"help": "gralora dropout"})
gralora_k: int = field(default=2, metadata={"help": "gralora k"})
```
Same, let's extend the description.
I’ve extended the description.
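In the same spirit, fuller help texts for the remaining fields could look like this (illustrative wording; the alpha/r scaling is an assumption carried over from vanilla LoRA):

```python
from dataclasses import dataclass, field

@dataclass
class GraloraConfigSketch:  # simplified stand-in, not the actual PEFT config class
    gralora_alpha: int = field(
        default=8,
        metadata={"help": "Scaling factor; assuming LoRA-style scaling, the update is multiplied by gralora_alpha / r."},
    )
    gralora_dropout: float = field(
        default=0.0,
        metadata={"help": "Dropout probability applied to the input of the GraLoRA layers."},
    )
    gralora_k: int = field(
        default=2,
        metadata={"help": "Number of sub-blocks along each weight dimension; r must be divisible by gralora_k."},
    )
```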
src/peft/tuners/gralora/config.py (outdated)

```python
class GraloraConfig(PeftConfig):
    r: int = field(default=8, metadata={"help": "gralora attention dimension"})
    hybrid_r: int = field(
        default=0, metadata={"help": "hybrid_r is the rank allocated to vanilla LoRA method when using Hybrid GraLoRA"}
    )
```
IIUC, by passing a value > 0, hybrid GraLoRA is enabled. Let's make that clear.
I’ve extended the description to make this clearer.
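To make the semantics concrete, a usage sketch (assuming the config class and field names from this PR; target modules are just examples):

```python
from peft import GraloraConfig  # available once this PR is merged

# Pure GraLoRA: the entire rank budget goes to the block-diagonal sub-adapters.
pure = GraloraConfig(r=32, gralora_k=4, hybrid_r=0, target_modules=["q_proj", "v_proj"])

# Hybrid GraLoRA: any hybrid_r > 0 enables it, reserving that much rank
# for an additional vanilla LoRA pair alongside the sub-block adapters.
hybrid = GraloraConfig(r=32, gralora_k=4, hybrid_r=4, target_modules=["q_proj", "v_proj"])
```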
src/peft/tuners/gralora/layer.py (outdated)

```python
self.scaling = {}
self.gralora_dropout = nn.ModuleDict({})

# Set to `None` otherwise to avoid computation with random weight
```
Remove
Removed the incorrect code comment.
src/peft/tuners/gralora/model.py (outdated)

```python
    config_dict[key] = config
    return config_dict

def _set_adapter_layers(self, enabled=True):
```
Same as parent class, can be removed.
I’ve removed the inherited methods from the parent class.
src/peft/tuners/gralora/model.py (outdated)

```python
if isinstance(module, (BaseTunerLayer, ModulesToSaveWrapper)):
    module.enable_adapters(enabled)

def enable_adapter_layers(self):
```
Same as parent class, can be removed.
I’ve removed the inherited methods from the parent class.
src/peft/tuners/gralora/model.py (outdated)

```python
def enable_adapter_layers(self):
    self._set_adapter_layers(enabled=True)

def disable_adapter_layers(self):
```
Same as parent class, can be removed.
I’ve removed the inherited methods from the parent class.
src/peft/tuners/gralora/model.py (outdated)

```python
self.active_adapter = new_adapter or []

def merge_and_unload(
```
Same as parent class, can be removed.
I’ve removed the inherited methods from the parent class.
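These removals are just standard Python inheritance at work; a minimal sketch (assuming `GraloraModel` subclasses PEFT's `BaseTuner`, as the other tuners do):

```python
from peft.tuners.tuners_utils import BaseTuner

class GraloraModel(BaseTuner):
    prefix: str = "gralora_"

    # No need to redefine _set_adapter_layers, enable_adapter_layers,
    # disable_adapter_layers, or merge_and_unload: identical implementations
    # are inherited from the parent class via normal attribute lookup.
```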
tests/test_gralora.py (outdated)
Thanks a lot for very diligently adding unit tests for GraLoRA. However, I'm afraid that we can delete the majority of them :)
Most of what is tested here is already covered in existing tests (merging, disabling adapters, different dtypes, ...). This is especially true for the tests in `test_custom_models.py`, so please add the GraLoRA settings there (once without and once with hybrid) and delete them here.
There are some tests that check for errors or warnings at initialization time. Those should be moved to `test_initialization.py`; check the tests there for inspiration.
Some other tests, like checking the shape of parameters, are IMHO too granular and can be deleted without replacement. If the shapes were wrong, we would notice in the forward pass, for instance.
There are a few tests that are not covered by existing tests and are not necessarily too granular, like `test_gralora_hybrid_forward_computation`, `test_gralora_information_exchange_via_permutation`, and `test_gralora_scaling_factor`. Personally, I'd be fine with removing those too, but keeping them would be fine as well.
Again, thank you for your kind review. I’ve added the GraLoRA test code to `test_custom_models.py`. You can run it using:

```
pytest tests/test_custom_models.py -k "Gralora"
```

Just in case, I’ve kept the test_gralora.py file to test GraLoRA’s unique features.
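For context, test_custom_models.py is driven by a list of (test name, model id, config class, config kwargs) tuples; the two requested settings might look roughly like this (a hedged sketch, names and kwargs illustrative):

```python
from peft import GraloraConfig  # available once this PR is merged

GRALORA_TEST_CASES = [
    ("Vanilla MLP GraLoRA", "MLP", GraloraConfig, {"target_modules": ["lin0"]}),
    # hybrid_r > 0 switches on hybrid GraLoRA, covering the second requested setting
    ("Vanilla MLP hybrid GraLoRA", "MLP", GraloraConfig, {"target_modules": ["lin0"], "hybrid_r": 2}),
]
```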
Force-pushed from 134e6f0 to a24d156.
@yeonjoon-jung01 Please ping me when you're finished so that I know that I can give this another review. Also, if possible, please avoid force pushes or rebases, as those make reviews harder.
Commits (messages truncated):
- …hts parameter for flexible initialization
- …ight calculation.
Force-pushed from c53ffce to 2618a8a.
- …ce, and more intuitive hybrid_r handling.
Force-pushed from 2618a8a to dec25f5.
@BenjaminBossan I’ve finished updating the code 🙂. I saw your message a bit late; I had already rebased the branch to sync it with main, just in case there were any conflicts. I’ll make sure to avoid force pushes and rebases from now on. Sorry about that!
@BenjaminBossan I have also resolved the previously missed features. I’ve extended the test coverage to include …
Thanks for the updates to the PR. I did another review round, please check.
Also, before committing your changes, please call `make style`. Ensure that you have the correct version of ruff installed (0.12.12).
```
@@ -0,0 +1,32 @@
# GraLoRA
```
Please also add an entry to the toctree.
src/peft/tuners/gralora/config.py (outdated)

```python
metadata={
    "help": (
        "gralora_k determines the number of subblocks in the GraLoRA adapter. "
        "The total parameter count is preserved regardless of gralora_k, while the expressivity is multiplied by gralora_k."
```
This sounds like the higher `gralora_k`, the better. Probably there is a tradeoff here. Could you please document it? Also, please mention the `self.r % self.gralora_k != 0` constraint.
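For intuition on why the parameter count is preserved, a quick check under the block-diagonal structure (a sketch based on the method description, not code from the PR): with k x k sub-blocks, each holding an A of shape (in/k, r/k) and a B of shape (r/k, out/k), the total stays at r * (in + out) for every valid k.

```python
# Sanity check of the "parameter count preserved" claim; variable names are illustrative.
in_features, out_features, r = 1024, 4096, 32

lora_params = r * (in_features + out_features)  # vanilla LoRA baseline

for k in (1, 2, 4):
    # k * k sub-blocks, each with A: (in/k, r/k) and B: (r/k, out/k)
    per_block = (in_features // k) * (r // k) + (r // k) * (out_features // k)
    assert k * k * per_block == lora_params  # holds whenever k divides r and both dims
```

A likely tradeoff to document: each sub-block only receives rank r/k, and both r and the layer dimensions must be divisible by gralora_k.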
```python
@dataclass
class GraloraConfig(PeftConfig):
```
Please also add a docstring here. You can just copy the help text for the parameter description.
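A minimal sketch of what such a docstring could look like, copying the field help texts into the usual Args format (wording illustrative, field definitions omitted):

```python
@dataclass
class GraloraConfig(PeftConfig):
    """
    Configuration class for GraLoRA.

    Args:
        r (`int`): GraLoRA rank, shared across the gralora_k x gralora_k sub-blocks.
        gralora_alpha (`int`): Scaling factor for the GraLoRA update.
        gralora_dropout (`float`): Dropout probability for the GraLoRA layers.
        gralora_k (`int`): Number of sub-blocks per weight dimension; r must be divisible by it.
        hybrid_r (`int`): Rank allocated to a vanilla LoRA component; a value > 0 enables hybrid GraLoRA.
    """
```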
```python
subblock_in_features = self.in_features // gralora_k
subblock_out_features = self.out_features // gralora_k
```
Nice. Could you please add tests for these errors to `test_initialization.py`? Check the other tests there for inspiration. Please also add a test for the `self.r % self.gralora_k != 0` error.
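A hedged sketch of such a test in the style of tests/test_initialization.py (the tiny MLP helper and the exact error type are assumptions; match whatever the PR actually raises):

```python
import pytest
import torch.nn as nn
from peft import GraloraConfig, get_peft_model

class MLP(nn.Module):  # minimal stand-in for the test suite's helper model
    def __init__(self):
        super().__init__()
        self.lin0 = nn.Linear(10, 20)
        self.lin1 = nn.Linear(20, 2)

def test_gralora_r_not_divisible_by_gralora_k_raises():
    # r=6 is not divisible by gralora_k=4, so adapter creation should fail
    config = GraloraConfig(r=6, gralora_k=4, target_modules=["lin0"])
    with pytest.raises(ValueError):
        get_peft_model(MLP(), config)
```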
```python
general_gralora_A = nn.Linear(self.in_features, hybrid_r, bias=False)
general_gralora_B = nn.Linear(hybrid_r, self.out_features, bias=False)
```
As a small optimization, you could use the `torch.utils.integration.init_empty_weights` context manager to initialize these parameters with empty tensors too.
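A sketch of that optimization using the `init_empty_weights` context manager from accelerate (the exact import path named in the review may differ); it materializes the modules on the meta device, so no real memory is allocated for weights that get overwritten right after:

```python
import torch.nn as nn
from accelerate import init_empty_weights

in_features, out_features, hybrid_r = 1024, 1024, 4  # example sizes

with init_empty_weights():
    # Created on the meta device: no real weight memory is allocated.
    general_gralora_A = nn.Linear(in_features, hybrid_r, bias=False)
    general_gralora_B = nn.Linear(hybrid_r, out_features, bias=False)

print(general_gralora_A.weight.device)  # meta
```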
tests/test_gralora.py (outdated)

```python
# Outputs should be different due to hybrid component
assert not torch.allclose(output_hybrid, output_pure, atol=1e-3)

def test_gralora_invalid_rank_zero(self):
```
Move to `test_initialization.py`.
tests/test_gralora.py
Outdated
| with pytest.raises(ValueError, match="`r` should be a positive integer"): | ||
| get_peft_model(mlp, config) | ||
|
|
||
| def test_gralora_invalid_rank_negative(self): |
Move to `test_initialization.py`.
tests/test_gralora.py
Outdated
| assert model.base_model.model.lin1.bias.requires_grad | ||
| assert not model.base_model.model.lin0.bias.requires_grad | ||
|
|
||
| def test_gralora_multiple_adapters_with_bias_raises(self): |
Move to `test_initialization.py`.
tests/test_gralora.py
Outdated
| # Should match base model output (no merge) | ||
| assert torch.allclose(base_output, unloaded_output, atol=1e-5) | ||
|
|
||
| def test_gralora_merge_with_hybrid_component(self): |
If you add a setting to `test_custom_models.py` with hybrid GraLoRA, this test would become redundant.
tests/test_gralora.py
Outdated
| return X | ||
|
|
||
|
|
||
| class TestGralora: |
There are still a lot of tests here that are unnecessary because they're already covered by existing tests. Please remove them:
- test_gralora_forward_pass
- test_gralora_backward_pass
- test_gralora_save_load_roundtrip
- test_gralora_merge_and_unload
- test_gralora_merge_unmerge
- test_gralora_multiple_adapters
- test_gralora_dtype_compatibility
- test_gralora_disable_adapters
- test_gralora_trainable_parameters_only
- test_gralora_save_pretrained_files
- test_gralora_safe_merge_success
- test_gralora_safe_merge_detects_nan
- test_gralora_cpu_fp16_merge
- test_gralora_cpu_bf16_merge
- test_gralora_delete_adapter
- test_gralora_delete_nonexistent_adapter_raises
- test_gralora_unload_without_merge
- test_gralora_merge_with_adapter_names
- test_gralora_enable_disable_adapter_layers
- test_gralora_forward_with_merged_adapter
- test_gralora_forward_with_disable_adapters_and_merged
- test_gralora_add_non_active_adapter
I have simply removed test_gralora.py and integrated its tests into the existing files.
@BenjaminBossan I’ve resolved all of your comments and applied the suggested changes. The main update is that I removed tests/test_gralora.py and integrated the related test cases into the existing test_initialization and test_custom_models files, including additional scenarios for Hybrid GraLoRA.
Summary
We opened the initial PR for the GraLoRA method (a granular low-rank adaptation that improves expressive power and outlier handling, selected as a NeurIPS 2025 Spotlight), based on #2636.
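A minimal usage sketch (assuming the API mirrors other PEFT tuners once this PR is merged; the model id and target modules are just examples):

```python
from transformers import AutoModelForCausalLM
from peft import GraloraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")
config = GraloraConfig(
    r=32,             # total rank, split across gralora_k x gralora_k sub-blocks
    gralora_k=2,      # number of sub-blocks per weight dimension
    hybrid_r=0,       # > 0 would additionally enable a vanilla LoRA component (hybrid mode)
    target_modules=["q_proj", "v_proj"],
)
model = get_peft_model(model, config)
model.print_trainable_parameters()
```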
Test Codes