Integrate ABBA to LoHA #2842
Conversation
…feature. This update introduces additional parameters `r1` and `r2` for specifying separate ranks for the Hadamard components in the LoHa configuration. The `use_khatri_rao` flag is added to enable Khatri-Rao product optimization, reducing memory overhead during weight updates. The initialization method is also enhanced to support ABBA-style initialization, allowing for more efficient weight management. These changes improve flexibility and performance in model training.
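For orientation, here is a hedged sketch of how these options could be combined, using the parameter names from this initial commit (they are revised later in the review, e.g. `r1` is eventually dropped and `use_khatri_rao` gains an "auto" value); the model name and target modules are placeholders, not part of the PR:

```python
# Hypothetical usage based on the parameters described above; not the final API of this PR.
from transformers import AutoModelForCausalLM
from peft import LoHaConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")  # placeholder model
config = LoHaConfig(
    r=16,
    r1=8,                   # rank of the first Hadamard component (w1a @ w1b)
    r2=8,                   # rank of the second Hadamard component (w2a @ w2b)
    init_weights="abba",    # ABBA-style SVD-based initialization
    use_khatri_rao=True,    # Khatri-Rao product to reduce memory during weight updates
    target_modules=["q_proj", "v_proj"],  # placeholder target modules
)
peft_model = get_peft_model(base_model, config)
peft_model.print_trainable_parameters()
```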
@Hyperakan In addition to the type hints and docstrings we talked about earlier, could you also test it with the method comparison suite, like in this comment?
Thanks a lot for working on the ABBA method. This contribution already looks quite clean and well integrated, well done.
I found a couple of smaller issues, please check my comments. Also, as a next step, could you please:
- Add documentation (either extend LoHa or add a separate section for ABBA).
- Extend the tests: For a good start, add some ABBA settings here.
Pinging @RaghavSinghal10: if you have the opportunity, it would be great if you could review, with a focus on the algorithm itself 🙏.
src/peft/tuners/loha/config.py
Outdated
```python
r1: Optional[int] = field(
    default=None,
    metadata={
        "help": (
            "Rank for the first Hadamard component (w1a @ w1b). "
            "If not specified, defaults to r/2 for ABBA-style initialization, or r otherwise."
        )
    },
)
r2: Optional[int] = field(
    default=None,
    metadata={
        "help": (
            "Rank for the second Hadamard component (w2a @ w2b). "
            "If not specified, defaults to r/2 for ABBA-style initialization, or r otherwise."
        )
    },
)
```
I see that these two parameters are described in the paper, but I think it's too much to have 3 parameters to set the rank. I would suggest either:
- Drop r1 and r2, always use r1=r2=r.
- If it's important to have r1!=r2 as an option, I would drop r1, use r and r2 only, where r takes the role of r1 and r2 defaults to r.
Unless I'm missing something, I wouldn't use r/2 as default, as I think it's confusing that changing from LoHa to ABBA halves the effective rank.
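A minimal sketch of what the second option could look like in the config dataclass (field names assumed; not the PR's final code):

```python
# Sketch only: `r` doubles as the rank of the first Hadamard component and
# `r2` falls back to `r` when the user does not set it.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class LoHaConfigSketch:
    r: int = 8
    r2: Optional[int] = field(
        default=None,
        metadata={"help": "Rank of the second Hadamard component (w2a @ w2b). Defaults to r."},
    )

    def __post_init__(self):
        if self.r2 is None:
            self.r2 = self.r
```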
src/peft/tuners/loha/config.py
Outdated
```python
    },
)
use_khatri_rao: bool = field(
    default=False,
```
Note for the future: If we can observe that the numerical difference from using Khatri-Rao is negligible and it is strictly better memory-wise, I wouldn't be opposed to making it the default.
If we want to keep the default as False, I would, however, change it to "auto": when ABBA is used, "auto" resolves to True, and for LoHa it resolves to False. If the user explicitly sets True or False, that should be respected. As is, it's impossible for a user to use ABBA without use_khatri_rao.
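A small sketch of the proposed "auto" behaviour (function and argument names are assumptions, not the merged implementation):

```python
# "auto" resolves based on the initialization mode; explicit booleans always win.
from typing import Union

def resolve_use_khatri_rao(use_khatri_rao: Union[str, bool], init_weights: Union[str, bool]) -> bool:
    if use_khatri_rao == "auto":
        return init_weights == "abba"  # ABBA -> True, plain LoHa -> False
    return bool(use_khatri_rao)  # respect an explicit True/False from the user

assert resolve_use_khatri_rao("auto", "abba") is True
assert resolve_use_khatri_rao("auto", True) is False   # ordinary LoHa init
assert resolve_use_khatri_rao(False, "abba") is False  # user override is respected
```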
```python
self.use_khatri_rao = {}

# Store separate ranks for ABBA (r1 for first component, r2 for second)
self.r1 = {}
self.r2 = {}
```
These new parameters have to be added to other_param_names (see line 28).
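For illustration, registering the new per-adapter settings could look roughly like this (the existing tuple contents are paraphrased from memory of the LoHa layer and may not match the file exactly):

```python
# Existing per-adapter bookkeeping in the LoHa layer (approximate).
other_param_names = ("r", "alpha", "scaling", "rank_dropout", "module_dropout")

# Requested change: list the new ABBA-related dicts as well, so that PEFT's
# add/delete/set-adapter machinery handles them together with the rest.
other_param_names = other_param_names + ("r1", "r2", "use_khatri_rao")
```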
src/peft/tuners/loha/layer.py
Outdated
```python
    self.hada_w2_a[adapter_name].data.copy_(U_r2 * fourth_root_S2)
    self.hada_w2_b[adapter_name].data.copy_(fourth_root_S2.unsqueeze(1) * Vh_r2)

except Exception as e:
```
IMO, there is no need to catch the exception. As a user, if I explicitly indicated that I want ABBA and it fails, I would rather get an error than a warning.
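A sketch of the suggested behaviour, with illustrative names (not the PR's exact initialization code):

```python
# Let a failing ABBA initialization surface as an error instead of a warning.
def abba_init_sketch(layer, adapter_name: str) -> None:
    weight = layer.get_base_layer().weight
    if weight.ndim != 2:
        # fail loudly: the user explicitly asked for init_weights="abba"
        raise ValueError(
            f"ABBA initialization expects a 2D weight, got shape {tuple(weight.shape)}"
        )
    # ... SVD-based factorization of `weight` follows here, without a try/except wrapper
```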
src/peft/tuners/loha/layer.py
Outdated
```python
self.scaling[adapter_name] = alpha / r  # Original scaling (for backward compatibility)

# ABBA paper: separate scaling factors α/√r₁ and α/√r₂
import math
```
Move the import to root level.
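Illustratively, the change amounts to hoisting the import and keeping the per-component scalings from the diff above (the helper name here is hypothetical; this is not the verbatim PR code):

```python
# Module level of the layer file (sketch)
import math

def compute_abba_scalings(alpha: float, r1: int, r2: int) -> tuple:
    # ABBA paper: separate scaling factors alpha / sqrt(r1) and alpha / sqrt(r2)
    return alpha / math.sqrt(r1), alpha / math.sqrt(r2)

# e.g. scaling1[adapter_name], scaling2[adapter_name] = compute_abba_scalings(alpha, r1, r2)
```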
src/peft/tuners/loha/layer.py
Outdated
```python
self.scaling1 = getattr(self, 'scaling1', {})
self.scaling2 = getattr(self, 'scaling2', {})
```
These attributes should be set during __init__ and added to other_param_names.
- `r` now takes the role of `r1`, with `r2` defaulting to `r`
- Added new parameters to `other_param_names`
- ABBA now raises an error instead of a warning on failure
- Changed `use_khatri_rao` default to "auto":
  - Automatically set to True for ABBA
  - Automatically set to False for LoHa
  - Respects user-defined True/False values

Next steps:
- Add documentation
- Extend test coverage
Thanks a lot, @monatis and @Hyperakan, for adding ABBA into PEFT!! Sure, I will review the implementation. Regarding the documentation, having a separate section for ABBA would be way more preferred from our end - we can help with this as well if you want. Thanks again!
I’ve implemented the requested changes. All tests are passing now, and the test coverage can be extended further.
Thanks for the updates, the PR looks quite good. Could you please run make style to satisfy ruff?
Also, let's add the documentation for ABBA. One suggestion is that @RaghavSinghal10 writes the docs and I would later add him as co-author, if you're both fine with that.
> All tests are passing now, and the test coverage can be extended further.

Let's extend test_stablediffusion.py and test_vision_models.py. They already have a LoHa config, so just copying it and enabling ABBA should be enough.
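For example, the copied ABBA variant of the existing LoHa test settings could look roughly like this (the dict structure follows the `text_encoder`/`unet` kwargs referenced later in this thread, but the ranks and target module names are assumptions, not the verbatim test files):

```python
# Hypothetical ABBA entry for test_stablediffusion.py, copied from the existing
# LoHa settings with the ABBA-specific options switched on.
LOHA_ABBA_TESTING_KWARGS = {
    "text_encoder": {
        "r": 8,
        "alpha": 32,
        "target_modules": ["k_proj", "q_proj", "v_proj", "out_proj", "fc1", "fc2"],
        "init_weights": "abba",
        "use_khatri_rao": "auto",
    },
    "unet": {
        "r": 8,
        "alpha": 32,
        "target_modules": ["proj_in", "proj_out", "to_k", "to_q", "to_v"],
        "init_weights": "abba",
        "use_khatri_rao": "auto",
    },
}
```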
This works for me - I can go ahead and write the documentation if it works for @Hyperakan.
That works for me as well, thanks!
…ls.py

Also formatted code using make style.
Force-pushed 62dcb13 to 6e7a2c0.
Now only the following test is failing: tests/test_stablediffusion.py::TestStableDiffusionModel::test_merge_layers_safe_merge (the full failure is quoted in the next review comment). I couldn’t identify the root cause of the mismatch. It fails for the regular LoHa configuration (i.e. init_weights = False), but it doesn’t fail in the original codebase.
Thanks for the updates @Hyperakan. When I run the tests locally with GPU, some ABBA tests with Conv1D layers are failing (`pytest tests/test_custom_models.py -k "abba and conv1d"`):

```
FAILED tests/test_custom_models.py::TestPeftCustomModel::test_merge_layers[Conv1d ABBA 3-Conv1dBigger-LoHaConfig-config_kwargs84] - AssertionError
FAILED tests/test_custom_models.py::TestPeftCustomModel::test_safe_merge[Conv1d ABBA 3-Conv1dBigger-LoHaConfig-config_kwargs84] - AssertionError
FAILED tests/test_custom_models.py::TestPeftCustomModel::test_disable_adapters_with_merging[Conv1d ABBA 3-Conv1dBigger-LoHaConfig-config_kwargs84] - AssertionError: assert False
```

This just appears to be a matter of tolerances, so e.g. loosening from 1e-4 to 1e-3 for ABBA should be enough. Could you please check? On CPU, they pass for me.
> Now only the following test is failing: FAILED tests/test_stablediffusion.py::TestStableDiffusionModel::test_merge_layers_safe_merge[LoHaConfig-config_kwargs2-hf-internal-testing/tiny-sd-pipe] - AssertionError: False, 1.0
>
> I couldn’t identify the root cause of the mismatch. It fails for the regular LoHa configuration (i.e. init_weights = False), but it doesn’t fail in the original codebase.
This was quite a rabbit hole. Indeed, the addition of the ABBA tests results in another LoHa test failing. That test passes when run in isolation, so the ABBA test must have some kind of side effect that makes the LoHa test fail. After some investigation, it's actually caused by this line in `HadaWeightKR.forward`:

```python
diff_weight = torch.empty(m, n, dtype=w1a.dtype, device=w1a.device)
```

For some reason unclear to me, if we replace this with `torch.zeros`, the test passes. At first I thought that the loop could be wrong and some empty data was being returned, which somehow stays allocated and thus affects the subsequent tests (LoHa uses `torch.empty` for weight initialization), but AFAICT there is no issue with the loop. Calling `torch.cuda.empty_cache()` between tests doesn't solve the issue either. Therefore, I have no idea what the underlying cause is. Probably it's somewhere else in the LoHa code (or even a PyTorch bug?) but I can't see anything.

For now, I would probably change the `torch.empty` call to `torch.zeros` to be safe, even though that's of course more expensive. LMK if you have any ideas. Update: Another idea: collect the chunks in a list and concat them at the end. Update 2: I tried the idea of collecting and concatenating the chunks, thus avoiding `torch.empty`, but the old test still fails, what the heck?? It's got to be some kind of CUDA side effect, as running this on CPU works.
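To make the chunk-collection idea concrete, here is a minimal forward-only sketch, assuming the Khatri-Rao identity (w1a @ w1b) * (w2a @ w2b) == (w1a ⊙row w2a) @ (w1b ⊙col w2b); it is not the PR's `HadaWeightKR` code and it ignores the scale factors and the custom backward:

```python
import torch

def khatri_rao_delta(w1a, w1b, w2a, w2b, chunk_size=256):
    # w1a: (m, r1), w1b: (r1, n), w2a: (m, r2), w2b: (r2, n)
    r1, n = w1b.shape
    r2 = w2b.shape[0]
    # column-wise Khatri-Rao of the "B" factors: shape (r1 * r2, n)
    kr_b = (w1b.unsqueeze(1) * w2b.unsqueeze(0)).reshape(r1 * r2, n)
    chunks = []
    for w1a_c, w2a_c in zip(w1a.split(chunk_size, dim=0), w2a.split(chunk_size, dim=0)):
        # row-wise (face-splitting) Khatri-Rao of the "A" factors: shape (chunk, r1 * r2)
        kr_a = (w1a_c.unsqueeze(2) * w2a_c.unsqueeze(1)).reshape(-1, r1 * r2)
        chunks.append(kr_a @ kr_b)
    # collect the fully written chunks and concatenate, instead of filling a
    # pre-allocated torch.empty buffer in place
    return torch.cat(chunks, dim=0)

# sanity check against the naive Hadamard-of-products formulation
w1a, w1b = torch.randn(64, 4), torch.randn(4, 32)
w2a, w2b = torch.randn(64, 6), torch.randn(6, 32)
expected = (w1a @ w1b) * (w2a @ w2b)
assert torch.allclose(khatri_rao_delta(w1a, w1b, w2a, w2b, chunk_size=16), expected, atol=1e-5)
```

Each chunk here is fully written before being appended, so no uninitialized buffer is ever exposed; as noted in Update 2, though, this variant reportedly did not fix the failing test either.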
Apart from that, just two small comments.
src/peft/tuners/loha/layer.py
Outdated
| """ | ||
|
|
||
| @staticmethod | ||
| def forward(ctx, w1a, w1b, w2a, w2b, scale1=torch.tensor(1), scale2=torch.tensor(1)): |
Suggested change:

```diff
-    def forward(ctx, w1a, w1b, w2a, w2b, scale1=torch.tensor(1), scale2=torch.tensor(1)):
+    def forward(ctx, w1a, w1b, w2a, w2b, scale1, scale2):
```
Not necessary, right?
tests/test_stablediffusion.py
Outdated
```python
assert np.allclose(peft_output, merged_output, atol=1.0)
# Increase tolerance for LoHa when init_weights is set to "abba"
if config_cls == LoHaConfig and original_config_kwargs.get("text_encoder", {}).get("init_weights") == "abba":
    atol = 15.0
```
This seems a bit concerning, any idea why ABBA is much more susceptible here?
@Hyperakan I found a better solution to the failing stable diffusion test:

```python
# test_stablediffusion.py
import torch
[...]

class TestStableDiffusionModel(PeftCommonTester):
    [...]

    @pytest.fixture(autouse=True)
    def autofixture(self):
        # Disable TF32 in cudnn
        prev_value = torch.backends.cudnn.allow_tf32
        try:
            torch.backends.cudnn.allow_tf32 = False
            yield
        finally:
            torch.backends.cudnn.allow_tf32 = prev_value
            torch.cuda.empty_cache()
```

For some reason, disabling TF32 here resolves the mismatch.
Quick question - where all do I need to update the documentation? I was planning to fill in this page (https://github.com/huggingface/peft/blob/main/docs/source/developer_guides/lora.md) and add an abba.md file under https://github.com/huggingface/peft/tree/main/docs/source/package_reference. Would that be sufficient, @BenjaminBossan? Thanks in advance!
This wouldn't be a good fit, as it only talks about LoRA.
Yes, please put a file for ABBA there, check the other files to see what content is expected. Also mention that it's implemented in terms of LoHa. Don't forget to add an entry in the _toctree.yml as well.
…rd method

Additionally, added a fixture in test_stablediffusion.py to disable TF32 in cudnn for consistent testing behavior.
After implementing your suggested solution, I didn’t need to set a special atol value for abba. It's passing all the tests now.
Great, all the better. It seems like there are some merge conflicts with the latest main branch, could you please merge it, @Hyperakan?
This PR integrates ABBA into LoHa, based on the discussion here.
Raising it to track the progress more easily.