
Commit 42c1753

Add support for QLoRA / QAdapter training via bitsandbytes (#663)

This PR adds support for wrapping bitsandbytes' `Linear4bit` and `Linear8bitLt` quantization layers with our LoRA implementation, enabling the training of LoRA adapters on quantized models in QLoRA style.
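For context, here is a minimal sketch of how the feature is intended to be used, assuming the standard Transformers `BitsAndBytesConfig` quantization path; the checkpoint name, adapter name, and LoRA hyperparameters are illustrative and not taken from this PR:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

import adapters
from adapters import LoRAConfig

# Load the base model with bitsandbytes 4-bit quantization (QLoRA-style NF4).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # illustrative checkpoint, not part of this PR
    quantization_config=bnb_config,
    device_map="auto",
)

# Enable adapter support, then add and activate a trainable LoRA adapter.
# The LoRA weights wrap the quantized Linear4bit modules and are trained in full precision
# while the quantized base weights stay frozen.
adapters.init(model)
model.add_adapter("qlora_adapter", config=LoRAConfig(r=8, alpha=16))
model.train_adapter("qlora_adapter")
```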
1 parent 233db31 · commit 42c1753

9 files changed (+854 -20 lines)
README.md (+1)

@@ -155,6 +155,7 @@ Currently, adapters integrates all architectures and methods listed below:
 | (IA)^3 | [Liu et al. (2022)](https://arxiv.org/pdf/2205.05638.pdf) | [Docs](https://docs.adapterhub.ml/methods.html#ia-3) |
 | UniPELT | [Mao et al. (2022)](https://arxiv.org/pdf/2110.07577.pdf) | [Docs](https://docs.adapterhub.ml/method_combinations.html#unipelt) |
 | Prompt Tuning | [Lester et al. (2021)](https://aclanthology.org/2021.emnlp-main.243/) | [Docs](https://docs.adapterhub.ml/methods.html#prompt-tuning) |
+| QLoRA | [Dettmers et al. (2023)](https://arxiv.org/pdf/2305.14314.pdf) | [Notebook](https://colab.research.google.com/github/Adapter-Hub/adapters/blob/main/notebooks/QLoRA_Llama_Finetuning.ipynb) |
 
 ## Supported Models
 
docs/index.rst (+1 -1)

@@ -28,7 +28,7 @@ The framework consists of two main components:
 Currently, we support the PyTorch versions of all models as listed on the `Model Overview <model_overview.html>`_ page.
 
 .. toctree::
-   :maxdepth: 1
+   :maxdepth: 2
    :caption: Getting Started
 
    installation
docs/quickstart.md (+1 -1)

@@ -14,7 +14,7 @@ In the following, we will briefly go through some examples to showcase these met
 `the 'Usage' section in Hugging Face's documentation <https://huggingface.co/docs/transformers/main/en/quicktour>`_.
 ```
 
-## Initialize Model with Adapters
+## Initialize a Model with Adapters
 
 The `XAdapterModel` is the recommended model for training and inference of adapters:
 
docs/training.md (+6)

@@ -215,3 +215,9 @@ trainer = AdapterTrainer(
 When you migrate from the previous versions, which use the Trainer class for adapter training and fully fine-tuning, note that the
 specialized AdapterTrainer class does not have the parameters `do_save_full_model`, `do_save_adapters` and `do_save_adapter_fusion`.
 ```
+
+## Quantized Model Training
+
+_Adapters_ supports fine-tuning of quantized language models similar to [QLoRA (Dettmers et al., 2023)](https://arxiv.org/pdf/2305.14314.pdf) via the `bitsandbytes` library integrated into Transformers.
+Quantized training is supported for LoRA-based adapters as well as bottleneck adapters and prefix tuning.
+Please refer to [this notebook](https://colab.research.google.com/github/Adapter-Hub/adapters/blob/main/notebooks/QLoRA_Llama_Finetuning.ipynb) for a hands-on guide.
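As a rough illustration of what the new docs section describes (quantized training is not limited to LoRA), the following sketch pairs an 8-bit quantized model with a bottleneck adapter; the checkpoint and adapter names are placeholders, not taken from this commit:

```python
import adapters
from adapters import SeqBnConfig
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Quantize the frozen base weights to 8-bit via bitsandbytes.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # placeholder checkpoint
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)

# Add a sequential bottleneck adapter on top of the quantized model; only the
# adapter parameters are trained, the 8-bit base weights stay frozen.
adapters.init(model)
model.add_adapter("bn_adapter", config=SeqBnConfig())
model.train_adapter("bn_adapter")

# The model can now be passed to AdapterTrainer like any other adapter setup.
```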
