Pluggable Model Integration Interface #738

Merged
merged 41 commits into from
Mar 2, 2025
41 commits
6366685
Init pluggable interface
calpt Aug 24, 2024
f004a10
Test fixes
calpt Aug 25, 2024
3c4c791
doc
calpt Aug 25, 2024
9cef6c6
style
calpt Aug 25, 2024
2171409
Fix iter_layers & add tests
lenglaender Aug 29, 2024
89e7aa7
fix
calpt Sep 15, 2024
31c8b2a
Merge branch 'main' into dev/interface
calpt Dec 23, 2024
5305c36
wip: bottleneck
calpt Dec 23, 2024
0d623ef
Minimal working bottleneck plugin version
calpt Dec 24, 2024
5334521
style
calpt Dec 24, 2024
62c1f83
attr fix
calpt Dec 24, 2024
1f0d3ef
add emb training support
calpt Dec 24, 2024
6338a79
style
calpt Dec 24, 2024
3e90a3a
simple prompt tuning implementation
calpt Dec 25, 2024
ea85e43
Extended interface for more bottleneck support
calpt Dec 25, 2024
f4d7967
load_model() concurrency fix
calpt Dec 25, 2024
652b562
fix init for replaced classes
calpt Dec 26, 2024
e26c425
Merge branch 'main' into dev/interface
calpt Dec 26, 2024
05e8b2a
clean up adapters init
calpt Dec 26, 2024
b01cf6d
fixes
calpt Dec 26, 2024
535dd9c
Add `supports_adapter()` method
calpt Jan 5, 2025
7d346db
Rename AdapterType -> AdapterMethod. Test fixes.
calpt Jan 6, 2025
3a9e702
WIP: invertible adapters support
calpt Jan 6, 2025
074ca66
Add invertible output layer. Test fixes.
calpt Jan 6, 2025
5596bf1
Merge branch 'main' into dev/interface
calpt Jan 9, 2025
ec4ae1a
Merge branch 'main' into dev/interface
calpt Jan 29, 2025
c9098bd
Fix test after refactoring
calpt Feb 3, 2025
f7d59c0
remove code
calpt Feb 3, 2025
cbf74a9
style
calpt Feb 3, 2025
788bc8d
Save & load adapter interface with full model
calpt Feb 8, 2025
3d085fd
rename adapter_types -> adapter_methods
calpt Feb 9, 2025
f3c43a5
Add documentation
calpt Feb 9, 2025
9ae0afe
Update docs/model_overview.md
calpt Feb 10, 2025
788a76f
Update docs/plugin_interface.md
calpt Feb 10, 2025
da47767
test fixes
calpt Feb 17, 2025
59f4a61
style
calpt Feb 17, 2025
7da9dc9
revert
calpt Feb 17, 2025
984745d
Merge branch 'main' into dev/interface
calpt Mar 1, 2025
a9d318e
fix some errors
lenglaender Mar 2, 2025
0f3c1e8
fix
lenglaender Mar 2, 2025
eb2f03a
Override test_get_adapter correctly
calpt Mar 2, 2025
8 changes: 8 additions & 0 deletions docs/classes/adapter_model_interface.rst
@@ -0,0 +1,8 @@
Adapter Model Interface
=======================

.. autoclass:: adapters.AdapterModelInterface
:members:

.. autoclass:: adapters.AdapterMethod
:members:
7 changes: 7 additions & 0 deletions docs/contributing/adding_adapters_to_a_model.md
@@ -1,4 +1,11 @@
# Adding Adapters to a Model

```{eval-rst}
.. important::
For most use cases, it can be much easier to support a new model architecture via the new adapter plugin interface.
Check out `Custom Models <../plugin_interface.html>`_ for more.
```

This document gives an overview of how new model architectures of Hugging Face Transformers can be supported by `adapters`.
Before delving into implementation details, you should familiarize yourself with the main design philosophies of `adapters`:

4 changes: 3 additions & 1 deletion docs/index.rst
@@ -52,7 +52,6 @@ Currently, we support the PyTorch versions of all models as listed on the `Model
merging_adapters
prediction_heads
embeddings
extending

.. toctree::
:maxdepth: 2
@@ -66,6 +65,7 @@ Currently, we support the PyTorch versions of all models as listed on the `Model
:caption: Supported Models

model_overview
plugin_interface
classes/models/albert
classes/models/auto
classes/models/bart
@@ -99,6 +99,7 @@ Currently, we support the PyTorch versions of all models as listed on the `Model
classes/adapter_config
classes/model_adapters_config
classes/adapter_layer
classes/adapter_model_interface
classes/model_mixins
classes/adapter_training
classes/adapter_utils
@@ -110,6 +111,7 @@ Currently, we support the PyTorch versions of all models as listed on the `Model
contributing
contributing/adding_adapter_methods
contributing/adding_adapters_to_a_model
extending

Citation
========
7 changes: 5 additions & 2 deletions docs/model_overview.md
@@ -13,6 +13,7 @@ The table below further shows which model architectures support which adaptation

| Model | (Bottleneck)<br> Adapters | Prefix<br> Tuning | LoRA | Compacter | Adapter<br> Fusion | Invertible<br> Adapters | Parallel<br> block | Prompt<br> Tuning | ReFT |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| [Custom models](plugin_interface.html) | ✅(°) | | ✅ | ✅ | ✅ | ✅ | | ✅ | ✅ |
| [ALBERT](classes/models/albert.html) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| [BART](classes/models/bart.html) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | ✅ |
| [BEIT](classes/models/beit.html) | ✅ | ✅ | ✅ | ✅ | ✅ | | | ✅ | ✅ |
@@ -38,9 +39,11 @@ The table below further shows which model architectures support which adaptation
| [XLM-RoBERTa](classes/models/xlmroberta.html) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| [X-MOD](classes/models/xmod.html) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |

(°) `original_ln_after=False` is unsupported for bottleneck configs.
(*) If the used encoder and decoder model classes are supported.

**Missing a model architecture you'd like to use?**
adapters can be easily extended to new model architectures as described in [Adding Adapters to a Model](https://docs.adapterhub.ml/contributing/adding_adapters_to_a_model.html).
**Missing a model architecture you'd like to use?**
The new model plugin interface makes it easy to support new transformer models with just a few lines of code. [Learn more](plugin_interface.md).
Also, _Adapters_ can be extended to new model architectures as described in [Adding Adapters to a Model](https://docs.adapterhub.ml/contributing/adding_adapters_to_a_model.html).
Feel free to [open an issue](https://github.com/Adapter-Hub/adapters/issues) requesting support for a new architecture.
_We very much welcome pull requests adding new model implementations!_
94 changes: 94 additions & 0 deletions docs/plugin_interface.md
@@ -0,0 +1,94 @@
# Custom Models

The _Adapters_ library provides a simple mechanism for integrating adapter methods into any available _Transformers_ model - including custom architectures.
This can be accomplished by defining a plugin interface instance of [`AdapterModelInterface`](adapters.AdapterModelInterface).
The following example shows what this looks like for Gemma 2:

```python
import adapters
from adapters import AdapterModelInterface
from transformers import AutoModelForCausalLM

plugin_interface = AdapterModelInterface(
    adapter_methods=["lora", "reft"],
    model_embeddings="embed_tokens",
    model_layers="layers",
    layer_self_attn="self_attn",
    layer_cross_attn=None,
    attn_k_proj="k_proj",
    attn_q_proj="q_proj",
    attn_v_proj="v_proj",
    attn_o_proj="o_proj",
    layer_intermediate_proj="mlp.up_proj",
    layer_output_proj="mlp.down_proj",
)

model = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b-it", token="<YOUR_TOKEN>")
adapters.init(model, interface=plugin_interface)

model.add_adapter("my_adapter", config="lora")

print(model.adapter_summary())
```

## Walkthrough

Let's go through what happens in the example above step by step:

**1. Define adapter methods to plug into a model:**
The `adapter_methods` argument is the central parameter to configure which adapters will be supported in the model.
Here, we enable all LoRA and ReFT based adapters.
See [`AdapterMethod`](adapters.AdapterMethod) for valid options to specify here.
Check out [Adapter Methods](methods.md) for a detailed explanation of the methods.
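For reference, these identifiers are plain strings defined on the `AdapterMethod` class added in this PR, so the list above can equivalently be built from the class attributes. A minimal sketch:

```python
from adapters import AdapterMethod

# The method identifiers are plain strings:
assert AdapterMethod.lora == "lora"
assert AdapterMethod.reft == "reft"

# Equivalent to adapter_methods=["lora", "reft"] in the interface definition above:
adapter_methods = [AdapterMethod.lora, AdapterMethod.reft]
```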

**2. Define layer and module names:**
While all Transformers layers share similar basic components, their implementation can differ in terms of subtleties such as module names.
Therefore, the [`AdapterModelInterface`](adapters.AdapterModelInterface) needs to translate the model-specific module structure into a common set of access points for adapter implementations to hook in.
The remaining attributes in the definition above serve this purpose.
Their attribute names follow a common syntax that specifies their location and purpose:
- The initial part before the first "_" defines the base module relative to which the name should be specified.
- The remaining part after the first "_" defines the functional component.

E.g., `model_embeddings` identifies the embeddings layer (functional component) relative to the base model (location).
`layer_output_proj` identifies the FFN output projection relative to one Transformer layer.
Each attribute value may specify a direct submodule of the reference module (`"embed_tokens"`) or a multi-level path starting at the reference module (`"mlp.down_proj"`).
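To make the path convention concrete, here is an illustrative sketch of how such a dotted path resolves against a model's module tree (`resolve_module` is a hypothetical helper shown for explanation only, not part of the _Adapters_ API):

```python
# Illustrative only: resolving a dotted interface path against a module tree.
def resolve_module(base_module, dotted_path: str):
    module = base_module
    for name in dotted_path.split("."):
        module = getattr(module, name)
    return module

# Assuming the usual Gemma 2 module layout, layer_output_proj="mlp.down_proj"
# would point at the FFN down-projection of each decoder layer, e.g.:
# resolve_module(model.model.layers[0], "mlp.down_proj")
```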

**3. (optional) Extended interface attributes:**
There are a couple of attributes in the [`AdapterModelInterface`](adapters.AdapterModelInterface) that are only required for some adapter methods.
We don't need those in the above example for LoRA and ReFT, but when supporting bottleneck adapters as well, the full interface would look as follows:
```python
adapter_interface = AdapterModelInterface(
    adapter_methods=["bottleneck", "lora", "reft"],
    model_embeddings="embed_tokens",
    model_layers="layers",
    layer_self_attn="self_attn",
    layer_cross_attn=None,
    attn_k_proj="k_proj",
    attn_q_proj="q_proj",
    attn_v_proj="v_proj",
    attn_o_proj="o_proj",
    layer_intermediate_proj="mlp.up_proj",
    layer_output_proj="mlp.down_proj",
    layer_pre_self_attn="input_layernorm",
    layer_pre_cross_attn=None,
    layer_pre_ffn="pre_feedforward_layernorm",
    layer_ln_1="post_attention_layernorm",
    layer_ln_2="post_feedforward_layernorm",
)
```

**4. Initialize adapter methods in the model:**
Finally, we just need to apply the defined adapter integration to the target model.
This can be achieved using the usual `adapters.init()` method:
```python
adapters.init(model, interface=adapter_interface)
```
Now, you can use (almost) all functionality of the _Adapters_ library on the adapted model instance!
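For instance, the regular adapter workflow applies unchanged; a sketch, assuming the extended `adapter_interface` from step 3 was used so that bottleneck configs are available:

```python
# Standard Adapters workflow on the adapted model:
model.add_adapter("my_adapter", config="seq_bn")  # a bottleneck config, enabled via the interface above
model.train_adapter("my_adapter")                 # freeze base weights, activate the adapter for training
# ... run a normal training loop ...
model.save_adapter("./my_adapter", "my_adapter")  # save only the adapter weights
```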

## Limitations

The following features of the _Adapters_ library are not supported via the plugin interface approach:
- Prefix Tuning adapters
- Parallel composition blocks
- XAdapterModel classes
- Setting `original_ln_after=False` in bottleneck adapter configurations (this affects `AdapterPlusConfig`)
2 changes: 2 additions & 0 deletions src/adapters/__init__.py
@@ -84,6 +84,7 @@
"Seq2SeqLMHead",
"TaggingHead",
],
"interface": ["AdapterMethod", "AdapterModelInterface"],
"methods.adapter_layer_base": ["AdapterLayerBase", "ComposableAdapterLayerBase"],
"model_mixin": [
"EmbeddingAdaptersMixin",
@@ -198,6 +199,7 @@
Seq2SeqLMHead,
TaggingHead,
)
from .interface import AdapterMethod, AdapterModelInterface
from .methods.adapter_layer_base import AdapterLayerBase, ComposableAdapterLayerBase
from .model_mixin import (
EmbeddingAdaptersMixin,
121 changes: 121 additions & 0 deletions src/adapters/interface.py
@@ -0,0 +1,121 @@
import json
import os
from dataclasses import asdict, dataclass
from typing import List, Optional

from transformers.utils import cached_file

from . import __version__
from .utils import INTERFACE_CONFIG_NAME


class AdapterMethod:
    """
    Enum of all supported adapter method types.

    Attributes:
        bottleneck: Adapter methods using bottleneck layers.
        prefix_tuning: Adapter methods based on Prefix Tuning. Note that this is currently unsupported via AdapterModelInterface.
        lora: Adapter methods based on low-rank adaptation.
        prompt_tuning: Adapter methods based on Prompt Tuning.
        reft: Adapter methods based on Representation Fine-Tuning.
        invertible: Adapter methods using invertible modules.
    """

    bottleneck = "bottleneck"
    prefix_tuning = "prefix_tuning"
    lora = "lora"
    prompt_tuning = "prompt_tuning"
    reft = "reft"
    invertible = "invertible"

    @staticmethod
    def get_from_config(config) -> List[str]:
        """
        Get the adapter method types from a given adapter config.

        Args:
            config: The adapter config.

        Returns:
            List[str]: The adapter method types.
        """
        methods = []
        if getattr(config, "inv_adapter", False):
            methods.append(AdapterMethod.invertible)
        if config.architecture is None:
            methods.append(AdapterMethod.bottleneck)
        elif config.architecture == "union":
            for sub_config in config.configs:
                methods.extend(AdapterMethod.get_from_config(sub_config))
        else:
            methods.append(config.architecture)
        return methods


@dataclass
class AdapterModelInterface:
    """
    Defines the main interface for integrating adapter methods into a model class.
    This interface translates generic accessor names to model-specific attribute names.

    Args:
        adapter_methods (List[str]): List of adapter types that are supported by the model.
        model_embeddings (str): Name of the model's embedding layer.
        model_layers (str): Name of the model's layer list.
        layer_self_attn (str): Name of the self-attention layer in a transformer layer.
        layer_cross_attn (str): Name of the cross-attention layer in a transformer layer.
        attn_k_proj (str): Name of the key projection layer in an attention layer.
        attn_q_proj (str): Name of the query projection layer in an attention layer.
        attn_v_proj (str): Name of the value projection layer in an attention layer.
        attn_o_proj (str): Name of the output projection layer in an attention layer.
        layer_intermediate_proj (str): Name of the intermediate projection layer in a transformer layer.
        layer_output_proj (str): Name of the output projection layer in a transformer layer.
        layer_pre_self_attn (Optional[str]): Hook point directly before the self-attention layer. Used for extended bottleneck adapter support.
        layer_pre_cross_attn (Optional[str]): Hook point directly before the cross-attention layer. Used for extended bottleneck adapter support.
        layer_pre_ffn (Optional[str]): Hook point directly before the feed-forward layer. Used for extended bottleneck adapter support.
        layer_ln_1 (Optional[str]): Layer norm *after* the self-attention layer. Used for extended bottleneck adapter support.
        layer_ln_2 (Optional[str]): Layer norm *after* the feed-forward layer. Used for extended bottleneck adapter support.
    """

    adapter_methods: List[str]

    model_embeddings: str
    model_layers: str

    layer_self_attn: str
    layer_cross_attn: str
    attn_k_proj: str
    attn_q_proj: str
    attn_v_proj: str
    attn_o_proj: str

    layer_intermediate_proj: str
    layer_output_proj: str

    # Optional attributes for extended bottleneck adapter support
    layer_pre_self_attn: Optional[str] = None
    layer_pre_cross_attn: Optional[str] = None
    layer_pre_ffn: Optional[str] = None
    layer_ln_1: Optional[str] = None
    layer_ln_2: Optional[str] = None

    def to_dict(self):
        return asdict(self)

    def _save(self, save_directory, model_config):
        config_dict = {
            "model_type": model_config.model_type,
            "interface": self.to_dict(),
            "version": "adapters." + __version__,
        }
        save_path = os.path.join(save_directory, INTERFACE_CONFIG_NAME)
        with open(save_path, "w") as f:
            json.dump(config_dict, f, indent=2, sort_keys=True)

    @classmethod
    def _load(cls, path_or_repo_id: str, **kwargs):
        resolved_file = cached_file(path_or_repo_id, INTERFACE_CONFIG_NAME, **kwargs)
        with open(resolved_file, "r") as f:
            config_dict = json.load(f)
        return AdapterModelInterface(**config_dict["interface"])
14 changes: 14 additions & 0 deletions src/adapters/methods/__init__.py
@@ -0,0 +1,14 @@
from .bottleneck import init_bottleneck
from .invertible import init_invertible_adapters
from .lora import init_lora
from .prompt_tuning import init_prompt_tuning
from .reft import init_reft


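# Maps adapter method identifiers (see AdapterMethod) to their init functions.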
METHOD_INIT_MAPPING = {
    "bottleneck": init_bottleneck,
    "lora": init_lora,
    "prompt_tuning": init_prompt_tuning,
    "reft": init_reft,
    "invertible": init_invertible_adapters,
}