Replies: 2 comments
-
Hey, this sounds like an interesting project, looking forward to your implementation! It seems like you already summarized the current limitations quite accurately. Leveraging the ForwardContext would be the way to go for adding additional forward arguments to existing models. Unfortunately, the ForwardContext is only applied to the forward pass of the base model classes, so, tl;dr: currently there's no elegant way to add this for all models with prediction heads. What I could look into is adapting the current ForwardContext logic so that the classes with heads get their forward method wrapped by default, while avoiding the creation of a second context in the base model forward.
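To make that idea a bit more concrete, here is a simplified, self-contained sketch of the pattern (this is not the actual adapters implementation; the class and function names are purely illustrative): both the heads class and the base model class could have their forward wrapped, but only the outermost wrapper actually opens a context.

```python
# Simplified, illustrative sketch -- NOT the actual adapters ForwardContext code.
import functools
import threading

_local = threading.local()


class SimpleForwardContext:
    """Holds extra forward arguments (e.g. task_ids) for the duration of one forward pass."""

    def __init__(self, **kwargs):
        self.kwargs = kwargs
        self._reused = False

    def __enter__(self):
        # Reuse an already-open context instead of creating a nested second one.
        existing = getattr(_local, "ctx", None)
        if existing is not None:
            self._reused = True
            return existing
        _local.ctx = self
        return self

    def __exit__(self, *exc):
        if not self._reused:
            _local.ctx = None


def wrap_forward(forward):
    """Wrap a forward method so custom kwargs are stored in the context, not passed on."""

    @functools.wraps(forward)
    def wrapped(self, *args, task_ids=None, **kwargs):
        with SimpleForwardContext(task_ids=task_ids):
            return forward(self, *args, **kwargs)

    return wrapped
```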
-
@FrLdy Update on this: I've drafted a PR here: #789 to make it easier to pass custom args to a model via the ForwardContext. Feel free to give it a try and let me know if this helps!
-
Hello everyone,
I want to implement the paper MTL-LoRA: Low-Rank Adaptation for Multi-Task Learning.
I read in the documentation that the `Parallel` composition can be used to enable parallel multi-task training. However, it seems that in this particular case, the `BatchSplit` composition might be more suitable for routing tasks to the appropriate LoRA module.

I started extending the `BatchSplit` concept into a dynamic version that uses a `task_ids` parameter in the `forward` method, storing it in the context via `forward_context` (a toy sketch of the intended routing behaviour is at the end of this post).

To follow @calpt's recommendation, I decided to rely on Hugging Face's classes through `adapters.init`. However, when adding and running tests for `test_adapter_composition` (which uses `transformers.BertForSequenceClassification`), I noticed that Hugging Face model classes do not accept `**kwargs` in their method signatures. This prevents simply passing a new parameter like `task_ids` and handling it through `forward_context`.

Additionally, I also tried creating a new class inheriting from `ModelBaseAdaptersMixin`, inspired by `adapters.T5ForConditionalGenerationWithHeadsMixin`, to ensure the context is initialized during the first `forward` call. However, this results in two contexts being created.

Do you think there's an elegant way to implement this functionality, or would it be necessary to override the `forward` methods of each model to properly handle the additional parameter?

Any guidance or recommendations would be greatly appreciated. 😃
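For reference, here is a minimal, self-contained toy sketch of the routing behaviour I have in mind (plain PyTorch; `MultiTaskLoRALinear` and its arguments are names I made up for this example and are not part of adapters or of the MTL-LoRA paper):

```python
# Toy illustration of the "dynamic BatchSplit" idea: each row of a mixed-task batch
# is routed to its own LoRA weights based on task_ids. Pure PyTorch, no adapters classes.
import torch
import torch.nn as nn


class MultiTaskLoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, n_tasks: int, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        self.scaling = alpha / r
        self.lora_A = nn.ParameterList(
            [nn.Parameter(torch.randn(r, base.in_features) * 0.01) for _ in range(n_tasks)]
        )
        self.lora_B = nn.ParameterList(
            [nn.Parameter(torch.zeros(base.out_features, r)) for _ in range(n_tasks)]
        )

    def forward(self, x: torch.Tensor, task_ids: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_features); task_ids: (batch,) with values in [0, n_tasks)
        out = self.base(x)
        delta = torch.zeros_like(out)
        for t in task_ids.unique():
            mask = task_ids == t
            a, b = self.lora_A[int(t)], self.lora_B[int(t)]
            delta[mask] = (x[mask] @ a.T @ b.T) * self.scaling
        return out + delta


# Usage: rows 0 and 3 belong to task 0, row 1 to task 2, row 2 to task 1.
layer = MultiTaskLoRALinear(nn.Linear(16, 16), n_tasks=3)
x = torch.randn(4, 16)
task_ids = torch.tensor([0, 2, 1, 0])
y = layer(x, task_ids)  # shape (4, 16)
```

The open question is only how `task_ids` reaches such a module cleanly through the Hugging Face forward signatures, ideally via the ForwardContext rather than by overriding every model's `forward`.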