I am seeing a dispatch error when using a 4-bit quantised model. Note that this happens when instantiating a LanguageModel from an already existing transformer model loaded in 4-bit. Also note that the 4-bit weights should only reside on the GPU and cannot be moved to the CPU.
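Since the report hinges on where the quantised weights live, a quick device check can help narrow things down. The following is only a minimal sketch: `device_summary` and `params_off_gpu` are hypothetical helper names, not part of nnsight or transformers, and they assume the object passed in exposes the standard `named_parameters()` method of a `torch.nn.Module`.

```python
from collections import Counter


def device_summary(model):
    """Count parameters per device for any object exposing named_parameters().

    Hypothetical helper for illustration; not part of nnsight or transformers.
    """
    counts = Counter()
    for _name, param in model.named_parameters():
        # str(torch.device) gives strings such as "cuda:0", "cpu" or "meta".
        counts[str(param.device)] += 1
    return dict(counts)


def params_off_gpu(model):
    """Return the names of parameters whose device string does not start with 'cuda'."""
    return [
        name
        for name, param in model.named_parameters()
        if not str(param.device).startswith("cuda")
    ]
```

Calling `device_summary(model)` on the `model` from the failing example, before wrapping it in `LanguageModel`, would show whether any parameter has already ended up off the GPU.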
Working Example
from nnsight import LanguageModel

# Let nnsight load the 4-bit model itself.
nnsight_model = LanguageModel("gpt2", device_map="auto", load_in_4bit=True)

with nnsight_model.trace('The Eiffel Tower is in the city of') as tracer:
    hidden_states = nnsight_model.transformer.h[0].mlp.act.output[0].clone().save()
Failing Example
from nnsight import LanguageModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the 4-bit model with transformers first, then wrap it.
model = AutoModelForCausalLM.from_pretrained("gpt2", device_map="auto", load_in_4bit=True)
tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token

nnsight_model = LanguageModel(model, tokenizer=tokenizer)

with nnsight_model.trace('The Eiffel Tower is in the city of') as tracer:
    hidden_states = nnsight_model.transformer.h[0].mlp.act.output[0].clone().save()
In fact, the first method only works for the first call. For example, the following code, which runs the same trace twice, fails:
from nnsight import LanguageModel

nnsight_model = LanguageModel("gpt2", device_map="auto", load_in_4bit=True)

# First call
with nnsight_model.trace('The Eiffel Tower is in the city of') as tracer:
    hidden_states = nnsight_model.transformer.h[0].mlp.act.output[0].clone().save()

# Second call on the same model
with nnsight_model.trace('The Eiffel Tower is in the city of') as tracer:
    hidden_states = nnsight_model.transformer.h[0].mlp.act.output[0].clone().save()
Info
The Error
The error can be found in this illustrative notebook: https://colab.research.google.com/drive/1n9A7MF8JE2lf26e9gOXRi2HaDjl4DjgX?usp=sharing