
Dispatch Error When Using Quantisation #106

Open
Xmaster6y opened this issue Apr 7, 2024 · 1 comment
Labels: bug (Something isn't working), help wanted (Extra attention is needed)

Comments

@Xmaster6y
Contributor

Description

I am seeing a dispatch error when using a 4-bit quantised model. First, note that this happens when instantiating a LanguageModel from an already existing transformers model loaded in 4-bit. Also, note that the 4-bit weights must stay on the GPU and cannot be moved to the CPU.

Working Example

from nnsight import LanguageModel

nnsight_model = LanguageModel("gpt2", device_map="auto", load_in_4bit=True)
with nnsight_model.trace('The Eiffel Tower is in the city of') as tracer:
    hidden_states = nnsight_model.transformer.h[0].mlp.act.output[0].clone().save()

Failing Example

from nnsight import LanguageModel
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2", device_map="auto", load_in_4bit=True)
tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
nnsight_model = LanguageModel(model, tokenizer=tokenizer)
with nnsight_model.trace('The Eiffel Tower is in the city of') as tracer:
    hidden_states = nnsight_model.transformer.h[0].mlp.act.output[0].clone().save()

Info

  • nnsight 0.2.11
  • torch 2.2.1+cu121
  • transformers 4.38.2
  • accelerate 0.29.1
  • bitsandbytes 0.43.0

The Error

The error can be found in this illustrative notebook: https://colab.research.google.com/drive/1n9A7MF8JE2lf26e9gOXRi2HaDjl4DjgX?usp=sharing

@Xmaster6y
Contributor Author

[Edit]

In fact, the first method only works for the first call; e.g., the following code fails:

from nnsight import LanguageModel

nnsight_model = LanguageModel("gpt2", device_map="auto", load_in_4bit=True)
with nnsight_model.trace('The Eiffel Tower is in the city of') as tracer:
    hidden_states = nnsight_model.transformer.h[0].mlp.act.output[0].clone().save()
with nnsight_model.trace('The Eiffel Tower is in the city of') as tracer:
    hidden_states = nnsight_model.transformer.h[0].mlp.act.output[0].clone().save()

@JadenFiotto-Kaufman JadenFiotto-Kaufman added bug Something isn't working help wanted Extra attention is needed labels May 9, 2024