
Dispatch Error When Using Quantisation #106

Open
Xmaster6y opened this issue Apr 7, 2024 · 1 comment
Labels: bug (Something isn't working), help wanted (Extra attention is needed)

Comments

@Xmaster6y
Contributor

Description

I am seeing a dispatch error when using a 4-bit quantised model. First, note that this happens when instantiating a LanguageModel from an already existing transformers model loaded in 4-bit. Also, note that the 4-bit weights must stay on the GPU and cannot be moved to the CPU.

Working Example

from nnsight import LanguageModel

nnsight_model = LanguageModel("gpt2", device_map="auto", load_in_4bit=True)
with nnsight_model.trace('The Eiffel Tower is in the city of') as tracer:
    hidden_states = nnsight_model.transformer.h[0].mlp.act.output[0].clone().save()

Failing Example

from nnsight import LanguageModel
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2", device_map="auto", load_in_4bit=True)
tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
nnsight_model = LanguageModel(model, tokenizer=tokenizer)
with nnsight_model.trace('The Eiffel Tower is in the city of') as tracer:
    hidden_states = nnsight_model.transformer.h[0].mlp.act.output[0].clone().save()

Info

  • nnsight 0.2.11
  • torch 2.2.1+cu121
  • transformers 4.38.2
  • accelerate 0.29.1
  • bitsandbytes 0.43.0

The Error

The error can be found in this illustrative notebook: https://colab.research.google.com/drive/1n9A7MF8JE2lf26e9gOXRi2HaDjl4DjgX?usp=sharing

@Xmaster6y
Contributor Author

[Edit]

In fact, the first method only works for the first call; e.g., the following code fails:

from nnsight import LanguageModel

nnsight_model = LanguageModel("gpt2", device_map="auto", load_in_4bit=True)
with nnsight_model.trace('The Eiffel Tower is in the city of') as tracer:
    hidden_states = nnsight_model.transformer.h[0].mlp.act.output[0].clone().save()
with nnsight_model.trace('The Eiffel Tower is in the city of') as tracer:
    hidden_states = nnsight_model.transformer.h[0].mlp.act.output[0].clone().save()

@JadenFiotto-Kaufman JadenFiotto-Kaufman added bug Something isn't working help wanted Extra attention is needed labels May 9, 2024