4-bit and 8-bit bnb quants only generate empty strings or one token repeated endlessly

#32
by nicorinn-google - opened

I'm using the exact same notebook for the 27b-it and 9b-it versions, so the issue is definitely related to this model. Any ideas what the cause might be?

Hi @nicorinn-google, please use torch_dtype=torch.bfloat16 when loading with from_pretrained(). There's a PR to update the model card examples here: #33.
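For reference, a minimal sketch of what that looks like with a 4-bit bnb quant (assuming this is the gemma-2-27b-it checkpoint and a recent transformers/bitsandbytes install; the model id and prompt below are only placeholders):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "google/gemma-2-27b-it"  # assumption: the model this discussion is about

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # or load_in_8bit=True for the 8-bit quant
    bnb_4bit_compute_dtype=torch.bfloat16,  # keep quantized compute in bf16 as well
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quantization_config,
    torch_dtype=torch.bfloat16,  # the fix: load non-quantized weights in bf16
    device_map="auto",
)

inputs = tokenizer("Write a haiku about quantization.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Without torch_dtype=torch.bfloat16, the remaining layers default to float16, which is what leads to the empty or repeated outputs reported above.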


Hi @mdouglas, kindly update the bitsandbytes examples to load the model using torch_dtype=torch.bfloat16. I have tested and reproduced the fix. Please refer to this gist file for reference. If you have any concerns, let me know and I will assist you.

Thank you.
