What about adding low-bit floating-point data types? The default would be to dequantize to fp16.
On top of that, I'm writing an efficient SIMD backend that operates directly in quantized space for the data types listed below; my approach supports widths of up to 8 bits.
In particular, nf4, nf4dq (double quantization), and fp4 would enable lossless conversion of existing 4-bit bitsandbytes-encoded models from HF.
I just want to hear your opinions on, and interest in, this topic. I'll come back with a PR once my backend is ready.
Also, if I missed an important low-bit data type, please point it out.
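To make the nf4 case concrete, here is a minimal dequantization sketch. The 16-entry codebook is the NormalFloat4 table from the QLoRA paper as used by bitsandbytes; the function names, the flat list-of-codes input layout, and the block size are my own illustrative assumptions (real bnb tensors pack two codes per byte, and nf4dq additionally quantizes the per-block absmax values):

```python
# NormalFloat4 codebook (QLoRA / bitsandbytes). Each 4-bit code 0..15
# indexes into this table; the result is rescaled by a per-block absmax.
NF4_CODEBOOK = [
    -1.0, -0.6961928009986877, -0.5250730514526367, -0.39491748809814453,
    -0.28444138169288635, -0.18477343022823334, -0.09105003625154495, 0.0,
    0.07958029955625534, 0.16093020141124725, 0.24611230194568634,
    0.33791524171829224, 0.44070982933044434, 0.5626170039176941,
    0.7229568362236023, 1.0,
]

def dequantize_nf4(codes, absmax, block_size=64):
    """Dequantize unpacked nf4 codes (one int 0..15 per element).

    `absmax` holds one scale per block of `block_size` elements.
    Sketch only: no packing, no double quantization of absmax.
    """
    out = []
    for i, code in enumerate(codes):
        scale = absmax[i // block_size]
        out.append(NF4_CODEBOOK[code] * scale)
    return out
```

A SIMD backend would keep the codes packed and perform the codebook lookup with a shuffle instruction, but the arithmetic is the same.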
nf4
nf4dq
fp8e5m2
fp8e4m3
fp8e3m4
fp6e4m1
fp6e3m2
fp6e2m3
fp5e3m1
fp5e2m2
fp5e1m3
fp4e3m0
fp4e2m1
fp4e1m2
fp3e2m0
fp3e1m1
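For reference, the fpXeYmZ names above follow the usual minifloat convention: 1 sign bit, Y exponent bits, Z mantissa bits, X = 1 + Y + Z total. A hedged sketch of a generic decoder, assuming an IEEE-style bias of 2^(Y-1) - 1 with subnormals and ignoring inf/nan encodings (which some of these formats, e.g. fp8e4m3 variants, handle specially):

```python
def decode_minifloat(bits: int, e: int, m: int) -> float:
    """Decode one fpXeYmZ value: 1 sign bit, e exponent bits, m mantissa bits.

    Assumes IEEE-style bias (2**(e-1) - 1) and subnormals; no inf/nan
    handling, so this is a sketch of the common layout, not any one spec.
    """
    sign = -1.0 if (bits >> (e + m)) & 1 else 1.0
    exp = (bits >> m) & ((1 << e) - 1)
    man = bits & ((1 << m) - 1)
    bias = (1 << (e - 1)) - 1
    if exp == 0:
        # Subnormal: no implicit leading 1, fixed exponent of (1 - bias).
        return sign * man * 2.0 ** (1 - bias - m)
    # Normal: implicit leading 1 plus the fractional mantissa.
    return sign * (1.0 + man * 2.0 ** -m) * 2.0 ** (exp - bias)
```

For fp4e2m1 this yields the eight magnitudes {0, 0.5, 1, 1.5, 2, 3, 4, 6}, which is why per-tensor or per-block scaling is essential at these widths.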