Hello, could you help me look into the following problem? I used your script `script/lut/eval_opt.sh` to run the perplexity verification on the OPT-1.3B model. According to the results in your paper, LUT-GEMM achieves ppl = 49.10 on wikitext2 for W3A16 OPT-1.3B, but with this script I get ppl = 68.56. Have I configured something wrong?
=========== quantize with original bcq method ===========

```shell
CUDA_VISIBLE_DEVICES=2 python model/opt.py opt-1.3b \
    --wbits 3 \
    --groupsize -1 \
    --bcq \
    --bcq_round 50 \
    --use_bst
```

Results:

```
'wikitext2': 68.55980682373047
'ptb': 112.10599517822266
```
=========== quantize with bcq+gptq method ===========

```shell
CUDA_VISIBLE_DEVICES=2 python model/opt.py opt-1.3b \
    --wbits 3 \
    --groupsize -1 \
    --lut_eval \
    --bcq_round 50 \
    --use_bst
```

Results:

```
'wikitext2': 130.72769165039062
'ptb': 182.0449676513672
```
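For context on what is being compared: the reported ppl is the exponential of the mean per-token negative log-likelihood over the evaluation set, so even modest differences in quantization quality show up as large ppl gaps. A minimal sketch of the metric itself (independent of the repo's eval code; the function name is illustrative):

```python
import math

def perplexity(token_nlls):
    """Perplexity = exp(mean negative log-likelihood per token).

    token_nlls: per-token negative log-likelihoods (natural log),
    as produced by a cross-entropy loss over the eval corpus.
    """
    return math.exp(sum(token_nlls) / len(token_nlls))

# Toy check: a uniform distribution over a 4-symbol vocabulary gives
# NLL = ln(4) per token, so perplexity should come out to about 4.
nlls = [math.log(4)] * 10
print(perplexity(nlls))  # ≈ 4.0
```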