Zipformer Onnx FP16 #1671
Conversation
Signed-off-by: manickavela29 <[email protected]>
Could you describe how you tested it? Does it work on CPU? Also, could you use fp16.onnx as the suffix when fp16 is used?
Signed-off-by: manickavela29 <[email protected]>
I have tested it with an A10 GPU and it is holding up well in sherpa-onnx. I haven't tested this on CPU, but AVX512 has support for sure, and I think AVX2 will also handle it. Generally, onnxruntime has a fallback mechanism: if fp16 is not inherently supported, conversion to fp32 happens implicitly. This is an optional export flag, and I wanted to keep it simple with the existing model filename. But if this is still required, I will modify the filename. FYI,
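(For context, a minimal sketch of how such an fp16 export step can be done with the onnxconverter-common package; the filenames are placeholders and this is not necessarily the exact code in this PR:)

```python
# Sketch: convert an exported fp32 ONNX model to fp16 using
# onnxconverter-common. Filenames below are hypothetical.
import onnx
from onnxconverter_common import float16

model = onnx.load("encoder.onnx")                    # fp32 export
model_fp16 = float16.convert_float_to_float16(model) # cast tensors to fp16
onnx.save(model_fp16, "encoder.fp16.onnx")
```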
Could you test the fp16 model either in icefall with onnx_pretrained.py or in sherpa-onnx on CPU?
Currently, we have
Signed-off-by: manickavela29 <[email protected]>
I ran it once with sherpa-onnx, and it was up and running.
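(As a reference for the CPU check requested above, a hedged sketch of loading the fp16 model with onnxruntime on CPU; the path is a placeholder:)

```python
# Sketch: load the fp16 model on the CPUExecutionProvider and inspect
# its I/O; on CPU, onnxruntime handles ops lacking fp16 kernels via
# internal casts. The filename is hypothetical.
import onnxruntime as ort

sess = ort.InferenceSession(
    "encoder.fp16.onnx",                 # hypothetical path
    providers=["CPUExecutionProvider"],  # force CPU execution
)
for inp in sess.get_inputs():
    print(inp.name, inp.shape, inp.type)
```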
Signed-off-by: manickavela29 <[email protected]>
Thanks! Is it ready to merge?
Yes, completed from my end!
Thank you for your contribution!
Signed-off-by: manickavela29 <[email protected]>
Hi @csukuangfj, @yaozengwei,

This PR exports the zipformer ONNX model in FP16.

The model is trained in mixed precision, so exporting in fp16 shouldn't cause any accuracy loss. Tested with data, and the model accuracy is exactly the same as fp32.

cc: k2-fsa/sherpa-onnx#41, k2-fsa/sherpa-onnx#40
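(A sketch of how the fp32-vs-fp16 accuracy claim could be checked numerically; the filenames, input order, and feature shapes here are assumptions, not taken from this PR:)

```python
# Sketch: compare fp32 and fp16 encoder outputs on the same dummy input.
# Assumes hypothetical filenames and that the first model input is the
# features and the second is the feature lengths.
import numpy as np
import onnxruntime as ort

feats = np.random.randn(1, 100, 80).astype(np.float32)  # dummy fbank features
feats_lens = np.array([100], dtype=np.int64)

def run_encoder(path, x):
    sess = ort.InferenceSession(path, providers=["CPUExecutionProvider"])
    names = [i.name for i in sess.get_inputs()]
    return sess.run(None, {names[0]: x, names[1]: feats_lens})[0]

out_fp32 = run_encoder("encoder.onnx", feats)
out_fp16 = run_encoder("encoder.fp16.onnx", feats.astype(np.float16))
diff = np.abs(out_fp32.astype(np.float32) - out_fp16.astype(np.float32))
print("max abs diff:", diff.max())
```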