ONNX PTQ with NVFP4 fails on Vision CNN models #362

@tllewellynn1

Description

Describe the bug

The `examples/onnx_ptq/torch_quant_to_onnx.py` example is provided and works for ViT-based models. However, when trying vision CNN models such as MobileNetv5_300m, Resnet50, Convnext, etc., errors are produced consistently. A note in the documentation on the current limitations of ONNX PTQ with nvfp4, i.e. which models can be converted, would be helpful.

Steps/Code to reproduce bug

```shell
cd examples/onnx_ptq/
python torch_quant_to_onnx.py \
    --timm_model_name=resnet50 \
    --quantize_mode=nvfp4 \
    --onnx_save_path=resnet50_nvfp4.onnx
```
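For context on what `--quantize_mode=nvfp4` targets: NVFP4 stores values as 4-bit floats (E2M1, representable magnitudes 0, 0.5, 1, 1.5, 2, 3, 4, 6) with a shared scale per small block. The sketch below is a minimal, illustrative fake-quantization round-trip to the E2M1 grid — not ModelOpt's actual implementation; the block size of 16 and the simple max-based scaling are assumptions for illustration only.

```python
import numpy as np

# E2M1 (4-bit float) representable magnitudes.
E2M1_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def fake_quant_nvfp4(w, block_size=16):
    """Illustrative NVFP4-style fake quantization: per-block scale + E2M1 rounding.

    Assumes len(w) is a multiple of block_size; returns the dequantized array.
    """
    w = np.asarray(w, dtype=np.float64).reshape(-1, block_size)
    # Scale each block so its max magnitude maps to the largest E2M1 value (6).
    scale = np.abs(w).max(axis=1, keepdims=True) / E2M1_GRID[-1]
    scale[scale == 0] = 1.0  # avoid division by zero for all-zero blocks
    scaled = w / scale
    # Round each magnitude to the nearest representable E2M1 value, keep the sign.
    idx = np.abs(np.abs(scaled)[..., None] - E2M1_GRID).argmin(axis=-1)
    q = np.sign(scaled) * E2M1_GRID[idx]
    # Dequantize: multiply back by the per-block scale.
    return (q * scale).reshape(-1)
```

Whatever the ONNX export path does internally, the observed failures appear to be in the conversion/export step for CNN architectures, not in the arithmetic itself.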

Expected behavior

Produces an nvfp4-quantized ONNX file without error.

System information

  • OS: Ubuntu 24.04.2 LTS
  • CPU architecture: x86_64
  • GPU name: NVIDIA RTX PRO 6000 Blackwell Workstation Edition
  • GPU memory size: 95.6 GB
  • Number of GPUs: 1
  • Library versions (if applicable):
    • Python: 3.12.3
    • ModelOpt version or commit hash: 0.37.0.dev56+g26c203abd.d20250924
    • CUDA: 13.0
    • PyTorch: 2.10.0.dev20250924+cu130
    • Transformers: 4.56.2
    • ONNXRuntime: 1.22.0
    • TensorRT: 10.13.3.9

Metadata

Labels: bug (Something isn't working)
