Skip to content

Conversation

galagam
Copy link

@galagam galagam commented Sep 28, 2025

What does this PR do?

Type of change: ? new feature

Overview:
Add options nodes_to_include, op_types_to_include that force-include nodes in the conversion, overriding NodeClassifier exclusion logic

Usage

# force-include all Conv nodes for conversion, even if NodeClassifer logic states they should be excluded (kept in high precision)
python3 -m modelopt.onnx.autocast --onnx_path model.onnx --op_types_to_include Conv

# exclude all Conv nodes, include only node named Conv_96
python3 -m modelopt.onnx.autocast --onnx_path model.onnx op_types_to_exclude Conv --nodes_to_include Conv_96

Testing

pytest tests/unit/onnx/autocast/test_nodeclassifier.py::test_node_classifier_force_include

Before your PR is "Ready for review"

  • Make sure you read and follow Contributor guidelines and your commits are signed.
  • Is this change backward compatible?: Yes
  • Did you write any new necessary tests?: Yes
  • Did you add or update any necessary documentation?: Yes
  • Did you update Changelog?: Yes

Additional Information

Summary by CodeRabbit

  • New Features

    • Add options to force-include specific nodes or operator types for reduced-precision conversion (available via CLI flags and Python API parameters); include rules can now override exclusion rules to enforce low-precision.
  • Documentation

    • Guide updated with new include options, usage examples, and clarified precision-decision rules.
  • Tests

    • New unit tests validating force-include behavior and its interaction with exclusion rules.
  • Chores

    • CHANGELOG updated.

@galagam galagam requested a review from a team as a code owner September 28, 2025 15:46
@galagam galagam requested a review from ajrasane September 28, 2025 15:46
Copy link

copy-pr-bot bot commented Sep 28, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Copy link

coderabbitai bot commented Sep 28, 2025

Walkthrough

Adds force-include controls to AutoCast: new include parameters in CLI and Python API propagate to NodeClassifier, which now evaluates include rules that can override exclude rules when classifying node precision. Documentation, changelog, and unit tests updated.

Changes

Cohort / File(s) Summary
Changelog & Guide
CHANGELOG.rst, docs/source/guides/8_autocast.rst
Documents new nodes_to_include and op_types_to_include options, updates API examples and "How It Works" rules, and clarifies precision wording (FP16/BF16) and classification notes (data_max/init_max).
CLI
modelopt/onnx/autocast/__main__.py
Adds --nodes_to_include/-ni and --op_types_to_include/-opi multi-value options; updates IO-type help text to mention reduced precision; forwards include lists to convert_to_mixed_precision.
Conversion API
modelopt/onnx/autocast/convert.py
Extends convert_to_mixed_precision(...) signature with nodes_to_include and op_types_to_include, documents them, defaults to empty lists, and passes them into NodeClassifier.
Classifier Core
modelopt/onnx/autocast/nodeclassifier.py
Adds include-rule classes (IncludeNodeNameRegexRule, IncludeOpTypes); introduces nodes_to_include / op_types_to_include fields; splits rule generation into _gen_exclude_node_rules and _gen_include_node_rules; updates run logic so include matches override exclude matches when deciding precision.
Tests
tests/unit/onnx/autocast/test_nodeclassifier.py
Adds test_node_classifier_force_include to verify nodes_to_include/op_types_to_include force nodes into low precision and that include rules can override exclude rules.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor U as User/Script
  participant CLI as AutoCast CLI
  participant Convert as convert_to_mixed_precision
  participant NC as NodeClassifier
  participant ONNX as ONNX Model

  U->>CLI: autocast --nodes_to_include ... --op_types_to_include ...
  CLI->>Convert: convert_to_mixed_precision(..., nodes_to_include, op_types_to_include)
  Convert->>NC: NodeClassifier(..., nodes_to_include, op_types_to_include)
  Convert->>NC: run(model)
  loop For each node
    NC->>NC: Evaluate exclude rules
    NC->>NC: Evaluate include rules
    alt Matches exclude AND not matches include
      NC-->>Convert: classify node as High precision (keep FP32)
    else Otherwise (include matched or not excluded)
      NC-->>Convert: classify node as Low precision (cast to FP16/BF16)
    end
  end
  Convert->>ONNX: Apply casts per classification
  Convert-->>CLI: Return mixed-precision model
  CLI-->>U: Output model/report

  note over NC: Include rules take precedence and can force low precision
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

I nudge the graph with velvet paws,
Include, exclude—such tidy laws.
A hop to force the bits to small,
I coax some nodes to shrink their all.
Carrots count in FP16, hooray! 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title concisely and accurately conveys the primary change of the pull request by stating that AutoCast is gaining new options to force-include specific nodes or operators in F16 conversion, making it clear and specific enough for team members to understand the main feature being introduced.
✨ Finishing touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1ac7868 and d71399d.

📒 Files selected for processing (1)
  • tests/unit/onnx/autocast/test_nodeclassifier.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/unit/onnx/autocast/test_nodeclassifier.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: linux
  • GitHub Check: wait-checks / wait
  • GitHub Check: code-quality
  • GitHub Check: build-docs

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

codecov bot commented Sep 28, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.88%. Comparing base (615f3c0) to head (d71399d).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #386      +/-   ##
==========================================
+ Coverage   73.86%   73.88%   +0.02%     
==========================================
  Files         171      171              
  Lines       17629    17645      +16     
==========================================
+ Hits        13021    13037      +16     
  Misses       4608     4608              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Add options nodes_to_include, op_types_to_include that force-include nodes in the conversion, overriding NodeClassifier exclusion logic

Signed-off-by: Gal Hubara Agam <[email protected]>
@galagam galagam force-pushed the dev-gagam-force-input-nodes branch from be7614f to 1ac7868 Compare September 28, 2025 16:07
Signed-off-by: Gal Hubara Agam <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant