Fix model copying for QDQ stripping #784

Open
mklimenk wants to merge 4 commits into ovep-develop

Conversation

mklimenk

Description

This PR fixes the model copying step required by QDQ stripping for GPU and by the bfloat16->float16 conversion. Copying broke when the upstream repo converted initializers to OrtValues.
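
For context, a minimal sketch of what OrtValue-aware initializer copying can look like. GetOrtValueInitializer and AddInitializedOrtValue are assumptions about the upstream onnxruntime::Graph API after the OrtValue refactor; this is not the PR's literal code:

```cpp
// Sketch only: copy initializers from src to dst, preferring the OrtValue
// path now that upstream stores initializer data as OrtValues.
// GetOrtValueInitializer / AddInitializedOrtValue are assumed accessors;
// names and signatures may differ in the actual tree.
#include "core/graph/graph.h"

onnxruntime::common::Status CopyInitializers(const onnxruntime::Graph& src,
                                             onnxruntime::Graph& dst) {
  for (const auto& [name, tensor_proto] : src.GetAllInitializedTensors()) {
    OrtValue ort_value;
    if (src.GetOrtValueInitializer(name, ort_value, /*check_outer_scope=*/false)) {
      // The data lives in an OrtValue rather than inside the TensorProto,
      // so hand the OrtValue across instead of duplicating raw bytes.
      ORT_RETURN_IF_ERROR(dst.AddInitializedOrtValue(*tensor_proto, ort_value));
    } else {
      // Fallback: the initializer is still a self-contained TensorProto.
      dst.AddInitializedTensor(*tensor_proto);
    }
  }
  return onnxruntime::common::Status::OK();
}
```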

To keep the changes in a single place, PR #768, which contains a small related fix, is embedded in this PR.

https://jira.devtools.intel.com/browse/CVS-171536

@mklimenk
Author

@ankitm3k @sfatimar could you please review it?

ankitm3k requested a review from Copilot on August 21, 2025 at 16:21

@ankitm3k left a comment


LGTM


Copilot AI left a comment


Pull Request Overview

This PR fixes model copying functionality required for QDQ (Quantize/Dequantize) stripping when processing GPU models with bfloat16->float16 conversions and 16-bit integer quantization. The changes update the model copying mechanism to handle OrtValue initializers and refine the conditions for QDQ optimization.

  • Updated model copying logic to use OrtValue-based initializer handling instead of TensorProto copying
  • Added support for INT16/UINT16 data types in type checking, moving the logic from the experimental GPU-only section to general support (see the sketch after this list)
  • Enhanced QDQ graph detection to specifically identify graphs with 16-bit quantization for targeted GPU optimization
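
As an illustration of the second bullet, a hedged sketch of moving INT16/UINT16 into the generally supported element types; the container and function names below are illustrative, not the EP's actual data_ops.cc structures:

```cpp
// Illustrative only: the real data_ops.cc keeps versioned per-device lists.
// The point is that INT16/UINT16 now sit in the common set rather than
// behind an experimental GPU-only branch.
#include <set>
#include "onnx/onnx_pb.h"

static const std::set<int> kGenerallySupportedTypes = {
    ONNX_NAMESPACE::TensorProto_DataType_FLOAT,
    ONNX_NAMESPACE::TensorProto_DataType_FLOAT16,
    ONNX_NAMESPACE::TensorProto_DataType_INT8,
    ONNX_NAMESPACE::TensorProto_DataType_UINT8,
    ONNX_NAMESPACE::TensorProto_DataType_INT16,   // moved out of the GPU-only block
    ONNX_NAMESPACE::TensorProto_DataType_UINT16,  // moved out of the GPU-only block
};

bool IsTypeSupported(int onnx_elem_type) {
  return kGenerallySupportedTypes.count(onnx_elem_type) > 0;
}
```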

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

Files changed:

  • qdq_scales_fix.cpp: simplified model copying to use OrtValue initializers and removed the redundant TensorProto copying logic
  • data_ops.cc: moved INT16/UINT16 type support from the experimental GPU-only section to general type support
  • backend_manager.cc: added detection of QDQ graphs with 16-bit quantization and refined the GPU optimization conditions (a sketch of the detection idea follows below)
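
To make the backend_manager.cc detection concrete, a sketch under assumed names (GraphHas16BitQDQ is hypothetical, not the PR's code): flag a graph for the GPU-specific QDQ path if any Q/DQ node touches int16/uint16 tensors.

```cpp
// Assumed helper, not the PR's literal code: report whether any Q/DQ node
// in the graph touches int16/uint16 tensors.
#include "core/graph/graph_viewer.h"

static bool Is16BitElem(const ONNX_NAMESPACE::TypeProto* type) {
  if (type == nullptr || !type->has_tensor_type()) return false;
  const auto elem = type->tensor_type().elem_type();
  return elem == ONNX_NAMESPACE::TensorProto_DataType_INT16 ||
         elem == ONNX_NAMESPACE::TensorProto_DataType_UINT16;
}

bool GraphHas16BitQDQ(const onnxruntime::GraphViewer& graph_viewer) {
  for (auto node_idx : graph_viewer.GetNodesInTopologicalOrder()) {
    const auto* node = graph_viewer.GetNode(node_idx);
    if (node->OpType() != "QuantizeLinear" && node->OpType() != "DequantizeLinear")
      continue;
    // A 16-bit Q produces int16/uint16 outputs; a 16-bit DQ consumes them.
    for (const auto* def : node->InputDefs())
      if (Is16BitElem(def->TypeAsProto())) return true;
    for (const auto* def : node->OutputDefs())
      if (Is16BitElem(def->TypeAsProto())) return true;
  }
  return false;
}
```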


ankitm3k requested a review from MayureshV1 on August 22, 2025 at 06:28

@MayureshV1 left a comment


LGTM! The impact of this PR is limited to GPU and to INT16, UINT16, and BF16 only.
Waiting on confirmation from validation of GPU customer models before we merge.
