[webgpu] use u32 to represent f16 in uniform #25391

Open: fs-eire wants to merge 3 commits into main

fs-eire commented Jul 14, 2025

Description

For f16 uniform variables, use u32 to bit-wise represent them.
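As a rough illustration of the bit-wise representation, the sketch below packs two f16 bit patterns into one u32 on the host side. The helper names `PackF16Pair` and `UnpackF16` are hypothetical and are not taken from this PR. It assumes the first element goes in the low 16 bits, which is the order in which WGSL's `bitcast<vec2<f16>>` reads a u32.

```cpp
#include <cstdint>

// Hypothetical helper (not from the PR): pack two IEEE 754 binary16 bit
// patterns into one 32-bit word.
// Assumption: the first element occupies bits 0..15, so that a shader-side
// bitcast<vec2<f16>>(word) sees it as component x.
constexpr std::uint32_t PackF16Pair(std::uint16_t lo_bits, std::uint16_t hi_bits) {
  return static_cast<std::uint32_t>(lo_bits) |
         (static_cast<std::uint32_t>(hi_bits) << 16);
}

// Inverse of PackF16Pair: component 0 is the low half, component 1 the high half.
constexpr std::uint16_t UnpackF16(std::uint32_t word, unsigned component) {
  return static_cast<std::uint16_t>(word >> (16u * (component & 1u)));
}

// 0x3C00 is 1.0 and 0xC000 is -2.0 in binary16.
static_assert(PackF16Pair(0x3C00, 0xC000) == 0xC0003C00u,
              "first element lands in the low 16 bits");
```

Because only the bit pattern is copied, no float conversion happens on the host, and the uniform buffer itself never declares an f16 type.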

Motivation and Context

Some devices support f16 in shaders and storage buffers, but not in uniform buffers. Dawn sets f16_support to false for them. However, we don't necessarily have to use f16 in uniforms.

This change together with #25349 will enable using f16 models on some Android devices.

fs-eire requested review from Copilot and vraspar July 14, 2025 21:28

fs-eire requested a review from Copilot July 14, 2025 22:09

Copilot AI left a comment

Pull Request Overview

This PR updates how 16-bit floats are represented in uniform buffers by packing them into 32-bit unsigned integers and adjusts buffer layout computation and WGSL codegen accordingly.

  • Compute size and alignment for f16 uniforms as u32-backed containers in webgpu_context.cc.
  • Generate WGSL bitcast expressions in GetElementAt for f16 access in shader_variable.h.
  • Change uniform struct declarations to use u32 types and adjusted lengths in shader_helper.cc.
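To make the bitcast-based access concrete, here is a minimal sketch of how a WGSL expression for one f16 element might be built when the uniform is stored as `array<vec4<u32>, N>`. `F16ArrayElementExpr` is a hypothetical helper, not the PR's `GetElementAt`; it handles only constant indices and assumes eight f16 values per `vec4<u32>` with the lower-indexed value in the low 16 bits of each word.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not the PR's GetElementAt): build a WGSL expression
// that reads f16 element `index` from a uniform declared as
// array<vec4<u32>, N>, where each u32 holds two f16 values.
std::string F16ArrayElementExpr(const std::string& name, std::size_t index,
                                std::size_t length) {
  assert(index < length);
  (void)length;                               // unused when NDEBUG is defined
  const std::size_t vec = index / 8;          // which vec4<u32>
  const std::size_t word = (index % 8) / 2;   // which u32 inside that vec4
  const std::size_t half = index % 2;         // which f16 inside that u32
  return "bitcast<vec2<f16>>(" + name + "[" + std::to_string(vec) + "][" +
         std::to_string(word) + "])[" + std::to_string(half) + "]";
}
```

Under these assumptions, `F16ArrayElementExpr("uniforms.scale", 11, 12)` yields `bitcast<vec2<f16>>(uniforms.scale[1][1])[1]`; the uniform name `uniforms.scale` is only a placeholder.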

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| onnxruntime/core/providers/webgpu/webgpu_context.cc | Revised uniform offset/size logic to handle f16 as u32 with correct alignment and padding. |
| onnxruntime/core/providers/webgpu/shader_variable.h | Expanded GetElementAt to emit bitcast<vec2<f16>> accesses for f16 uniforms. |
| onnxruntime/core/providers/webgpu/shader_helper.cc | Mutates data_type/length for f16 and emits appropriate WGSL types (u32, vecN, array). |
Comments suppressed due to low confidence (1)

onnxruntime/core/providers/webgpu/webgpu_context.cc:381

  • The comment for the f16 array threshold (>8) does not match the implementation (which uses length > 6). Please update the comment to reflect the actual branch condition or adjust the threshold to align with the intended behavior.
    // - length > 8      : array<vec4<u32>, N>   (align 16) (size 16 * N, N = ceil(length / 8))
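The arithmetic in the quoted code comment can be checked with a small sketch. It covers only the array branch as the comment describes it (align 16, size 16 * N, N = ceil(length / 8)) and deliberately does not encode the threshold, since that is the point in dispute here. The names `F16ArrayLayout` and `F16UniformArrayLayout` are hypothetical.

```cpp
#include <cstddef>

// Hypothetical result type (not from the PR) for the array<vec4<u32>, N> case.
struct F16ArrayLayout {
  std::size_t vec4_count;   // N = ceil(length / 8)
  std::size_t size_bytes;   // 16 * N
  std::size_t align_bytes;  // always 16 for this branch
};

// Layout of `length` f16 values stored as array<vec4<u32>, N>, following the
// formula in the quoted comment. Each vec4<u32> is 16 bytes and holds 8 f16s.
constexpr F16ArrayLayout F16UniformArrayLayout(std::size_t length) {
  return F16ArrayLayout{(length + 7) / 8, 16 * ((length + 7) / 8), 16};
}

static_assert(F16UniformArrayLayout(9).vec4_count == 2, "9 values need 2 vec4s");
static_assert(F16UniformArrayLayout(9).size_bytes == 32, "2 vec4s take 32 bytes");
```

For example, 16 values still fit in two `vec4<u32>` (32 bytes), while 17 values need a third (48 bytes).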
