Conversation

Aristide021 commented Aug 9, 2025

This PR adds an Intel GNA (Gaussian & Neural Accelerator) backend for TVM Relax, designed as a foundation for Intel NPU support. While GNA hardware is present in many Intel Core processors, this backend is primarily a stepping stone toward Intel's current NPU path with OpenVINO runtime integration.

Features:

  • Pattern-based graph partitioning for GNA/NPU-compatible operations
  • JSON serialization approach enabling seamless NPU migration
  • Software emulation mode for testing without dedicated hardware
  • Support for dense/linear, 1D convolution, and ReLU operations
  • Automatic shape and dtype extraction for optimization
  • Comprehensive test coverage with CI integration

Supported operations:

  • Dense/Linear layers (relax.matmul)
  • 1D Convolution (relax.nn.conv1d)
  • ReLU activation (relax.nn.relu)

This implementation provides a clean, minimal pattern for backend development while preparing the foundation for Intel's recommended NPU acceleration path through TVM's compilation pipeline.
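For context, here is a minimal sketch of how a backend like this is typically driven through Relax's standard BYOC flow. The "gna" pattern prefix and the `partition_for_gna` helper name are illustrative assumptions about this PR, not confirmed names; the transform pipeline itself is TVM's usual Relax partitioning sequence.

```python
# Sketch only: the "gna" pattern prefix and helper name are assumptions about
# this PR; the transform pipeline is TVM's standard Relax BYOC flow.
import tvm
from tvm import relax
from tvm.relax.backend.pattern_registry import get_patterns_with_prefix


def partition_for_gna(mod: tvm.IRModule) -> tvm.IRModule:
    """Offload GNA-compatible subgraphs (matmul, conv1d, relu) to external codegen."""
    patterns = get_patterns_with_prefix("gna")  # assumed registry prefix
    seq = tvm.transform.Sequential(
        [
            # Group supported ops into composite functions tagged for the backend.
            relax.transform.FuseOpsByPattern(patterns, annotate_codegen=True),
            # Merge neighbouring composites into one region per external target.
            relax.transform.MergeCompositeFunctions(),
            # Invoke the external codegen, which serializes each region to JSON.
            relax.transform.RunCodegen(),
        ]
    )
    return seq(mod)
```

After partitioning, the remaining module is compiled with `relax.build` as usual, with the offloaded regions handled by the external (GNA or emulation) runtime.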

Aristide021 marked this pull request as draft on August 10, 2025 at 11:54.
Aristide021 force-pushed the feature/gna_codegen branch 5 times, most recently from 141157b to 77b312a, on August 11, 2025 at 19:52.
Aristide021 marked this pull request as ready for review on August 11, 2025 at 19:58.
mshr-h (Contributor) commented Aug 21, 2025

@Aristide021
Thanks for the PR! A couple of points and questions:

  1. Status of GNA vs NPU
    • The GNA plugin has been archived in OpenVINO, so how does this backend relate to Intel's current NPU path?
  2. CI & Software Emulation Mode
    • According to the OpenVINO docs, GNA plugin supports Software Emulation Mode (CPU fallback) when GNA HW isn't present. If we enable that in tests, we could run E2E coverage in our CI.

I also think this backend can serve as a very good example for codegen in Relax. It shows a clean and minimal pattern: partitioning with basic ops, handing off to JSON, and keeping the implementation relatively lightweight. Adding a short HOWTO or developer note ("Writing a minimal Relax backend") that references this code could be very helpful for the community.

cc @tqchen @Hzfengsy @cbalint13

Aristide021 (Author) replied:


Thanks for the review and the excellent points! You're correct about GNA being archived. I designed this backend as a stepping stone toward NPU support with OpenVINO runtime integration in mind. The JSON serialization approach should make the transition to Intel's current NPU path relatively straightforward.

For the CI integration with Software Emulation Mode, I think that's a great suggestion. I can add CPU fallback support to enable E2E testing without requiring actual GNA hardware.

I'd also be happy to add documentation, positioning this as a foundation for NPU backends, and include a developer guide if that would be helpful for the community.

I'll go ahead and update the PR description to clarify the NPU migration path. My next step will be to add CPU emulation support for testing. Please let me know if you have any other suggestions.
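To make that concrete, an E2E test along the following lines could exercise the emulation path in CI. The `partition_for_gna` helper is the sketch from the PR description above, and the "relax.ext.gna" registration name is an assumption about this PR, not a confirmed API.

```python
# Hypothetical test sketch: "relax.ext.gna" and partition_for_gna are assumed
# names for this PR's codegen registration and partitioning helper.
import numpy as np
import pytest
import tvm
from tvm import relax
from tvm.script import ir_module, relax as R

has_gna = tvm.get_global_func("relax.ext.gna", allow_missing=True) is not None


@pytest.mark.skipif(not has_gna, reason="GNA codegen not built")
def test_dense_relu_via_emulation():
    @ir_module
    class Model:
        @R.function
        def main(
            x: R.Tensor((1, 16), "float32"), w: R.Tensor((16, 16), "float32")
        ) -> R.Tensor((1, 16), "float32"):
            with R.dataflow():
                y = R.matmul(x, w)
                z = R.nn.relu(y)
                R.output(z)
            return z

    mod = partition_for_gna(Model)          # offload matmul + relu
    ex = relax.build(mod, target="llvm")    # emulation runtime needs no GNA hardware
    vm = relax.VirtualMachine(ex, tvm.cpu())
    x = np.random.randn(1, 16).astype("float32")
    w = np.random.randn(16, 16).astype("float32")
    out = vm["main"](tvm.nd.array(x), tvm.nd.array(w)).numpy()
    np.testing.assert_allclose(out, np.maximum(x @ w, 0.0), rtol=1e-5, atol=1e-5)
```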

Aristide021 force-pushed the feature/gna_codegen branch 6 times, most recently from 9b955d4 to 2c036cc, on August 23, 2025 at 19:42.
This commit introduces the Intel GNA (Gaussian & Neural Accelerator) backend
for TVM's Relax IR with a clean separation between hardware and emulation
runtimes to enable CI testing without GNA hardware.

Key components:
- GNA codegen for Relax IR (graph partitioning and code generation)
- Hardware runtime (gna_json_runtime.cc) for systems with GNA SDK
- CPU emulation runtime (gna_json_runtime_emulation.cc) for CI/testing
- Conditional CMake build based on GNA SDK availability
- Pattern registry for dense, conv1d, and relu operations
- Comprehensive test suite

Architecture decisions:
- Clean separation: Hardware and emulation in separate files (no mocking)
- CI-friendly: Emulation runtime has no GNA SDK dependencies
- Follows OpenVINO's Software Emulation Mode pattern
- Same API surface for both runtime implementations

The emulation runtime provides simplified reference implementations
sufficient for testing graph partitioning and codegen correctness.
For production CPU inference, use TVM's standard CPU backend.

This backend serves as a stepping stone toward Intel NPU support
and provides a minimal example for Relax backend development.
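For readers using this as a backend-development example, the pattern registry mentioned above might look roughly like the following. The "gna." names are assumed for illustration, and the sketch assumes register_patterns accepts (name, pattern) pairs as other Relax contrib backends do; only the dataflow-pattern and registry APIs are standard TVM.

```python
# Rough sketch of the dense / conv1d / relu pattern registry; the "gna." names
# and the exact registration form are assumptions, not taken from this commit.
from tvm.relax.backend.pattern_registry import register_patterns
from tvm.relax.dpl.pattern import is_op, wildcard


def _gna_patterns():
    return [
        # (pattern name, dataflow pattern matching the offloadable call)
        ("gna.dense", is_op("relax.matmul")(wildcard(), wildcard())),
        ("gna.conv1d", is_op("relax.nn.conv1d")(wildcard(), wildcard())),
        ("gna.relu", is_op("relax.nn.relu")(wildcard())),
    ]


register_patterns(_gna_patterns())
```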
tqchen (Member) commented Aug 24, 2025

Thanks for the contribution. Given that GNA is archived, it perhaps does not make sense to maintain it in the main tree, and adding CI would also add extra overhead here. However, I agree that having generic tutorials for BYOC NPU would be useful; if we can have something that supports a current NPU, that would be great.

Aristide021 (Author) replied:


I'd be happy to refactor this into a generic NPU tutorial targeting Intel's current NPU plugin. Should this live in the tutorials section or as a contrib module? I can adapt the JSON architecture for educational purposes.

tqchen (Member) commented Aug 24, 2025

I think starting as contrib is fine, and we can have a tutorial explanation pointing to the code.

Aristide021 added a commit to Aristide021/tvm that referenced this pull request Aug 28, 2025
  This commit introduces an educational NPU backend example that teaches
  key architectural concepts common across Neural Processing Units.

  Key features:
  - Multi-tier memory hierarchy (L0/L1/L2/L3) management with spilling
  - Tiling engine for large tensors that exceed on-chip SRAM
  - Quantization support (INT8/INT16) with dedicated patterns
  - Multiple execution engines (matrix, vector, conv, pooling, activation)
  - Operation fusion patterns to reduce memory traffic
  - Power mode management for efficiency tuning

  Educational value:
  - Demonstrates NPU memory management strategies
  - Shows how tiling enables large model execution
  - Explains quantization's role in NPU acceleration
  - Illustrates operation-to-engine mapping
  - Provides CPU emulation for testing without hardware

This vendor-neutral implementation serves as a template for developers creating custom NPU backends, teaching BYOC integration patterns while demonstrating real NPU architectural concepts.

Addresses feedback from apache#18201 requesting generic NPU BYOC tutorials.
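One of the bullet points above is a tiling engine for tensors that exceed on-chip SRAM. As a rough illustration of the kind of decision such an engine makes, here is a toy tile-size planner; the SRAM budget and one-byte-per-element assumption are invented for the example and not taken from the commit.

```python
# Toy tile-size planner: shrink a matmul tiling until the A, B and C tiles all
# fit in an assumed on-chip SRAM budget. Illustrative only.
def plan_matmul_tiles(m, n, k, sram_bytes=512 * 1024, dtype_bytes=1):
    tm, tn, tk = m, n, k

    def working_set(tm, tn, tk):
        # A tile (tm x tk) + B tile (tk x tn) + C tile (tm x tn)
        return (tm * tk + tk * tn + tm * tn) * dtype_bytes

    while working_set(tm, tn, tk) > sram_bytes:
        # Halve the largest dimension until the working set fits on chip.
        if tm >= tn and tm >= tk:
            tm = max(1, tm // 2)
        elif tn >= tk:
            tn = max(1, tn // 2)
        else:
            tk = max(1, tk // 2)
    return tm, tn, tk


# Example: a 4096x4096x4096 INT8 matmul is tiled down to fit a 512 KiB budget.
print(plan_matmul_tiles(4096, 4096, 4096))
```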
Aristide021 added a commit to Aristide021/tvm that referenced this pull request Aug 28, 2025
…cepts

This commit introduces a vendor-neutral NPU backend that demonstrates
architectural patterns common across Neural Processing Units.

The implementation covers key NPU concepts including multi-tier memory
hierarchy management, automatic tiling for large tensors, quantization
handling, and specialized execution engines. It shows how NPUs manage
memory across different tiers (L0/L1/L2/L3), tile operations to fit
in on-chip SRAM, and dispatch operations to dedicated compute units.

This serves as an educational template for developers creating NPU
backends, demonstrating BYOC integration while teaching NPU-specific
optimization strategies. Uses CPU emulation for testing without
requiring actual NPU hardware.

Addresses feedback from apache#18201 requesting generic NPU BYOC tutorials.