Recommendation for enabling MXLinear on XPU backends #2120
Replies: 2 comments 1 reply
Is MXLinear already supported on XPU today? If so, we could start by adding another utils function, similar to https://github.com/pytorch/torchtitan/blob/main/torchtitan/tools/utils.py#L20, to unblock.
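For reference, a per-backend capability helper in the spirit of the linked torchtitan utility might look like the sketch below. The function name, the SM100 threshold, and the XPU branch are illustrative assumptions, not an existing torchtitan or TorchAO API:

```python
# Hypothetical helper, modeled loosely on torchtitan's capability check,
# extended with a non-CUDA branch. All names here are illustrative.

def has_mx_capability() -> bool:
    """Best-effort check: does the current accelerator support MX dtypes?"""
    try:
        import torch
    except ImportError:
        return False
    if torch.cuda.is_available():
        # On CUDA, MXFP8/MXFP4 kernels require SM100 (Blackwell) or newer.
        return torch.cuda.get_device_capability() >= (10, 0)
    if hasattr(torch, "xpu") and torch.xpu.is_available():
        # Placeholder: an XPU build would need its own capability query here.
        return False
    return False
```

This unblocks a single call site, but note the scalability concern raised below about multiplying such checks.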
@tianyu-l: Trainium 3 has support for MXFP8 and MXFP4 (see the recent announcement: https://aws.amazon.com/ai/machine-learning/trainium/). Edit: technically not an Intel XPU, but backend extensions could be generic.

For internal prototypes we have already implemented a similar function, so we were never blocked and are not concerned about short-term progress. In fact, this question came out of our internal review of that prototype. IMHO, we would generally want to avoid a list of "has_X_capability" functions for every XPU, checked in multiple places. That seems error-prone and not scalable.

The real question is: what is the long-term design direction here? Is this really it, or is something else already planned? One possible solution is a registry through which different backends can register ops, customizations, and so on. Does the TorchAO team have something else in mind?
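To make the registry idea concrete, here is a minimal sketch of what such a mechanism could look like: each backend registers its capabilities once, and call sites query the registry instead of hard-coding per-device checks. This is purely illustrative; the names and the registered backends are assumptions, not an existing TorchAO API:

```python
# Illustrative backend-capability registry (not an existing TorchAO API).
_BACKEND_REGISTRY: dict[str, dict[str, bool]] = {}

def register_backend(name: str, **capabilities: bool) -> None:
    """Register a backend together with the features it supports."""
    _BACKEND_REGISTRY[name] = dict(capabilities)

def backend_supports(name: str, capability: str) -> bool:
    """Query whether a registered backend advertises a capability.

    Unknown backends or capabilities default to False."""
    return _BACKEND_REGISTRY.get(name, {}).get(capability, False)

# Hypothetical registrations, for illustration only:
register_backend("cuda_sm100", mxfp8=True, mxfp4=True)
register_backend("xpu", mxfp8=False, mxfp4=False)
```

A call site would then ask `backend_supports("xpu", "mxfp8")` rather than re-implementing the device check, so adding a new accelerator means one registration call instead of edits in multiple places.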
Hi team,
Currently, MXLinear runs only when CUDA compute capability ≥ SM100, due to a hard-coded capability check. This makes MXLinear inaccessible on XPUs and other non-SM100 accelerator paths.
What would be the recommended strategy for enabling MXLinear usage on XPU-based backends?
This will help ensure that frameworks relying on MXLinear can onboard non-CUDA backends without diverging from intended usage patterns.
Related question raised in TorchAO: pytorch/ao#3457
Thanks!