prototype float8 training quantizeable mm module #2807

vkuzo · 2025-08-19T14:30:30Z

Summary:

Adds a _QuantizeableMM module and a _Float8MM quantized version of it. This enables quantizing calls to torch.mm where none of the inputs are weights. It requires modeling changes at the callsite.

For now, this is added as a prototype with underscore-prefixed names, to help test out a customer need. We can make the API official later, once we get more signal on product-market fit.

Usage:

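import copy

import torch.nn as nn

from torchao.float8 import Float8LinearConfig, convert_to_float8_training
from torchao.float8.quantizeable_mm import _QuantizeableMM

class M(nn.Module):
    def __init__(self):
        super().__init__()
        # module form of torch.mm, so convert_to_float8_training can find it
        self.mm = _QuantizeableMM()

    def forward(self, a, b):
        c = self.mm(a, b)
        return c

# recipe_name is supplied by the caller
config = Float8LinearConfig.from_recipe_name(recipe_name)
# keep m_ref unconverted as a reference
m_ref = M()
m = copy.deepcopy(m_ref)
m = convert_to_float8_training(m, config=config)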
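
To illustrate the "modeling changes at the callsite" mentioned in the summary, here is a minimal sketch of why a module wrapper gives a converter something to replace, while a bare torch.mm call does not. PlainMM, TaggedMM, and swap_mm are hypothetical stand-ins invented for this sketch. They are not the torchao classes or the actual conversion logic, and TaggedMM does no quantization.

```python
import copy

import torch
import torch.nn as nn


class PlainMM(nn.Module):
    """Hypothetical stand-in for _QuantizeableMM: a module that only wraps torch.mm."""

    def forward(self, a, b):
        return torch.mm(a, b)


class TaggedMM(PlainMM):
    """Hypothetical stand-in for a swapped-in version; the math is unchanged here."""


class Before(nn.Module):
    # Functional call: there is no submodule for a converter to replace.
    def forward(self, a, b):
        return torch.mm(a, b)


class After(nn.Module):
    # Module call: self.mm is a named child that a converter can find and swap.
    def __init__(self):
        super().__init__()
        self.mm = PlainMM()

    def forward(self, a, b):
        return self.mm(a, b)


def swap_mm(model):
    """Replace every PlainMM child with TaggedMM, recursing into other submodules."""
    for name, child in list(model.named_children()):
        if type(child) is PlainMM:
            setattr(model, name, TaggedMM())
        else:
            swap_mm(child)
    return model


a, b = torch.randn(4, 8), torch.randn(8, 16)
m_ref = After()
m = swap_mm(copy.deepcopy(m_ref))
```

In this sketch, Before has no child modules, so a module-swapping converter has nothing to act on. After exposes the matmul as self.mm, which is the only change needed at the callsite.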

Test Plan:

pytest test/float8/test_base.py -s -x -k quantizeable_mm



pytorch-bot · 2025-08-19T14:30:34Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2807

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 3 New Failures

As of commit 72e6b38 with merge base 751d7f6:

NEW FAILURES - The following jobs have failed:

Run 1xH100 Tests / test (H100, linux.aws.h100, --pre torch torchvision torchaudio fbgemm-gpu-genai --index-url https... / linux-job (gh)
test/integration/test_integration.py::TestAutoQuant::test_autoquant_hp_float
Run 1xL4 Tests / test (SM-89, linux.g6.4xlarge.experimental.nvidia.gpu, --pre torch --index-url https://download.p... / linux-job (gh)
test/integration/test_integration.py::TestAutoQuant::test_autoquant_hp_float
Run Regression Tests / test-nightly (CUDA Nightly, linux.g5.12xlarge.nvidia.gpu, --pre torch --index-url https://downloa... / linux-job (gh)
test/integration/test_integration.py::TestAutoQuant::test_autoquant_hp_float

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot · 2025-08-19T14:32:12Z

@vkuzo has imported this pull request. If you are a Meta employee, you can view this in D80536689.

meta-cla bot added the CLA Signed label on Aug 19, 2025

vkuzo added the topic: improvement label on Aug 19, 2025
