
Avoid fp32 cast for Torch div operator #2241

Closed
HennerM wants to merge 3 commits

Conversation

@HennerM commented Jun 17, 2024

The `div` Torch op was always casting both operands to fp32, even if both operands are of type fp16. This cast should get removed by the `"common::add_fp16_cast"` optimization pass. However, it causes issues during the PyTorch conversion. For example, let's say we have a forward method like this:

```python
import torch


class Foo(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.proj = torch.nn.Linear(16, 1)

    def forward(self, x, y):  # both fp16 tensors, shape [1, 16]
        r = x / y  # r is now fp32
        return self.proj(r)  # Problem
```

Now if we have moved the model (and its parameters) to fp16 with e.g. `m = Foo().to(torch.float16)`, we get an error at conversion time:

> In op, of type linear, named linear_0, the named input `bias` must have the same data type as the named input `x`. However, bias has dtype fp16 whereas x has dtype fp32.

This is because the result of the `div` operation stays fp32, which doesn't match the resulting type of the PyTorch expression.
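
For reference, a conversion call along the lines below surfaces the error. It is an illustrative sketch, not part of the original report: it reuses `Foo` and the `import torch` from the snippet above, and the converter options shown are assumptions.

```python
import coremltools as ct
import numpy as np

m = Foo().to(torch.float16).eval()
x = torch.randn(1, 16, dtype=torch.float16)
y = torch.randn(1, 16, dtype=torch.float16)

# Tracing succeeds; the conversion step raises the dtype-mismatch error quoted above.
mlmodel = ct.convert(
    torch.jit.trace(m, (x, y)),
    inputs=[
        ct.TensorType(name="x", shape=x.shape, dtype=np.float16),
        ct.TensorType(name="y", shape=y.shape, dtype=np.float16),
    ],
    convert_to="mlprogram",
    compute_precision=ct.precision.FLOAT16,
    minimum_deployment_target=ct.target.iOS17,
)
```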

HennerM added 3 commits June 17, 2024 14:52
@TobyRoseman (Collaborator)

Please add a unit test to `test_torch_ops.py` which fails without your fix but passes with your fix.
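
A minimal sketch of what such a test could look like, written against `ct.convert` directly rather than the conventions in `test_torch_ops.py`; the structure and names here are assumptions, not the actual test added in this PR:

```python
import numpy as np
import torch

import coremltools as ct


def test_div_keeps_fp16_dtype():
    # An fp16 model whose forward uses the div op followed by a linear layer
    # should convert without a dtype-mismatch error.
    class Foo(torch.nn.Module):
        def __init__(self):
            super().__init__()
            self.proj = torch.nn.Linear(16, 1)

        def forward(self, x, y):
            return self.proj(x / y)

    x = torch.randn(1, 16, dtype=torch.float16)
    y = torch.randn(1, 16, dtype=torch.float16)
    traced = torch.jit.trace(Foo().half().eval(), (x, y))

    mlmodel = ct.convert(
        traced,
        inputs=[
            ct.TensorType(name="x", shape=x.shape, dtype=np.float16),
            ct.TensorType(name="y", shape=y.shape, dtype=np.float16),
        ],
        outputs=[ct.TensorType(name="output")],
        convert_to="mlprogram",
        compute_precision=ct.precision.FLOAT16,
        minimum_deployment_target=ct.target.iOS17,
    )
    assert mlmodel is not None
```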

@jeethu commented Jul 11, 2024

I stumbled across the same issue and managed to debug it. It turns out the root cause isn't the div op. This happens because the torch converter casts inputs to fp32 here.

Here's a minimal repro that does not use the div op and still fails with the same error:

```python
import coremltools as ct
import numpy as np
import torch


class Net(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.proj = torch.nn.Linear(16, 1)

    def forward(self, x):
        return self.proj(x)


x = torch.randn(1, 16, dtype=torch.float16)

with torch.no_grad():
    mlmodel = ct.convert(
        torch.jit.trace(Net().half().eval(), x),
        inputs=[ct.TensorType(name="x", shape=x.shape, dtype=np.float16)],
        outputs=[ct.TensorType(name="output")],
        convert_to="mlprogram",
        compute_precision=ct.precision.FLOAT16,
        minimum_deployment_target=ct.target.iOS17,
    )
```

This fails with the same exception as the snippet above with the div op:

> ValueError: In op, of type linear, named linear_0, the named input `bias` must have the same data type as the named input `weight`. However, bias has dtype fp16 whereas weight has dtype fp32.

I've got a fix for this in #2274.

@YifanShenSZ (Collaborator)

Hi @HennerM, `inputs=[ct.TensorType(dtype=np.float16)]` and `compute_precision=ct.precision.FLOAT16` are enough to obtain an fp16-input, fp16-computation Core ML model. There is no need to make the PyTorch model itself fp16.

Concretely, we internally translate the torch model in fp32. Then:

  • If given `compute_precision=ct.precision.FLOAT16`, we will insert fp16 casts to make the computation (i.e. weights & activations) fp16.
  • If given `inputs=[ct.TensorType(name="x", shape=x.shape, dtype=np.float16)]`, we will change the input signature for `x` to fp16.
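
For illustration, a minimal sketch of that workflow applied to the model from the description: the PyTorch model stays in fp32, and the converter handles both the fp16 input signature and the fp16 computation. The iOS16 deployment target is an assumption, chosen because fp16 model inputs require it.

```python
import coremltools as ct
import numpy as np
import torch


class Foo(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.proj = torch.nn.Linear(16, 1)

    def forward(self, x, y):
        return self.proj(x / y)


# Trace with fp32 example inputs; no .half() / .to(torch.float16) needed.
x = torch.randn(1, 16)
y = torch.randn(1, 16)

mlmodel = ct.convert(
    torch.jit.trace(Foo().eval(), (x, y)),
    inputs=[
        ct.TensorType(name="x", shape=x.shape, dtype=np.float16),
        ct.TensorType(name="y", shape=y.shape, dtype=np.float16),
    ],
    convert_to="mlprogram",
    compute_precision=ct.precision.FLOAT16,
    minimum_deployment_target=ct.target.iOS16,
)
```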
