
Some simplification patterns can create invalid IR due to type changes #2634

Open

christopherbate opened this issue Nov 21, 2024 · 2 comments

christopherbate (Contributor) commented Nov 21, 2024

What happened?

There are several places in the simplification transforms (e.g. https://github.com/openxla/stablehlo/blob/main/stablehlo/transforms/StablehloAggressiveFolder.cpp#L744) where a rewrite pattern unconditionally replaces a Value with a Value of a potentially different type. Sometimes this is OK, sometimes not.

In this IR, folding the stablehlo.subtract and replacing it directly with a constant of type tensor<1x1xi32> is OK:

func.func @example() -> tensor<?x?xi32> {
    %c = stablehlo.constant dense<1> : tensor<1x1xi32>
    %c1 = stablehlo.constant dense<2> : tensor<1x1xi32>
    %0 = stablehlo.subtract %c, %c1 : (tensor<1x1xi32>, tensor<1x1xi32>) -> tensor<?x?xi32>
    %1 = stablehlo.negate %0 : (tensor<?x?xi32>) -> tensor<?x?xi32>
    return %1 : tensor<?x?xi32>
}

However, if the user of %0 is a return or something like stablehlo.composite, then the result is invalid IR.
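For instance (a minimal sketch, not taken from an actual failing module), if %0 fed the return directly, replacing it with the folded constant would leave the returned value's type disagreeing with the function's declared result type, and the verifier rejects the module:

func.func @example_invalid() -> tensor<?x?xi32> {
    // After the naive fold, the static-typed constant is returned directly.
    %c = stablehlo.constant dense<-1> : tensor<1x1xi32>
    // verifier error: the returned type no longer matches the declared result type tensor<?x?xi32>
    return %c : tensor<1x1xi32>
}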

It seems hit-or-miss whether patterns check for, and are tested against, edge cases like the one above.

Rather than submitting PRs piecemeal each time we encounter a failing pattern, I'm wondering if there's something we can do across the board to address this issue in all the simplification passes.

christopherbate added a commit to NVIDIA/TensorRT-Incubator that referenced this issue Dec 4, 2024

Added a patch that was missing from the last StableHLO upgrade. This patch addresses some issues mentioned in openxla/stablehlo#2634.

An additional test is added to mlir-tensorrt as a regression test.

christopherbate added a commit to NVIDIA/TensorRT-Incubator that referenced this issue Dec 5, 2024

Added a patch that was missing from the last StableHLO upgrade. This patch addresses some issues mentioned in openxla/stablehlo#2634.

An additional test is added to mlir-tensorrt as a regression test.

Also makes some minor cleanup to CI/CD config in order to fix the caching mechanism, using the PR as a test-case.
GleasonK (Member) commented Feb 6, 2025

Thought I replied to this in the past -- is this example something you encounter in practice? Curious what the source of the IR is, as it isn't often that we see result types less precise than the input types. This IR in particular is something that could be fixed with basic type inference.

Also, I see your patches; feel free to upstream those! They look reasonable.

christopherbate (Contributor, Author) commented
@GleasonK yes, we see it in practice. It's a problem because, although StableHLO has a set of rules that allow updating types in place, those rules don't apply to other ops in use, like func.return, or even to certain StableHLO ops that have restrictions on their operand types, such as dynamic_slice or the composite op.

That implies that if you want to update a type in place, you have to iterate over all users to check whether the in-place update is valid. If there are more than two or so uses, it might be cheaper just to insert a cast-like operation instead, to avoid all the pointer chasing. I've never measured the cost, though.
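As a sketch of that cast-based option (assuming the upstream tensor.cast op is acceptable in the pipeline; this snippet is illustrative, not taken from an actual patch), the fold in the original example could keep every user valid by materializing the static-typed constant and casting it back to the original dynamic type:

    %c2 = stablehlo.constant dense<-1> : tensor<1x1xi32>
    // The cast restores the original, less-refined type so all existing users still verify.
    %cast = tensor.cast %c2 : tensor<1x1xi32> to tensor<?x?xi32>
    %1 = stablehlo.negate %cast : (tensor<?x?xi32>) -> tensor<?x?xi32>
    return %1 : tensor<?x?xi32>

A later canonicalization or type-refinement pass could then fold the cast away wherever propagating the more precise type is actually safe.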

The reason you might not observe the issue is perhaps the front end... we have an ongoing experiment where the front end creates StableHLO with the most general types possible, relying on the MLIR compiler to refine the types. That's where we encounter these sorts of issues/bugs.
