[xla:cpu] Disable SVE LLVM codegen by default on AArch64 CPUs. #23931

Merged 1 commit on Mar 20, 2025

copybara-service[bot]

@copybara-service copybara-service bot force-pushed the test_738363714 branch 6 times, most recently from fcd3059 to 137a836 Compare March 20, 2025 06:58
LLVM is missing many SVE lowerings, especially for the bf16 type. This causes program termination on SVE-capable machines such as Google Axion. Example error:
```
LLVM ERROR: Cannot select: 0xf7eb74164950: nxv4bf16 = AArch64ISD::UINT_TO_FP_MERGE_PASSTHRU 0xf7eb74186d60, 0xf7eb74186200, undef:nxv4bf16
  0xf7eb74186d60: nxv4i1 = AArch64ISD::PTRUE TargetConstant:i32<31>
    0xf7eb74191580: i32 = TargetConstant<31>
  0xf7eb74186200: nxv4i32,ch = load<(invariant load (<vscale x 1 x s32>) from %ir.scevgep16, !noalias !5), zext from nxv4i8> 0xf7eb740d8830, 0xf7eb74194070, undef:i64
    0xf7eb74194070: i64 = add 0xf7eb741922f0, 0xf7eb74193c10
      0xf7eb741922f0: i64,ch = CopyFromReg 0xf7eb740d8830, Register:i64 %7
        0xf7eb74191e90: i64 = Register %7
      0xf7eb74193c10: i64,ch = CopyFromReg 0xf7eb740d8830, Register:i64 %11
        0xf7eb74192750: i64 = Register %11
    0xf7eb74186350: i64 = undef
  0xf7eb74164a30: nxv4bf16 = undef
In function: convert.2_kernel
Fatal Python error: Aborted
```

Since most AArch64 machines still use 128-bit vector registers, SVE and NEON should not differ significantly in performance, so we disable SVE codegen in public builds for the time being.
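
To check whether a given Linux machine is in the affected group, one option is to look for the `sve` flag in the `Features` line of `/proc/cpuinfo`. The sketch below is not part of this change; the helper names `cpu_features` and `has_sve` are hypothetical, and it assumes the usual Linux AArch64 `/proc/cpuinfo` layout.

```python
def cpu_features(cpuinfo_text: str) -> set:
    """Collect the flags listed on every 'Features' line of /proc/cpuinfo text."""
    features = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Assumption: AArch64 Linux reports CPU flags under the key "Features".
        if sep and key.strip().lower() == "features":
            features.update(value.split())
    return features


def has_sve(cpuinfo_text: str) -> bool:
    """Return True if the CPU advertises SVE (exact token match, so 'sve2' alone does not count)."""
    return "sve" in cpu_features(cpuinfo_text)
```

On a real machine this could be called as `has_sve(open("/proc/cpuinfo").read())`.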

Once JAX picks up an XLA commit that includes this change, the following JAX tests will pass on Axion:
```
bazel test //tests:shape_poly_test_cpu --test_filter=*test_harness_vmap_convert_element_type_dtypes_to_dtypes_shape_bool_100_100_olddtype_bool_newdtype_bfloat16*

bazel test //tests:export_harnesses_multi_platform_test_cpu --test_filter=*test_prim_convert_element_type_dtypes_to_dtypes_shape_uint8_100_100_olddtype_uint8_newdtype_bfloat16*
```

To get the errors back after this change, add `--test_env=XLA_FLAGS=--xla_cpu_max_isa=""` to the `bazel test` options.
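
Outside of bazel, the same flag can be passed through the `XLA_FLAGS` environment variable. The following is a minimal sketch, not taken from the test suite: `with_xla_flag` and `reproduce` are hypothetical helpers, and the uint8 to bfloat16 conversion is assumed to exercise the same path as the second test above. Whether it actually aborts depends on the machine and the XLA build.

```python
import os


def with_xla_flag(existing: str, flag: str) -> str:
    """Append `flag` to a space-separated XLA_FLAGS string unless it is already present."""
    parts = existing.split()
    if flag not in parts:
        parts.append(flag)
    return " ".join(parts)


def reproduce():
    """Convert a uint8 array to bfloat16 with the max-ISA override cleared."""
    # An empty value mirrors --xla_cpu_max_isa="" from the bazel command line.
    # Set the variable before importing jax so XLA sees it when it parses its flags.
    os.environ["XLA_FLAGS"] = with_xla_flag(
        os.environ.get("XLA_FLAGS", ""), "--xla_cpu_max_isa="
    )
    import jax.numpy as jnp  # imported lazily so the environment is set first

    x = jnp.ones((100, 100), dtype=jnp.uint8)
    return x.astype(jnp.bfloat16)
```

Calling `reproduce()` in a fresh Python process on an SVE-capable machine is expected to hit the same conversion kernel; without the override, the conversion should complete normally.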

PiperOrigin-RevId: 738695223
@copybara-service copybara-service bot merged commit 2d29dca into main Mar 20, 2025
@copybara-service copybara-service bot deleted the test_738363714 branch March 20, 2025 07:26