[RISCV] Add short forward branch support for min, max, maxu and minu
#164394
Conversation
@llvm/pr-subscribers-backend-risc-v

Author: quic_hchandel (hchandel)

Changes

Patch is 22.29 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/164394.diff

4 Files Affected:
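As a quick orientation before the diff: a hedged, plain-C++ sketch of the behavior the new pseudos enable (this is an illustration only, not the backend code; the function name and register comments mirror the select_example_smax test added below). A select of a min/max result becomes a short forward branch that conditionally skips a single Zbb instruction.

// Plain-C++ illustration of what the SFB expansion achieves for
// select_example_smax: when the predicate is false, the conditional
// branch skips a single Zbb max instruction.
#include <algorithm>
#include <cstdint>

int32_t select_example_smax(int32_t a, int32_t b, bool x, int32_t y) {
  int32_t res = b;               // start with the "select is false" value
  if (x)                         // beqz a2, .LBB0_2  -- short forward branch
    res = std::max(a, y);        // max a1, a0, a3    -- the predicated op
  return res;                    // mv a0, a1; ret
}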
diff --git a/llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp b/llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp
index 410561855e181..567a8da50a1db 100644
--- a/llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp
+++ b/llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp
@@ -127,6 +127,10 @@ bool RISCVExpandPseudo::expandMI(MachineBasicBlock &MBB,
case RISCV::PseudoCCAND:
case RISCV::PseudoCCOR:
case RISCV::PseudoCCXOR:
+ case RISCV::PseudoCCMAX:
+ case RISCV::PseudoCCMAXU:
+ case RISCV::PseudoCCMIN:
+ case RISCV::PseudoCCMINU:
case RISCV::PseudoCCADDW:
case RISCV::PseudoCCSUBW:
case RISCV::PseudoCCSLL:
@@ -228,6 +232,10 @@ bool RISCVExpandPseudo::expandCCOp(MachineBasicBlock &MBB,
case RISCV::PseudoCCAND: NewOpc = RISCV::AND; break;
case RISCV::PseudoCCOR: NewOpc = RISCV::OR; break;
case RISCV::PseudoCCXOR: NewOpc = RISCV::XOR; break;
+ case RISCV::PseudoCCMAX: NewOpc = RISCV::MAX; break;
+ case RISCV::PseudoCCMIN: NewOpc = RISCV::MIN; break;
+ case RISCV::PseudoCCMAXU: NewOpc = RISCV::MAXU; break;
+ case RISCV::PseudoCCMINU: NewOpc = RISCV::MINU; break;
case RISCV::PseudoCCADDI: NewOpc = RISCV::ADDI; break;
case RISCV::PseudoCCSLLI: NewOpc = RISCV::SLLI; break;
case RISCV::PseudoCCSRLI: NewOpc = RISCV::SRLI; break;
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
index ddb53a2ce62b3..435df1e4b91b6 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
@@ -1698,6 +1698,10 @@ unsigned getPredicatedOpcode(unsigned Opcode) {
case RISCV::AND: return RISCV::PseudoCCAND; break;
case RISCV::OR: return RISCV::PseudoCCOR; break;
case RISCV::XOR: return RISCV::PseudoCCXOR; break;
+ case RISCV::MAX: return RISCV::PseudoCCMAX; break;
+ case RISCV::MAXU: return RISCV::PseudoCCMAXU; break;
+ case RISCV::MIN: return RISCV::PseudoCCMIN; break;
+ case RISCV::MINU: return RISCV::PseudoCCMINU; break;
case RISCV::ADDI: return RISCV::PseudoCCADDI; break;
case RISCV::SLLI: return RISCV::PseudoCCSLLI; break;
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfoSFB.td b/llvm/lib/Target/RISCV/RISCVInstrInfoSFB.td
index 0114fbdc56302..5a67a5aaba293 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfoSFB.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfoSFB.td
@@ -106,6 +106,10 @@ def PseudoCCSRA : SFBALU_rr;
def PseudoCCAND : SFBALU_rr;
def PseudoCCOR : SFBALU_rr;
def PseudoCCXOR : SFBALU_rr;
+def PseudoCCMAX : SFBALU_rr;
+def PseudoCCMIN : SFBALU_rr;
+def PseudoCCMAXU : SFBALU_rr;
+def PseudoCCMINU : SFBALU_rr;
def PseudoCCADDI : SFBALU_ri;
def PseudoCCANDI : SFBALU_ri;
diff --git a/llvm/test/CodeGen/RISCV/short-forward-branch-opt-min-max.ll b/llvm/test/CodeGen/RISCV/short-forward-branch-opt-min-max.ll
new file mode 100644
index 0000000000000..9fa4e350aced9
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/short-forward-branch-opt-min-max.ll
@@ -0,0 +1,539 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 6
+; RUN: llc < %s -mtriple=riscv32 | FileCheck %s --check-prefixes=RV32I-NOZBB
+; RUN: llc < %s -mtriple=riscv64 | FileCheck %s --check-prefixes=RV64I-NOZBB
+; RUN: llc < %s -mtriple=riscv32 -mattr=+zbb,+short-forward-branch-opt | \
+; RUN: FileCheck %s --check-prefixes=RV32I-SFB-ZBB
+; RUN: llc < %s -mtriple=riscv64 -mattr=+zbb,+short-forward-branch-opt | \
+; RUN: FileCheck %s --check-prefixes=RV64I-SFB-ZBB
+
+define i32 @select_example_smax(i32 %a, i32 %b, i1 zeroext %x, i32 %y) {
+; RV32I-NOZBB-LABEL: select_example_smax:
+; RV32I-NOZBB: # %bb.0: # %entry
+; RV32I-NOZBB-NEXT: bge a3, a0, .LBB0_3
+; RV32I-NOZBB-NEXT: # %bb.1: # %entry
+; RV32I-NOZBB-NEXT: beqz a2, .LBB0_4
+; RV32I-NOZBB-NEXT: .LBB0_2: # %entry
+; RV32I-NOZBB-NEXT: ret
+; RV32I-NOZBB-NEXT: .LBB0_3: # %entry
+; RV32I-NOZBB-NEXT: mv a0, a3
+; RV32I-NOZBB-NEXT: bnez a2, .LBB0_2
+; RV32I-NOZBB-NEXT: .LBB0_4: # %entry
+; RV32I-NOZBB-NEXT: mv a0, a1
+; RV32I-NOZBB-NEXT: ret
+;
+; RV64I-NOZBB-LABEL: select_example_smax:
+; RV64I-NOZBB: # %bb.0: # %entry
+; RV64I-NOZBB-NEXT: sext.w a0, a0
+; RV64I-NOZBB-NEXT: sext.w a3, a3
+; RV64I-NOZBB-NEXT: bge a3, a0, .LBB0_3
+; RV64I-NOZBB-NEXT: # %bb.1: # %entry
+; RV64I-NOZBB-NEXT: beqz a2, .LBB0_4
+; RV64I-NOZBB-NEXT: .LBB0_2: # %entry
+; RV64I-NOZBB-NEXT: ret
+; RV64I-NOZBB-NEXT: .LBB0_3: # %entry
+; RV64I-NOZBB-NEXT: mv a0, a3
+; RV64I-NOZBB-NEXT: bnez a2, .LBB0_2
+; RV64I-NOZBB-NEXT: .LBB0_4: # %entry
+; RV64I-NOZBB-NEXT: mv a0, a1
+; RV64I-NOZBB-NEXT: ret
+;
+; RV32I-SFB-ZBB-LABEL: select_example_smax:
+; RV32I-SFB-ZBB: # %bb.0: # %entry
+; RV32I-SFB-ZBB-NEXT: beqz a2, .LBB0_2
+; RV32I-SFB-ZBB-NEXT: # %bb.1: # %entry
+; RV32I-SFB-ZBB-NEXT: max a1, a0, a3
+; RV32I-SFB-ZBB-NEXT: .LBB0_2: # %entry
+; RV32I-SFB-ZBB-NEXT: mv a0, a1
+; RV32I-SFB-ZBB-NEXT: ret
+;
+; RV64I-SFB-ZBB-LABEL: select_example_smax:
+; RV64I-SFB-ZBB: # %bb.0: # %entry
+; RV64I-SFB-ZBB-NEXT: sext.w a3, a3
+; RV64I-SFB-ZBB-NEXT: sext.w a0, a0
+; RV64I-SFB-ZBB-NEXT: beqz a2, .LBB0_2
+; RV64I-SFB-ZBB-NEXT: # %bb.1: # %entry
+; RV64I-SFB-ZBB-NEXT: max a1, a0, a3
+; RV64I-SFB-ZBB-NEXT: .LBB0_2: # %entry
+; RV64I-SFB-ZBB-NEXT: mv a0, a1
+; RV64I-SFB-ZBB-NEXT: ret
+entry:
+ %res = call i32 @llvm.smax.i32(i32 %a, i32 %y)
+ %sel = select i1 %x, i32 %res, i32 %b
+ ret i32 %sel
+}
+
+define i32 @select_example_smin(i32 %a, i32 %b, i1 zeroext %x, i32 %y) {
+; RV32I-NOZBB-LABEL: select_example_smin:
+; RV32I-NOZBB: # %bb.0: # %entry
+; RV32I-NOZBB-NEXT: bge a0, a3, .LBB1_3
+; RV32I-NOZBB-NEXT: # %bb.1: # %entry
+; RV32I-NOZBB-NEXT: beqz a2, .LBB1_4
+; RV32I-NOZBB-NEXT: .LBB1_2: # %entry
+; RV32I-NOZBB-NEXT: ret
+; RV32I-NOZBB-NEXT: .LBB1_3: # %entry
+; RV32I-NOZBB-NEXT: mv a0, a3
+; RV32I-NOZBB-NEXT: bnez a2, .LBB1_2
+; RV32I-NOZBB-NEXT: .LBB1_4: # %entry
+; RV32I-NOZBB-NEXT: mv a0, a1
+; RV32I-NOZBB-NEXT: ret
+;
+; RV64I-NOZBB-LABEL: select_example_smin:
+; RV64I-NOZBB: # %bb.0: # %entry
+; RV64I-NOZBB-NEXT: sext.w a3, a3
+; RV64I-NOZBB-NEXT: sext.w a0, a0
+; RV64I-NOZBB-NEXT: bge a0, a3, .LBB1_3
+; RV64I-NOZBB-NEXT: # %bb.1: # %entry
+; RV64I-NOZBB-NEXT: beqz a2, .LBB1_4
+; RV64I-NOZBB-NEXT: .LBB1_2: # %entry
+; RV64I-NOZBB-NEXT: ret
+; RV64I-NOZBB-NEXT: .LBB1_3: # %entry
+; RV64I-NOZBB-NEXT: mv a0, a3
+; RV64I-NOZBB-NEXT: bnez a2, .LBB1_2
+; RV64I-NOZBB-NEXT: .LBB1_4: # %entry
+; RV64I-NOZBB-NEXT: mv a0, a1
+; RV64I-NOZBB-NEXT: ret
+;
+; RV32I-SFB-ZBB-LABEL: select_example_smin:
+; RV32I-SFB-ZBB: # %bb.0: # %entry
+; RV32I-SFB-ZBB-NEXT: beqz a2, .LBB1_2
+; RV32I-SFB-ZBB-NEXT: # %bb.1: # %entry
+; RV32I-SFB-ZBB-NEXT: min a1, a0, a3
+; RV32I-SFB-ZBB-NEXT: .LBB1_2: # %entry
+; RV32I-SFB-ZBB-NEXT: mv a0, a1
+; RV32I-SFB-ZBB-NEXT: ret
+;
+; RV64I-SFB-ZBB-LABEL: select_example_smin:
+; RV64I-SFB-ZBB: # %bb.0: # %entry
+; RV64I-SFB-ZBB-NEXT: sext.w a3, a3
+; RV64I-SFB-ZBB-NEXT: sext.w a0, a0
+; RV64I-SFB-ZBB-NEXT: beqz a2, .LBB1_2
+; RV64I-SFB-ZBB-NEXT: # %bb.1: # %entry
+; RV64I-SFB-ZBB-NEXT: min a1, a0, a3
+; RV64I-SFB-ZBB-NEXT: .LBB1_2: # %entry
+; RV64I-SFB-ZBB-NEXT: mv a0, a1
+; RV64I-SFB-ZBB-NEXT: ret
+entry:
+ %res = call i32 @llvm.smin.i32(i32 %a, i32 %y)
+ %sel = select i1 %x, i32 %res, i32 %b
+ ret i32 %sel
+}
+
+define i32 @select_example_umax(i32 %a, i32 %b, i1 zeroext %x, i32 %y) {
+; RV32I-NOZBB-LABEL: select_example_umax:
+; RV32I-NOZBB: # %bb.0: # %entry
+; RV32I-NOZBB-NEXT: bgeu a3, a0, .LBB2_3
+; RV32I-NOZBB-NEXT: # %bb.1: # %entry
+; RV32I-NOZBB-NEXT: beqz a2, .LBB2_4
+; RV32I-NOZBB-NEXT: .LBB2_2: # %entry
+; RV32I-NOZBB-NEXT: ret
+; RV32I-NOZBB-NEXT: .LBB2_3: # %entry
+; RV32I-NOZBB-NEXT: mv a0, a3
+; RV32I-NOZBB-NEXT: bnez a2, .LBB2_2
+; RV32I-NOZBB-NEXT: .LBB2_4: # %entry
+; RV32I-NOZBB-NEXT: mv a0, a1
+; RV32I-NOZBB-NEXT: ret
+;
+; RV64I-NOZBB-LABEL: select_example_umax:
+; RV64I-NOZBB: # %bb.0: # %entry
+; RV64I-NOZBB-NEXT: sext.w a0, a0
+; RV64I-NOZBB-NEXT: sext.w a3, a3
+; RV64I-NOZBB-NEXT: bgeu a3, a0, .LBB2_3
+; RV64I-NOZBB-NEXT: # %bb.1: # %entry
+; RV64I-NOZBB-NEXT: beqz a2, .LBB2_4
+; RV64I-NOZBB-NEXT: .LBB2_2: # %entry
+; RV64I-NOZBB-NEXT: ret
+; RV64I-NOZBB-NEXT: .LBB2_3: # %entry
+; RV64I-NOZBB-NEXT: mv a0, a3
+; RV64I-NOZBB-NEXT: bnez a2, .LBB2_2
+; RV64I-NOZBB-NEXT: .LBB2_4: # %entry
+; RV64I-NOZBB-NEXT: mv a0, a1
+; RV64I-NOZBB-NEXT: ret
+;
+; RV32I-SFB-ZBB-LABEL: select_example_umax:
+; RV32I-SFB-ZBB: # %bb.0: # %entry
+; RV32I-SFB-ZBB-NEXT: beqz a2, .LBB2_2
+; RV32I-SFB-ZBB-NEXT: # %bb.1: # %entry
+; RV32I-SFB-ZBB-NEXT: maxu a1, a0, a3
+; RV32I-SFB-ZBB-NEXT: .LBB2_2: # %entry
+; RV32I-SFB-ZBB-NEXT: mv a0, a1
+; RV32I-SFB-ZBB-NEXT: ret
+;
+; RV64I-SFB-ZBB-LABEL: select_example_umax:
+; RV64I-SFB-ZBB: # %bb.0: # %entry
+; RV64I-SFB-ZBB-NEXT: sext.w a3, a3
+; RV64I-SFB-ZBB-NEXT: sext.w a0, a0
+; RV64I-SFB-ZBB-NEXT: beqz a2, .LBB2_2
+; RV64I-SFB-ZBB-NEXT: # %bb.1: # %entry
+; RV64I-SFB-ZBB-NEXT: maxu a1, a0, a3
+; RV64I-SFB-ZBB-NEXT: .LBB2_2: # %entry
+; RV64I-SFB-ZBB-NEXT: mv a0, a1
+; RV64I-SFB-ZBB-NEXT: ret
+entry:
+ %res = call i32 @llvm.umax.i32(i32 %a, i32 %y)
+ %sel = select i1 %x, i32 %res, i32 %b
+ ret i32 %sel
+}
+
+define i32 @select_example_umin(i32 %a, i32 %b, i1 zeroext %x, i32 %y) {
+; RV32I-NOZBB-LABEL: select_example_umin:
+; RV32I-NOZBB: # %bb.0: # %entry
+; RV32I-NOZBB-NEXT: bgeu a0, a3, .LBB3_3
+; RV32I-NOZBB-NEXT: # %bb.1: # %entry
+; RV32I-NOZBB-NEXT: beqz a2, .LBB3_4
+; RV32I-NOZBB-NEXT: .LBB3_2: # %entry
+; RV32I-NOZBB-NEXT: ret
+; RV32I-NOZBB-NEXT: .LBB3_3: # %entry
+; RV32I-NOZBB-NEXT: mv a0, a3
+; RV32I-NOZBB-NEXT: bnez a2, .LBB3_2
+; RV32I-NOZBB-NEXT: .LBB3_4: # %entry
+; RV32I-NOZBB-NEXT: mv a0, a1
+; RV32I-NOZBB-NEXT: ret
+;
+; RV64I-NOZBB-LABEL: select_example_umin:
+; RV64I-NOZBB: # %bb.0: # %entry
+; RV64I-NOZBB-NEXT: sext.w a3, a3
+; RV64I-NOZBB-NEXT: sext.w a0, a0
+; RV64I-NOZBB-NEXT: bgeu a0, a3, .LBB3_3
+; RV64I-NOZBB-NEXT: # %bb.1: # %entry
+; RV64I-NOZBB-NEXT: beqz a2, .LBB3_4
+; RV64I-NOZBB-NEXT: .LBB3_2: # %entry
+; RV64I-NOZBB-NEXT: ret
+; RV64I-NOZBB-NEXT: .LBB3_3: # %entry
+; RV64I-NOZBB-NEXT: mv a0, a3
+; RV64I-NOZBB-NEXT: bnez a2, .LBB3_2
+; RV64I-NOZBB-NEXT: .LBB3_4: # %entry
+; RV64I-NOZBB-NEXT: mv a0, a1
+; RV64I-NOZBB-NEXT: ret
+;
+; RV32I-SFB-ZBB-LABEL: select_example_umin:
+; RV32I-SFB-ZBB: # %bb.0: # %entry
+; RV32I-SFB-ZBB-NEXT: beqz a2, .LBB3_2
+; RV32I-SFB-ZBB-NEXT: # %bb.1: # %entry
+; RV32I-SFB-ZBB-NEXT: minu a1, a0, a3
+; RV32I-SFB-ZBB-NEXT: .LBB3_2: # %entry
+; RV32I-SFB-ZBB-NEXT: mv a0, a1
+; RV32I-SFB-ZBB-NEXT: ret
+;
+; RV64I-SFB-ZBB-LABEL: select_example_umin:
+; RV64I-SFB-ZBB: # %bb.0: # %entry
+; RV64I-SFB-ZBB-NEXT: sext.w a3, a3
+; RV64I-SFB-ZBB-NEXT: sext.w a0, a0
+; RV64I-SFB-ZBB-NEXT: beqz a2, .LBB3_2
+; RV64I-SFB-ZBB-NEXT: # %bb.1: # %entry
+; RV64I-SFB-ZBB-NEXT: minu a1, a0, a3
+; RV64I-SFB-ZBB-NEXT: .LBB3_2: # %entry
+; RV64I-SFB-ZBB-NEXT: mv a0, a1
+; RV64I-SFB-ZBB-NEXT: ret
+entry:
+ %res = call i32 @llvm.umin.i32(i32 %a, i32 %y)
+ %sel = select i1 %x, i32 %res, i32 %b
+ ret i32 %sel
+}
+
+define i64 @select_example_smax_1(i64 %a, i64 %b, i1 zeroext %x, i64 %y) {
+; RV32I-NOZBB-LABEL: select_example_smax_1:
+; RV32I-NOZBB: # %bb.0: # %entry
+; RV32I-NOZBB-NEXT: beq a1, a6, .LBB4_2
+; RV32I-NOZBB-NEXT: # %bb.1: # %entry
+; RV32I-NOZBB-NEXT: slt a7, a6, a1
+; RV32I-NOZBB-NEXT: beqz a7, .LBB4_3
+; RV32I-NOZBB-NEXT: j .LBB4_4
+; RV32I-NOZBB-NEXT: .LBB4_2:
+; RV32I-NOZBB-NEXT: sltu a7, a5, a0
+; RV32I-NOZBB-NEXT: bnez a7, .LBB4_4
+; RV32I-NOZBB-NEXT: .LBB4_3: # %entry
+; RV32I-NOZBB-NEXT: mv a1, a6
+; RV32I-NOZBB-NEXT: mv a0, a5
+; RV32I-NOZBB-NEXT: .LBB4_4: # %entry
+; RV32I-NOZBB-NEXT: beqz a4, .LBB4_6
+; RV32I-NOZBB-NEXT: # %bb.5: # %entry
+; RV32I-NOZBB-NEXT: ret
+; RV32I-NOZBB-NEXT: .LBB4_6: # %entry
+; RV32I-NOZBB-NEXT: mv a0, a2
+; RV32I-NOZBB-NEXT: mv a1, a3
+; RV32I-NOZBB-NEXT: ret
+;
+; RV64I-NOZBB-LABEL: select_example_smax_1:
+; RV64I-NOZBB: # %bb.0: # %entry
+; RV64I-NOZBB-NEXT: bge a3, a0, .LBB4_3
+; RV64I-NOZBB-NEXT: # %bb.1: # %entry
+; RV64I-NOZBB-NEXT: beqz a2, .LBB4_4
+; RV64I-NOZBB-NEXT: .LBB4_2: # %entry
+; RV64I-NOZBB-NEXT: ret
+; RV64I-NOZBB-NEXT: .LBB4_3: # %entry
+; RV64I-NOZBB-NEXT: mv a0, a3
+; RV64I-NOZBB-NEXT: bnez a2, .LBB4_2
+; RV64I-NOZBB-NEXT: .LBB4_4: # %entry
+; RV64I-NOZBB-NEXT: mv a0, a1
+; RV64I-NOZBB-NEXT: ret
+;
+; RV32I-SFB-ZBB-LABEL: select_example_smax_1:
+; RV32I-SFB-ZBB: # %bb.0: # %entry
+; RV32I-SFB-ZBB-NEXT: sltu a7, a5, a0
+; RV32I-SFB-ZBB-NEXT: slt t0, a6, a1
+; RV32I-SFB-ZBB-NEXT: bne a1, a6, .LBB4_2
+; RV32I-SFB-ZBB-NEXT: # %bb.1: # %entry
+; RV32I-SFB-ZBB-NEXT: mv t0, a7
+; RV32I-SFB-ZBB-NEXT: .LBB4_2: # %entry
+; RV32I-SFB-ZBB-NEXT: bnez t0, .LBB4_4
+; RV32I-SFB-ZBB-NEXT: # %bb.3: # %entry
+; RV32I-SFB-ZBB-NEXT: mv a1, a6
+; RV32I-SFB-ZBB-NEXT: .LBB4_4: # %entry
+; RV32I-SFB-ZBB-NEXT: bnez t0, .LBB4_6
+; RV32I-SFB-ZBB-NEXT: # %bb.5: # %entry
+; RV32I-SFB-ZBB-NEXT: mv a0, a5
+; RV32I-SFB-ZBB-NEXT: .LBB4_6: # %entry
+; RV32I-SFB-ZBB-NEXT: bnez a4, .LBB4_8
+; RV32I-SFB-ZBB-NEXT: # %bb.7: # %entry
+; RV32I-SFB-ZBB-NEXT: mv a0, a2
+; RV32I-SFB-ZBB-NEXT: .LBB4_8: # %entry
+; RV32I-SFB-ZBB-NEXT: bnez a4, .LBB4_10
+; RV32I-SFB-ZBB-NEXT: # %bb.9: # %entry
+; RV32I-SFB-ZBB-NEXT: mv a1, a3
+; RV32I-SFB-ZBB-NEXT: .LBB4_10: # %entry
+; RV32I-SFB-ZBB-NEXT: ret
+;
+; RV64I-SFB-ZBB-LABEL: select_example_smax_1:
+; RV64I-SFB-ZBB: # %bb.0: # %entry
+; RV64I-SFB-ZBB-NEXT: beqz a2, .LBB4_2
+; RV64I-SFB-ZBB-NEXT: # %bb.1: # %entry
+; RV64I-SFB-ZBB-NEXT: max a1, a0, a3
+; RV64I-SFB-ZBB-NEXT: .LBB4_2: # %entry
+; RV64I-SFB-ZBB-NEXT: mv a0, a1
+; RV64I-SFB-ZBB-NEXT: ret
+entry:
+ %res = call i64 @llvm.smax.i64(i64 %a, i64 %y)
+ %sel = select i1 %x, i64 %res, i64 %b
+ ret i64 %sel
+}
+
+define i64 @select_example_smin_1(i64 %a, i64 %b, i1 zeroext %x, i64 %y) {
+; RV32I-NOZBB-LABEL: select_example_smin_1:
+; RV32I-NOZBB: # %bb.0: # %entry
+; RV32I-NOZBB-NEXT: beq a1, a6, .LBB5_2
+; RV32I-NOZBB-NEXT: # %bb.1: # %entry
+; RV32I-NOZBB-NEXT: slt a7, a1, a6
+; RV32I-NOZBB-NEXT: beqz a7, .LBB5_3
+; RV32I-NOZBB-NEXT: j .LBB5_4
+; RV32I-NOZBB-NEXT: .LBB5_2:
+; RV32I-NOZBB-NEXT: sltu a7, a0, a5
+; RV32I-NOZBB-NEXT: bnez a7, .LBB5_4
+; RV32I-NOZBB-NEXT: .LBB5_3: # %entry
+; RV32I-NOZBB-NEXT: mv a1, a6
+; RV32I-NOZBB-NEXT: mv a0, a5
+; RV32I-NOZBB-NEXT: .LBB5_4: # %entry
+; RV32I-NOZBB-NEXT: beqz a4, .LBB5_6
+; RV32I-NOZBB-NEXT: # %bb.5: # %entry
+; RV32I-NOZBB-NEXT: ret
+; RV32I-NOZBB-NEXT: .LBB5_6: # %entry
+; RV32I-NOZBB-NEXT: mv a0, a2
+; RV32I-NOZBB-NEXT: mv a1, a3
+; RV32I-NOZBB-NEXT: ret
+;
+; RV64I-NOZBB-LABEL: select_example_smin_1:
+; RV64I-NOZBB: # %bb.0: # %entry
+; RV64I-NOZBB-NEXT: bge a0, a3, .LBB5_3
+; RV64I-NOZBB-NEXT: # %bb.1: # %entry
+; RV64I-NOZBB-NEXT: beqz a2, .LBB5_4
+; RV64I-NOZBB-NEXT: .LBB5_2: # %entry
+; RV64I-NOZBB-NEXT: ret
+; RV64I-NOZBB-NEXT: .LBB5_3: # %entry
+; RV64I-NOZBB-NEXT: mv a0, a3
+; RV64I-NOZBB-NEXT: bnez a2, .LBB5_2
+; RV64I-NOZBB-NEXT: .LBB5_4: # %entry
+; RV64I-NOZBB-NEXT: mv a0, a1
+; RV64I-NOZBB-NEXT: ret
+;
+; RV32I-SFB-ZBB-LABEL: select_example_smin_1:
+; RV32I-SFB-ZBB: # %bb.0: # %entry
+; RV32I-SFB-ZBB-NEXT: sltu a7, a0, a5
+; RV32I-SFB-ZBB-NEXT: slt t0, a1, a6
+; RV32I-SFB-ZBB-NEXT: bne a1, a6, .LBB5_2
+; RV32I-SFB-ZBB-NEXT: # %bb.1: # %entry
+; RV32I-SFB-ZBB-NEXT: mv t0, a7
+; RV32I-SFB-ZBB-NEXT: .LBB5_2: # %entry
+; RV32I-SFB-ZBB-NEXT: bnez t0, .LBB5_4
+; RV32I-SFB-ZBB-NEXT: # %bb.3: # %entry
+; RV32I-SFB-ZBB-NEXT: mv a1, a6
+; RV32I-SFB-ZBB-NEXT: .LBB5_4: # %entry
+; RV32I-SFB-ZBB-NEXT: bnez t0, .LBB5_6
+; RV32I-SFB-ZBB-NEXT: # %bb.5: # %entry
+; RV32I-SFB-ZBB-NEXT: mv a0, a5
+; RV32I-SFB-ZBB-NEXT: .LBB5_6: # %entry
+; RV32I-SFB-ZBB-NEXT: bnez a4, .LBB5_8
+; RV32I-SFB-ZBB-NEXT: # %bb.7: # %entry
+; RV32I-SFB-ZBB-NEXT: mv a0, a2
+; RV32I-SFB-ZBB-NEXT: .LBB5_8: # %entry
+; RV32I-SFB-ZBB-NEXT: bnez a4, .LBB5_10
+; RV32I-SFB-ZBB-NEXT: # %bb.9: # %entry
+; RV32I-SFB-ZBB-NEXT: mv a1, a3
+; RV32I-SFB-ZBB-NEXT: .LBB5_10: # %entry
+; RV32I-SFB-ZBB-NEXT: ret
+;
+; RV64I-SFB-ZBB-LABEL: select_example_smin_1:
+; RV64I-SFB-ZBB: # %bb.0: # %entry
+; RV64I-SFB-ZBB-NEXT: beqz a2, .LBB5_2
+; RV64I-SFB-ZBB-NEXT: # %bb.1: # %entry
+; RV64I-SFB-ZBB-NEXT: min a1, a0, a3
+; RV64I-SFB-ZBB-NEXT: .LBB5_2: # %entry
+; RV64I-SFB-ZBB-NEXT: mv a0, a1
+; RV64I-SFB-ZBB-NEXT: ret
+entry:
+ %res = call i64 @llvm.smin.i64(i64 %a, i64 %y)
+ %sel = select i1 %x, i64 %res, i64 %b
+ ret i64 %sel
+}
+
+define i64 @select_example_umax_1(i64 %a, i64 %b, i1 zeroext %x, i64 %y) {
+; RV32I-NOZBB-LABEL: select_example_umax_1:
+; RV32I-NOZBB: # %bb.0: # %entry
+; RV32I-NOZBB-NEXT: beq a1, a6, .LBB6_2
+; RV32I-NOZBB-NEXT: # %bb.1: # %entry
+; RV32I-NOZBB-NEXT: sltu a7, a6, a1
+; RV32I-NOZBB-NEXT: beqz a7, .LBB6_3
+; RV32I-NOZBB-NEXT: j .LBB6_4
+; RV32I-NOZBB-NEXT: .LBB6_2:
+; RV32I-NOZBB-NEXT: sltu a7, a5, a0
+; RV32I-NOZBB-NEXT: bnez a7, .LBB6_4
+; RV32I-NOZBB-NEXT: .LBB6_3: # %entry
+; RV32I-NOZBB-NEXT: mv a1, a6
+; RV32I-NOZBB-NEXT: mv a0, a5
+; RV32I-NOZBB-NEXT: .LBB6_4: # %entry
+; RV32I-NOZBB-NEXT: beqz a4, .LBB6_6
+; RV32I-NOZBB-NEXT: # %bb.5: # %entry
+; RV32I-NOZBB-NEXT: ret
+; RV32I-NOZBB-NEXT: .LBB6_6: # %entry
+; RV32I-NOZBB-NEXT: mv a0, a2
+; RV32I-NOZBB-NEXT: mv a1, a3
+; RV32I-NOZBB-NEXT: ret
+;
+; RV64I-NOZBB-LABEL: select_example_umax_1:
+; RV64I-NOZBB: # %bb.0: # %entry
+; RV64I-NOZBB-NEXT: bgeu a3, a0, .LBB6_3
+; RV64I-NOZBB-NEXT: # %bb.1: # %entry
+; RV64I-NOZBB-NEXT: beqz a2, .LBB6_4
+; RV64I-NOZBB-NEXT: .LBB6_2: # %entry
+; RV64I-NOZBB-NEXT: ret
+; RV64I-NOZBB-NEXT: .LBB6_3: # %entry
+; RV64I-NOZBB-NEXT: mv a0, a3
+; RV64I-NOZBB-NEXT: bnez a2, .LBB6_2
+; RV64I-NOZBB-NEXT: .LBB6_4: # %entry
+; RV64I-NOZBB-NEXT: mv a0, a1
+; RV64I-NOZBB-NEXT: ret
+;
+; RV32I-SFB-ZBB-LABEL: select_example_umax_1:
+; RV32I-SFB-ZBB: # %bb.0: # %entry
+; RV32I-SFB-ZBB-NEXT: sltu a7, a5, a0
+; RV32I-SFB-ZBB-NEXT: sltu t0, a6, a1
+; RV32I-SFB-ZBB-NEXT: bne a1, a6, .LBB6_2
+; RV32I-SFB-ZBB-NEXT: # %bb.1: # %entry
+; RV32I-SFB-ZBB-NEXT: mv t0, a7
+; RV32I-SFB-ZBB-NEXT: .LBB6_2: # %entry
+; RV32I-SFB-ZBB-NEXT: bnez t0, .LBB6_4
+; RV32I-SFB-ZBB-NEXT: # %bb.3: # %entry
+; RV32I-SFB-ZBB-NEXT: mv a1, a6
+; RV32I-SFB-ZBB-NEXT: .LBB6_4: # %entry
+; RV32I-SFB-ZBB-NEXT: bnez t0, .LBB6_6
+; RV32I-SFB-ZBB-NEXT: # %bb.5: # %entry
+; RV32I-SFB-ZBB-NEXT: mv a0, a5
+; RV32I-SFB-ZBB-NEXT: .LBB6_6: # %entry
+; RV32I-SFB-ZBB-NEXT: bnez a4, .LBB6_8
+; RV32I-SFB-ZBB-NEXT: # %bb.7: # %entry
+; RV32I-SFB-ZBB-NEXT: mv a0, a2
+; RV32I-SFB-ZBB-NEXT: .LBB6_8: # %entry
+; RV32I-SFB-ZBB-NEXT: bnez a4, .LBB6_10
+; RV32I-SFB-ZBB-NEXT: # %bb.9: # %entry
+; RV32I-SFB-ZBB-NEXT: mv a1, a3
+; RV32I-SFB-ZBB-NEXT: .LBB6_10: # %entry
+; RV32I-SFB-ZBB-NEXT: ret
+;
+; RV64I-SFB-ZBB-LABEL: select_example_umax_1:
+; RV64I-SFB-ZBB: # %bb.0: # %entry
+; RV64I-SFB-ZBB-NEXT: beqz a2, .LBB6_2
+; RV64I-SFB-ZBB-NEXT: # %bb.1: # %entry
+; RV64I-SFB-ZBB-NEXT: maxu a1, a0, a3
+; RV64I-SFB-ZBB-NEXT: .LBB6_2: # %entry
+; RV64I-SFB-ZBB-NEXT: mv a0, a1
+; RV64I-SFB-ZBB-NEXT: ret
+entry:
+ %res = call i64 @llvm.umax.i64(i64 %a, i64 %y)
+ %sel = select i1 %x, i64 %res, i64 %b
+ ret i64 %sel
+}
+
+define i64 @select_example_umin_1(i64 %a, i64 %b, i1 zeroext %x, i64 %y) {
+; RV32I-NOZBB-LABEL: select_example_umin_1:
+; RV32I-NOZBB: # %bb.0: # %entry
+; RV32I-NOZBB-NEXT: beq a1, a6, .LBB7_2
+; RV32I-NOZBB-NEXT: # %bb.1: # %entry
+; RV32I-NOZBB-NEXT: sltu a7, a1, a6
+; RV32I-NOZBB-NEXT: beqz a7, .LBB7_3
+; RV32I-NOZBB-NEXT: j .LBB7_4
+; RV32I-NOZBB-NEXT: .LBB7_2:
+; RV32I-NOZBB-NEXT: sltu a7, a0, a5
+; RV32I-NOZBB-NEXT: bnez a7, .LBB7_4
+; RV32I-NOZBB-NEXT:...
[truncated]
✅ With the latest revision this PR passed the C/C++ code formatter.
case RISCV::PseudoCCAND: NewOpc = RISCV::AND; break;
case RISCV::PseudoCCOR: NewOpc = RISCV::OR; break;
case RISCV::PseudoCCXOR: NewOpc = RISCV::XOR; break;
case RISCV::PseudoCCMAX:
It would be good to have a consistent coding style. Were the changes here made by clang-format?
Yes, the changes were made by clang-format. When I kept the changes consistent with the existing coding style here, the clang-format PR check failed.
We can add a clang-format off/on directive around the switch.
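For instance, something along these lines around the affected cases (an illustrative sketch, not the exact code in the patch; only a few cases shown):

// clang-format off
switch (MI.getOpcode()) {   // hypothetical excerpt of the expandCCOp switch
default:
  llvm_unreachable("Unexpected opcode");
case RISCV::PseudoCCMAX:  NewOpc = RISCV::MAX;  break;
case RISCV::PseudoCCMIN:  NewOpc = RISCV::MIN;  break;
case RISCV::PseudoCCMAXU: NewOpc = RISCV::MAXU; break;
case RISCV::PseudoCCMINU: NewOpc = RISCV::MINU; break;
}
// clang-format on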
case RISCV::AND: return RISCV::PseudoCCAND; break;
case RISCV::OR: return RISCV::PseudoCCOR; break;
case RISCV::XOR: return RISCV::PseudoCCXOR; break;
case RISCV::MAX:
Same here. Were the changes made by clang-format?
; RUN: llc < %s -mtriple=riscv32 -mattr=+zbb,+short-forward-branch-opt | \
; RUN:   FileCheck %s --check-prefixes=RV32I-SFB-ZBB
; RUN: llc < %s -mtriple=riscv64 -mattr=+zbb,+short-forward-branch-opt | \
; RUN:   FileCheck %s --check-prefixes=RV64I-SFB-ZBB
Can you add run lines with just zbb enabled so that we can see the difference in code when SFB is enabled?
SiFive cores do not support short forward branch for MIN/MAX.
case RISCV::AND: return RISCV::PseudoCCAND; break;
case RISCV::OR: return RISCV::PseudoCCOR; break;
case RISCV::XOR: return RISCV::PseudoCCXOR; break;
case RISCV::MAX:
We really shouldn't have an unreachable break;. I guess that's a mistake I made originally?
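To make the point concrete, the pattern in question looks roughly like this (illustrative, based on the getPredicatedOpcode cases quoted above):

// With an unconditional return, the trailing break can never execute.
case RISCV::MAX: return RISCV::PseudoCCMAX; break;  // unreachable break
// Cleaner:
case RISCV::MAX: return RISCV::PseudoCCMAX;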
Patch to remove them #164481
We did it this way to avoid introducing tune features that depend on other tune features, but I see there are more instruction differences than I expected. I hoped that SiFive cores might not have Zbb when they don't do this fusion, but I should have checked closer.

Our plan is to add another feature to enable these SFB cases, which would be required. I was originally thinking of naming it The hypothetical The idea behind this naming scheme is it should be obvious for extending to other extensions, as I think we will also want it for a What are your thoughts?

I've asked Harsh to prepare this patch update (as well as other updates), but it would be good to hear that you like this direction before we upload the next version.
I'm not sure what a good name is. The instructions that aren't supported are the ones that are only available on PipeB in RISCVSchedSiFive7.td. That includes div/rem, mul, ctz, ctpop, rotate, shXadd, orc.b, bset(i), bclr(i), binv(i). Maybe we should rename the existing flag to TuneShortForwardBranchOptIALU? IALU being the scheduler class that covers the supported instructions?