
Conversation

@hchandel
Contributor

No description provided.

Change-Id: I7585f98422bf4b101fd44b1b4d6bc8584ca8cb53
Change-Id: Ibf73c7c9b42a4942a4baa18818cf98cf2916f199
Change-Id: I5a2deafae906b518f3379b2c4ba625cf0a76df79
@llvmbot
Member

llvmbot commented Oct 21, 2025

@llvm/pr-subscribers-backend-risc-v

Author: quic_hchandel (hchandel)

Changes

Patch is 22.29 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/164394.diff

4 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp (+8)
  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfo.cpp (+4)
  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfoSFB.td (+4)
  • (added) llvm/test/CodeGen/RISCV/short-forward-branch-opt-min-max.ll (+539)
diff --git a/llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp b/llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp
index 410561855e181..567a8da50a1db 100644
--- a/llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp
+++ b/llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp
@@ -127,6 +127,10 @@ bool RISCVExpandPseudo::expandMI(MachineBasicBlock &MBB,
   case RISCV::PseudoCCAND:
   case RISCV::PseudoCCOR:
   case RISCV::PseudoCCXOR:
+  case RISCV::PseudoCCMAX:
+  case RISCV::PseudoCCMAXU:
+  case RISCV::PseudoCCMIN:
+  case RISCV::PseudoCCMINU:
   case RISCV::PseudoCCADDW:
   case RISCV::PseudoCCSUBW:
   case RISCV::PseudoCCSLL:
@@ -228,6 +232,10 @@ bool RISCVExpandPseudo::expandCCOp(MachineBasicBlock &MBB,
     case RISCV::PseudoCCAND:   NewOpc = RISCV::AND;   break;
     case RISCV::PseudoCCOR:    NewOpc = RISCV::OR;    break;
     case RISCV::PseudoCCXOR:   NewOpc = RISCV::XOR;   break;
+    case RISCV::PseudoCCMAX:   NewOpc = RISCV::MAX;   break;
+    case RISCV::PseudoCCMIN:   NewOpc = RISCV::MIN;   break;
+    case RISCV::PseudoCCMAXU:  NewOpc = RISCV::MAXU;  break;
+    case RISCV::PseudoCCMINU:  NewOpc = RISCV::MINU;  break;
     case RISCV::PseudoCCADDI:  NewOpc = RISCV::ADDI;  break;
     case RISCV::PseudoCCSLLI:  NewOpc = RISCV::SLLI;  break;
     case RISCV::PseudoCCSRLI:  NewOpc = RISCV::SRLI;  break;
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
index ddb53a2ce62b3..435df1e4b91b6 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
@@ -1698,6 +1698,10 @@ unsigned getPredicatedOpcode(unsigned Opcode) {
   case RISCV::AND:   return RISCV::PseudoCCAND;   break;
   case RISCV::OR:    return RISCV::PseudoCCOR;    break;
   case RISCV::XOR:   return RISCV::PseudoCCXOR;   break;
+  case RISCV::MAX:   return RISCV::PseudoCCMAX;   break;
+  case RISCV::MAXU:  return RISCV::PseudoCCMAXU;  break;
+  case RISCV::MIN:   return RISCV::PseudoCCMIN;   break;
+  case RISCV::MINU:  return RISCV::PseudoCCMINU;  break;
 
   case RISCV::ADDI:  return RISCV::PseudoCCADDI;  break;
   case RISCV::SLLI:  return RISCV::PseudoCCSLLI;  break;
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfoSFB.td b/llvm/lib/Target/RISCV/RISCVInstrInfoSFB.td
index 0114fbdc56302..5a67a5aaba293 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfoSFB.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfoSFB.td
@@ -106,6 +106,10 @@ def PseudoCCSRA : SFBALU_rr;
 def PseudoCCAND : SFBALU_rr;
 def PseudoCCOR  : SFBALU_rr;
 def PseudoCCXOR : SFBALU_rr;
+def PseudoCCMAX : SFBALU_rr;
+def PseudoCCMIN : SFBALU_rr;
+def PseudoCCMAXU : SFBALU_rr;
+def PseudoCCMINU : SFBALU_rr;
 
 def PseudoCCADDI : SFBALU_ri;
 def PseudoCCANDI : SFBALU_ri;
diff --git a/llvm/test/CodeGen/RISCV/short-forward-branch-opt-min-max.ll b/llvm/test/CodeGen/RISCV/short-forward-branch-opt-min-max.ll
new file mode 100644
index 0000000000000..9fa4e350aced9
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/short-forward-branch-opt-min-max.ll
@@ -0,0 +1,539 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 6
+; RUN: llc < %s -mtriple=riscv32 | FileCheck %s --check-prefixes=RV32I-NOZBB
+; RUN: llc < %s -mtriple=riscv64 | FileCheck %s --check-prefixes=RV64I-NOZBB
+; RUN: llc < %s -mtriple=riscv32 -mattr=+zbb,+short-forward-branch-opt | \
+; RUN:   FileCheck %s --check-prefixes=RV32I-SFB-ZBB
+; RUN: llc < %s -mtriple=riscv64 -mattr=+zbb,+short-forward-branch-opt | \
+; RUN:   FileCheck %s --check-prefixes=RV64I-SFB-ZBB
+
+define i32 @select_example_smax(i32 %a, i32 %b, i1 zeroext %x, i32 %y) {
+; RV32I-NOZBB-LABEL: select_example_smax:
+; RV32I-NOZBB:       # %bb.0: # %entry
+; RV32I-NOZBB-NEXT:    bge a3, a0, .LBB0_3
+; RV32I-NOZBB-NEXT:  # %bb.1: # %entry
+; RV32I-NOZBB-NEXT:    beqz a2, .LBB0_4
+; RV32I-NOZBB-NEXT:  .LBB0_2: # %entry
+; RV32I-NOZBB-NEXT:    ret
+; RV32I-NOZBB-NEXT:  .LBB0_3: # %entry
+; RV32I-NOZBB-NEXT:    mv a0, a3
+; RV32I-NOZBB-NEXT:    bnez a2, .LBB0_2
+; RV32I-NOZBB-NEXT:  .LBB0_4: # %entry
+; RV32I-NOZBB-NEXT:    mv a0, a1
+; RV32I-NOZBB-NEXT:    ret
+;
+; RV64I-NOZBB-LABEL: select_example_smax:
+; RV64I-NOZBB:       # %bb.0: # %entry
+; RV64I-NOZBB-NEXT:    sext.w a0, a0
+; RV64I-NOZBB-NEXT:    sext.w a3, a3
+; RV64I-NOZBB-NEXT:    bge a3, a0, .LBB0_3
+; RV64I-NOZBB-NEXT:  # %bb.1: # %entry
+; RV64I-NOZBB-NEXT:    beqz a2, .LBB0_4
+; RV64I-NOZBB-NEXT:  .LBB0_2: # %entry
+; RV64I-NOZBB-NEXT:    ret
+; RV64I-NOZBB-NEXT:  .LBB0_3: # %entry
+; RV64I-NOZBB-NEXT:    mv a0, a3
+; RV64I-NOZBB-NEXT:    bnez a2, .LBB0_2
+; RV64I-NOZBB-NEXT:  .LBB0_4: # %entry
+; RV64I-NOZBB-NEXT:    mv a0, a1
+; RV64I-NOZBB-NEXT:    ret
+;
+; RV32I-SFB-ZBB-LABEL: select_example_smax:
+; RV32I-SFB-ZBB:       # %bb.0: # %entry
+; RV32I-SFB-ZBB-NEXT:    beqz a2, .LBB0_2
+; RV32I-SFB-ZBB-NEXT:  # %bb.1: # %entry
+; RV32I-SFB-ZBB-NEXT:    max a1, a0, a3
+; RV32I-SFB-ZBB-NEXT:  .LBB0_2: # %entry
+; RV32I-SFB-ZBB-NEXT:    mv a0, a1
+; RV32I-SFB-ZBB-NEXT:    ret
+;
+; RV64I-SFB-ZBB-LABEL: select_example_smax:
+; RV64I-SFB-ZBB:       # %bb.0: # %entry
+; RV64I-SFB-ZBB-NEXT:    sext.w a3, a3
+; RV64I-SFB-ZBB-NEXT:    sext.w a0, a0
+; RV64I-SFB-ZBB-NEXT:    beqz a2, .LBB0_2
+; RV64I-SFB-ZBB-NEXT:  # %bb.1: # %entry
+; RV64I-SFB-ZBB-NEXT:    max a1, a0, a3
+; RV64I-SFB-ZBB-NEXT:  .LBB0_2: # %entry
+; RV64I-SFB-ZBB-NEXT:    mv a0, a1
+; RV64I-SFB-ZBB-NEXT:    ret
+entry:
+  %res = call i32 @llvm.smax.i32(i32 %a, i32 %y)
+  %sel = select i1 %x, i32 %res, i32 %b
+  ret i32 %sel
+}
+
+define i32 @select_example_smin(i32 %a, i32 %b, i1 zeroext %x, i32 %y) {
+; RV32I-NOZBB-LABEL: select_example_smin:
+; RV32I-NOZBB:       # %bb.0: # %entry
+; RV32I-NOZBB-NEXT:    bge a0, a3, .LBB1_3
+; RV32I-NOZBB-NEXT:  # %bb.1: # %entry
+; RV32I-NOZBB-NEXT:    beqz a2, .LBB1_4
+; RV32I-NOZBB-NEXT:  .LBB1_2: # %entry
+; RV32I-NOZBB-NEXT:    ret
+; RV32I-NOZBB-NEXT:  .LBB1_3: # %entry
+; RV32I-NOZBB-NEXT:    mv a0, a3
+; RV32I-NOZBB-NEXT:    bnez a2, .LBB1_2
+; RV32I-NOZBB-NEXT:  .LBB1_4: # %entry
+; RV32I-NOZBB-NEXT:    mv a0, a1
+; RV32I-NOZBB-NEXT:    ret
+;
+; RV64I-NOZBB-LABEL: select_example_smin:
+; RV64I-NOZBB:       # %bb.0: # %entry
+; RV64I-NOZBB-NEXT:    sext.w a3, a3
+; RV64I-NOZBB-NEXT:    sext.w a0, a0
+; RV64I-NOZBB-NEXT:    bge a0, a3, .LBB1_3
+; RV64I-NOZBB-NEXT:  # %bb.1: # %entry
+; RV64I-NOZBB-NEXT:    beqz a2, .LBB1_4
+; RV64I-NOZBB-NEXT:  .LBB1_2: # %entry
+; RV64I-NOZBB-NEXT:    ret
+; RV64I-NOZBB-NEXT:  .LBB1_3: # %entry
+; RV64I-NOZBB-NEXT:    mv a0, a3
+; RV64I-NOZBB-NEXT:    bnez a2, .LBB1_2
+; RV64I-NOZBB-NEXT:  .LBB1_4: # %entry
+; RV64I-NOZBB-NEXT:    mv a0, a1
+; RV64I-NOZBB-NEXT:    ret
+;
+; RV32I-SFB-ZBB-LABEL: select_example_smin:
+; RV32I-SFB-ZBB:       # %bb.0: # %entry
+; RV32I-SFB-ZBB-NEXT:    beqz a2, .LBB1_2
+; RV32I-SFB-ZBB-NEXT:  # %bb.1: # %entry
+; RV32I-SFB-ZBB-NEXT:    min a1, a0, a3
+; RV32I-SFB-ZBB-NEXT:  .LBB1_2: # %entry
+; RV32I-SFB-ZBB-NEXT:    mv a0, a1
+; RV32I-SFB-ZBB-NEXT:    ret
+;
+; RV64I-SFB-ZBB-LABEL: select_example_smin:
+; RV64I-SFB-ZBB:       # %bb.0: # %entry
+; RV64I-SFB-ZBB-NEXT:    sext.w a3, a3
+; RV64I-SFB-ZBB-NEXT:    sext.w a0, a0
+; RV64I-SFB-ZBB-NEXT:    beqz a2, .LBB1_2
+; RV64I-SFB-ZBB-NEXT:  # %bb.1: # %entry
+; RV64I-SFB-ZBB-NEXT:    min a1, a0, a3
+; RV64I-SFB-ZBB-NEXT:  .LBB1_2: # %entry
+; RV64I-SFB-ZBB-NEXT:    mv a0, a1
+; RV64I-SFB-ZBB-NEXT:    ret
+entry:
+  %res = call i32 @llvm.smin.i32(i32 %a, i32 %y)
+  %sel = select i1 %x, i32 %res, i32 %b
+  ret i32 %sel
+}
+
+define i32 @select_example_umax(i32 %a, i32 %b, i1 zeroext %x, i32 %y) {
+; RV32I-NOZBB-LABEL: select_example_umax:
+; RV32I-NOZBB:       # %bb.0: # %entry
+; RV32I-NOZBB-NEXT:    bgeu a3, a0, .LBB2_3
+; RV32I-NOZBB-NEXT:  # %bb.1: # %entry
+; RV32I-NOZBB-NEXT:    beqz a2, .LBB2_4
+; RV32I-NOZBB-NEXT:  .LBB2_2: # %entry
+; RV32I-NOZBB-NEXT:    ret
+; RV32I-NOZBB-NEXT:  .LBB2_3: # %entry
+; RV32I-NOZBB-NEXT:    mv a0, a3
+; RV32I-NOZBB-NEXT:    bnez a2, .LBB2_2
+; RV32I-NOZBB-NEXT:  .LBB2_4: # %entry
+; RV32I-NOZBB-NEXT:    mv a0, a1
+; RV32I-NOZBB-NEXT:    ret
+;
+; RV64I-NOZBB-LABEL: select_example_umax:
+; RV64I-NOZBB:       # %bb.0: # %entry
+; RV64I-NOZBB-NEXT:    sext.w a0, a0
+; RV64I-NOZBB-NEXT:    sext.w a3, a3
+; RV64I-NOZBB-NEXT:    bgeu a3, a0, .LBB2_3
+; RV64I-NOZBB-NEXT:  # %bb.1: # %entry
+; RV64I-NOZBB-NEXT:    beqz a2, .LBB2_4
+; RV64I-NOZBB-NEXT:  .LBB2_2: # %entry
+; RV64I-NOZBB-NEXT:    ret
+; RV64I-NOZBB-NEXT:  .LBB2_3: # %entry
+; RV64I-NOZBB-NEXT:    mv a0, a3
+; RV64I-NOZBB-NEXT:    bnez a2, .LBB2_2
+; RV64I-NOZBB-NEXT:  .LBB2_4: # %entry
+; RV64I-NOZBB-NEXT:    mv a0, a1
+; RV64I-NOZBB-NEXT:    ret
+;
+; RV32I-SFB-ZBB-LABEL: select_example_umax:
+; RV32I-SFB-ZBB:       # %bb.0: # %entry
+; RV32I-SFB-ZBB-NEXT:    beqz a2, .LBB2_2
+; RV32I-SFB-ZBB-NEXT:  # %bb.1: # %entry
+; RV32I-SFB-ZBB-NEXT:    maxu a1, a0, a3
+; RV32I-SFB-ZBB-NEXT:  .LBB2_2: # %entry
+; RV32I-SFB-ZBB-NEXT:    mv a0, a1
+; RV32I-SFB-ZBB-NEXT:    ret
+;
+; RV64I-SFB-ZBB-LABEL: select_example_umax:
+; RV64I-SFB-ZBB:       # %bb.0: # %entry
+; RV64I-SFB-ZBB-NEXT:    sext.w a3, a3
+; RV64I-SFB-ZBB-NEXT:    sext.w a0, a0
+; RV64I-SFB-ZBB-NEXT:    beqz a2, .LBB2_2
+; RV64I-SFB-ZBB-NEXT:  # %bb.1: # %entry
+; RV64I-SFB-ZBB-NEXT:    maxu a1, a0, a3
+; RV64I-SFB-ZBB-NEXT:  .LBB2_2: # %entry
+; RV64I-SFB-ZBB-NEXT:    mv a0, a1
+; RV64I-SFB-ZBB-NEXT:    ret
+entry:
+  %res = call i32 @llvm.umax.i32(i32 %a, i32 %y)
+  %sel = select i1 %x, i32 %res, i32 %b
+  ret i32 %sel
+}
+
+define i32 @select_example_umin(i32 %a, i32 %b, i1 zeroext %x, i32 %y) {
+; RV32I-NOZBB-LABEL: select_example_umin:
+; RV32I-NOZBB:       # %bb.0: # %entry
+; RV32I-NOZBB-NEXT:    bgeu a0, a3, .LBB3_3
+; RV32I-NOZBB-NEXT:  # %bb.1: # %entry
+; RV32I-NOZBB-NEXT:    beqz a2, .LBB3_4
+; RV32I-NOZBB-NEXT:  .LBB3_2: # %entry
+; RV32I-NOZBB-NEXT:    ret
+; RV32I-NOZBB-NEXT:  .LBB3_3: # %entry
+; RV32I-NOZBB-NEXT:    mv a0, a3
+; RV32I-NOZBB-NEXT:    bnez a2, .LBB3_2
+; RV32I-NOZBB-NEXT:  .LBB3_4: # %entry
+; RV32I-NOZBB-NEXT:    mv a0, a1
+; RV32I-NOZBB-NEXT:    ret
+;
+; RV64I-NOZBB-LABEL: select_example_umin:
+; RV64I-NOZBB:       # %bb.0: # %entry
+; RV64I-NOZBB-NEXT:    sext.w a3, a3
+; RV64I-NOZBB-NEXT:    sext.w a0, a0
+; RV64I-NOZBB-NEXT:    bgeu a0, a3, .LBB3_3
+; RV64I-NOZBB-NEXT:  # %bb.1: # %entry
+; RV64I-NOZBB-NEXT:    beqz a2, .LBB3_4
+; RV64I-NOZBB-NEXT:  .LBB3_2: # %entry
+; RV64I-NOZBB-NEXT:    ret
+; RV64I-NOZBB-NEXT:  .LBB3_3: # %entry
+; RV64I-NOZBB-NEXT:    mv a0, a3
+; RV64I-NOZBB-NEXT:    bnez a2, .LBB3_2
+; RV64I-NOZBB-NEXT:  .LBB3_4: # %entry
+; RV64I-NOZBB-NEXT:    mv a0, a1
+; RV64I-NOZBB-NEXT:    ret
+;
+; RV32I-SFB-ZBB-LABEL: select_example_umin:
+; RV32I-SFB-ZBB:       # %bb.0: # %entry
+; RV32I-SFB-ZBB-NEXT:    beqz a2, .LBB3_2
+; RV32I-SFB-ZBB-NEXT:  # %bb.1: # %entry
+; RV32I-SFB-ZBB-NEXT:    minu a1, a0, a3
+; RV32I-SFB-ZBB-NEXT:  .LBB3_2: # %entry
+; RV32I-SFB-ZBB-NEXT:    mv a0, a1
+; RV32I-SFB-ZBB-NEXT:    ret
+;
+; RV64I-SFB-ZBB-LABEL: select_example_umin:
+; RV64I-SFB-ZBB:       # %bb.0: # %entry
+; RV64I-SFB-ZBB-NEXT:    sext.w a3, a3
+; RV64I-SFB-ZBB-NEXT:    sext.w a0, a0
+; RV64I-SFB-ZBB-NEXT:    beqz a2, .LBB3_2
+; RV64I-SFB-ZBB-NEXT:  # %bb.1: # %entry
+; RV64I-SFB-ZBB-NEXT:    minu a1, a0, a3
+; RV64I-SFB-ZBB-NEXT:  .LBB3_2: # %entry
+; RV64I-SFB-ZBB-NEXT:    mv a0, a1
+; RV64I-SFB-ZBB-NEXT:    ret
+entry:
+  %res = call i32 @llvm.umin.i32(i32 %a, i32 %y)
+  %sel = select i1 %x, i32 %res, i32 %b
+  ret i32 %sel
+}
+
+define i64 @select_example_smax_1(i64 %a, i64 %b, i1 zeroext %x, i64 %y) {
+; RV32I-NOZBB-LABEL: select_example_smax_1:
+; RV32I-NOZBB:       # %bb.0: # %entry
+; RV32I-NOZBB-NEXT:    beq a1, a6, .LBB4_2
+; RV32I-NOZBB-NEXT:  # %bb.1: # %entry
+; RV32I-NOZBB-NEXT:    slt a7, a6, a1
+; RV32I-NOZBB-NEXT:    beqz a7, .LBB4_3
+; RV32I-NOZBB-NEXT:    j .LBB4_4
+; RV32I-NOZBB-NEXT:  .LBB4_2:
+; RV32I-NOZBB-NEXT:    sltu a7, a5, a0
+; RV32I-NOZBB-NEXT:    bnez a7, .LBB4_4
+; RV32I-NOZBB-NEXT:  .LBB4_3: # %entry
+; RV32I-NOZBB-NEXT:    mv a1, a6
+; RV32I-NOZBB-NEXT:    mv a0, a5
+; RV32I-NOZBB-NEXT:  .LBB4_4: # %entry
+; RV32I-NOZBB-NEXT:    beqz a4, .LBB4_6
+; RV32I-NOZBB-NEXT:  # %bb.5: # %entry
+; RV32I-NOZBB-NEXT:    ret
+; RV32I-NOZBB-NEXT:  .LBB4_6: # %entry
+; RV32I-NOZBB-NEXT:    mv a0, a2
+; RV32I-NOZBB-NEXT:    mv a1, a3
+; RV32I-NOZBB-NEXT:    ret
+;
+; RV64I-NOZBB-LABEL: select_example_smax_1:
+; RV64I-NOZBB:       # %bb.0: # %entry
+; RV64I-NOZBB-NEXT:    bge a3, a0, .LBB4_3
+; RV64I-NOZBB-NEXT:  # %bb.1: # %entry
+; RV64I-NOZBB-NEXT:    beqz a2, .LBB4_4
+; RV64I-NOZBB-NEXT:  .LBB4_2: # %entry
+; RV64I-NOZBB-NEXT:    ret
+; RV64I-NOZBB-NEXT:  .LBB4_3: # %entry
+; RV64I-NOZBB-NEXT:    mv a0, a3
+; RV64I-NOZBB-NEXT:    bnez a2, .LBB4_2
+; RV64I-NOZBB-NEXT:  .LBB4_4: # %entry
+; RV64I-NOZBB-NEXT:    mv a0, a1
+; RV64I-NOZBB-NEXT:    ret
+;
+; RV32I-SFB-ZBB-LABEL: select_example_smax_1:
+; RV32I-SFB-ZBB:       # %bb.0: # %entry
+; RV32I-SFB-ZBB-NEXT:    sltu a7, a5, a0
+; RV32I-SFB-ZBB-NEXT:    slt t0, a6, a1
+; RV32I-SFB-ZBB-NEXT:    bne a1, a6, .LBB4_2
+; RV32I-SFB-ZBB-NEXT:  # %bb.1: # %entry
+; RV32I-SFB-ZBB-NEXT:    mv t0, a7
+; RV32I-SFB-ZBB-NEXT:  .LBB4_2: # %entry
+; RV32I-SFB-ZBB-NEXT:    bnez t0, .LBB4_4
+; RV32I-SFB-ZBB-NEXT:  # %bb.3: # %entry
+; RV32I-SFB-ZBB-NEXT:    mv a1, a6
+; RV32I-SFB-ZBB-NEXT:  .LBB4_4: # %entry
+; RV32I-SFB-ZBB-NEXT:    bnez t0, .LBB4_6
+; RV32I-SFB-ZBB-NEXT:  # %bb.5: # %entry
+; RV32I-SFB-ZBB-NEXT:    mv a0, a5
+; RV32I-SFB-ZBB-NEXT:  .LBB4_6: # %entry
+; RV32I-SFB-ZBB-NEXT:    bnez a4, .LBB4_8
+; RV32I-SFB-ZBB-NEXT:  # %bb.7: # %entry
+; RV32I-SFB-ZBB-NEXT:    mv a0, a2
+; RV32I-SFB-ZBB-NEXT:  .LBB4_8: # %entry
+; RV32I-SFB-ZBB-NEXT:    bnez a4, .LBB4_10
+; RV32I-SFB-ZBB-NEXT:  # %bb.9: # %entry
+; RV32I-SFB-ZBB-NEXT:    mv a1, a3
+; RV32I-SFB-ZBB-NEXT:  .LBB4_10: # %entry
+; RV32I-SFB-ZBB-NEXT:    ret
+;
+; RV64I-SFB-ZBB-LABEL: select_example_smax_1:
+; RV64I-SFB-ZBB:       # %bb.0: # %entry
+; RV64I-SFB-ZBB-NEXT:    beqz a2, .LBB4_2
+; RV64I-SFB-ZBB-NEXT:  # %bb.1: # %entry
+; RV64I-SFB-ZBB-NEXT:    max a1, a0, a3
+; RV64I-SFB-ZBB-NEXT:  .LBB4_2: # %entry
+; RV64I-SFB-ZBB-NEXT:    mv a0, a1
+; RV64I-SFB-ZBB-NEXT:    ret
+entry:
+  %res = call i64 @llvm.smax.i64(i64 %a, i64 %y)
+  %sel = select i1 %x, i64 %res, i64 %b
+  ret i64 %sel
+}
+
+define i64 @select_example_smin_1(i64 %a, i64 %b, i1 zeroext %x, i64 %y) {
+; RV32I-NOZBB-LABEL: select_example_smin_1:
+; RV32I-NOZBB:       # %bb.0: # %entry
+; RV32I-NOZBB-NEXT:    beq a1, a6, .LBB5_2
+; RV32I-NOZBB-NEXT:  # %bb.1: # %entry
+; RV32I-NOZBB-NEXT:    slt a7, a1, a6
+; RV32I-NOZBB-NEXT:    beqz a7, .LBB5_3
+; RV32I-NOZBB-NEXT:    j .LBB5_4
+; RV32I-NOZBB-NEXT:  .LBB5_2:
+; RV32I-NOZBB-NEXT:    sltu a7, a0, a5
+; RV32I-NOZBB-NEXT:    bnez a7, .LBB5_4
+; RV32I-NOZBB-NEXT:  .LBB5_3: # %entry
+; RV32I-NOZBB-NEXT:    mv a1, a6
+; RV32I-NOZBB-NEXT:    mv a0, a5
+; RV32I-NOZBB-NEXT:  .LBB5_4: # %entry
+; RV32I-NOZBB-NEXT:    beqz a4, .LBB5_6
+; RV32I-NOZBB-NEXT:  # %bb.5: # %entry
+; RV32I-NOZBB-NEXT:    ret
+; RV32I-NOZBB-NEXT:  .LBB5_6: # %entry
+; RV32I-NOZBB-NEXT:    mv a0, a2
+; RV32I-NOZBB-NEXT:    mv a1, a3
+; RV32I-NOZBB-NEXT:    ret
+;
+; RV64I-NOZBB-LABEL: select_example_smin_1:
+; RV64I-NOZBB:       # %bb.0: # %entry
+; RV64I-NOZBB-NEXT:    bge a0, a3, .LBB5_3
+; RV64I-NOZBB-NEXT:  # %bb.1: # %entry
+; RV64I-NOZBB-NEXT:    beqz a2, .LBB5_4
+; RV64I-NOZBB-NEXT:  .LBB5_2: # %entry
+; RV64I-NOZBB-NEXT:    ret
+; RV64I-NOZBB-NEXT:  .LBB5_3: # %entry
+; RV64I-NOZBB-NEXT:    mv a0, a3
+; RV64I-NOZBB-NEXT:    bnez a2, .LBB5_2
+; RV64I-NOZBB-NEXT:  .LBB5_4: # %entry
+; RV64I-NOZBB-NEXT:    mv a0, a1
+; RV64I-NOZBB-NEXT:    ret
+;
+; RV32I-SFB-ZBB-LABEL: select_example_smin_1:
+; RV32I-SFB-ZBB:       # %bb.0: # %entry
+; RV32I-SFB-ZBB-NEXT:    sltu a7, a0, a5
+; RV32I-SFB-ZBB-NEXT:    slt t0, a1, a6
+; RV32I-SFB-ZBB-NEXT:    bne a1, a6, .LBB5_2
+; RV32I-SFB-ZBB-NEXT:  # %bb.1: # %entry
+; RV32I-SFB-ZBB-NEXT:    mv t0, a7
+; RV32I-SFB-ZBB-NEXT:  .LBB5_2: # %entry
+; RV32I-SFB-ZBB-NEXT:    bnez t0, .LBB5_4
+; RV32I-SFB-ZBB-NEXT:  # %bb.3: # %entry
+; RV32I-SFB-ZBB-NEXT:    mv a1, a6
+; RV32I-SFB-ZBB-NEXT:  .LBB5_4: # %entry
+; RV32I-SFB-ZBB-NEXT:    bnez t0, .LBB5_6
+; RV32I-SFB-ZBB-NEXT:  # %bb.5: # %entry
+; RV32I-SFB-ZBB-NEXT:    mv a0, a5
+; RV32I-SFB-ZBB-NEXT:  .LBB5_6: # %entry
+; RV32I-SFB-ZBB-NEXT:    bnez a4, .LBB5_8
+; RV32I-SFB-ZBB-NEXT:  # %bb.7: # %entry
+; RV32I-SFB-ZBB-NEXT:    mv a0, a2
+; RV32I-SFB-ZBB-NEXT:  .LBB5_8: # %entry
+; RV32I-SFB-ZBB-NEXT:    bnez a4, .LBB5_10
+; RV32I-SFB-ZBB-NEXT:  # %bb.9: # %entry
+; RV32I-SFB-ZBB-NEXT:    mv a1, a3
+; RV32I-SFB-ZBB-NEXT:  .LBB5_10: # %entry
+; RV32I-SFB-ZBB-NEXT:    ret
+;
+; RV64I-SFB-ZBB-LABEL: select_example_smin_1:
+; RV64I-SFB-ZBB:       # %bb.0: # %entry
+; RV64I-SFB-ZBB-NEXT:    beqz a2, .LBB5_2
+; RV64I-SFB-ZBB-NEXT:  # %bb.1: # %entry
+; RV64I-SFB-ZBB-NEXT:    min a1, a0, a3
+; RV64I-SFB-ZBB-NEXT:  .LBB5_2: # %entry
+; RV64I-SFB-ZBB-NEXT:    mv a0, a1
+; RV64I-SFB-ZBB-NEXT:    ret
+entry:
+  %res = call i64 @llvm.smin.i64(i64 %a, i64 %y)
+  %sel = select i1 %x, i64 %res, i64 %b
+  ret i64 %sel
+}
+
+define i64 @select_example_umax_1(i64 %a, i64 %b, i1 zeroext %x, i64 %y) {
+; RV32I-NOZBB-LABEL: select_example_umax_1:
+; RV32I-NOZBB:       # %bb.0: # %entry
+; RV32I-NOZBB-NEXT:    beq a1, a6, .LBB6_2
+; RV32I-NOZBB-NEXT:  # %bb.1: # %entry
+; RV32I-NOZBB-NEXT:    sltu a7, a6, a1
+; RV32I-NOZBB-NEXT:    beqz a7, .LBB6_3
+; RV32I-NOZBB-NEXT:    j .LBB6_4
+; RV32I-NOZBB-NEXT:  .LBB6_2:
+; RV32I-NOZBB-NEXT:    sltu a7, a5, a0
+; RV32I-NOZBB-NEXT:    bnez a7, .LBB6_4
+; RV32I-NOZBB-NEXT:  .LBB6_3: # %entry
+; RV32I-NOZBB-NEXT:    mv a1, a6
+; RV32I-NOZBB-NEXT:    mv a0, a5
+; RV32I-NOZBB-NEXT:  .LBB6_4: # %entry
+; RV32I-NOZBB-NEXT:    beqz a4, .LBB6_6
+; RV32I-NOZBB-NEXT:  # %bb.5: # %entry
+; RV32I-NOZBB-NEXT:    ret
+; RV32I-NOZBB-NEXT:  .LBB6_6: # %entry
+; RV32I-NOZBB-NEXT:    mv a0, a2
+; RV32I-NOZBB-NEXT:    mv a1, a3
+; RV32I-NOZBB-NEXT:    ret
+;
+; RV64I-NOZBB-LABEL: select_example_umax_1:
+; RV64I-NOZBB:       # %bb.0: # %entry
+; RV64I-NOZBB-NEXT:    bgeu a3, a0, .LBB6_3
+; RV64I-NOZBB-NEXT:  # %bb.1: # %entry
+; RV64I-NOZBB-NEXT:    beqz a2, .LBB6_4
+; RV64I-NOZBB-NEXT:  .LBB6_2: # %entry
+; RV64I-NOZBB-NEXT:    ret
+; RV64I-NOZBB-NEXT:  .LBB6_3: # %entry
+; RV64I-NOZBB-NEXT:    mv a0, a3
+; RV64I-NOZBB-NEXT:    bnez a2, .LBB6_2
+; RV64I-NOZBB-NEXT:  .LBB6_4: # %entry
+; RV64I-NOZBB-NEXT:    mv a0, a1
+; RV64I-NOZBB-NEXT:    ret
+;
+; RV32I-SFB-ZBB-LABEL: select_example_umax_1:
+; RV32I-SFB-ZBB:       # %bb.0: # %entry
+; RV32I-SFB-ZBB-NEXT:    sltu a7, a5, a0
+; RV32I-SFB-ZBB-NEXT:    sltu t0, a6, a1
+; RV32I-SFB-ZBB-NEXT:    bne a1, a6, .LBB6_2
+; RV32I-SFB-ZBB-NEXT:  # %bb.1: # %entry
+; RV32I-SFB-ZBB-NEXT:    mv t0, a7
+; RV32I-SFB-ZBB-NEXT:  .LBB6_2: # %entry
+; RV32I-SFB-ZBB-NEXT:    bnez t0, .LBB6_4
+; RV32I-SFB-ZBB-NEXT:  # %bb.3: # %entry
+; RV32I-SFB-ZBB-NEXT:    mv a1, a6
+; RV32I-SFB-ZBB-NEXT:  .LBB6_4: # %entry
+; RV32I-SFB-ZBB-NEXT:    bnez t0, .LBB6_6
+; RV32I-SFB-ZBB-NEXT:  # %bb.5: # %entry
+; RV32I-SFB-ZBB-NEXT:    mv a0, a5
+; RV32I-SFB-ZBB-NEXT:  .LBB6_6: # %entry
+; RV32I-SFB-ZBB-NEXT:    bnez a4, .LBB6_8
+; RV32I-SFB-ZBB-NEXT:  # %bb.7: # %entry
+; RV32I-SFB-ZBB-NEXT:    mv a0, a2
+; RV32I-SFB-ZBB-NEXT:  .LBB6_8: # %entry
+; RV32I-SFB-ZBB-NEXT:    bnez a4, .LBB6_10
+; RV32I-SFB-ZBB-NEXT:  # %bb.9: # %entry
+; RV32I-SFB-ZBB-NEXT:    mv a1, a3
+; RV32I-SFB-ZBB-NEXT:  .LBB6_10: # %entry
+; RV32I-SFB-ZBB-NEXT:    ret
+;
+; RV64I-SFB-ZBB-LABEL: select_example_umax_1:
+; RV64I-SFB-ZBB:       # %bb.0: # %entry
+; RV64I-SFB-ZBB-NEXT:    beqz a2, .LBB6_2
+; RV64I-SFB-ZBB-NEXT:  # %bb.1: # %entry
+; RV64I-SFB-ZBB-NEXT:    maxu a1, a0, a3
+; RV64I-SFB-ZBB-NEXT:  .LBB6_2: # %entry
+; RV64I-SFB-ZBB-NEXT:    mv a0, a1
+; RV64I-SFB-ZBB-NEXT:    ret
+entry:
+  %res = call i64 @llvm.umax.i64(i64 %a, i64 %y)
+  %sel = select i1 %x, i64 %res, i64 %b
+  ret i64 %sel
+}
+
+define i64 @select_example_umin_1(i64 %a, i64 %b, i1 zeroext %x, i64 %y) {
+; RV32I-NOZBB-LABEL: select_example_umin_1:
+; RV32I-NOZBB:       # %bb.0: # %entry
+; RV32I-NOZBB-NEXT:    beq a1, a6, .LBB7_2
+; RV32I-NOZBB-NEXT:  # %bb.1: # %entry
+; RV32I-NOZBB-NEXT:    sltu a7, a1, a6
+; RV32I-NOZBB-NEXT:    beqz a7, .LBB7_3
+; RV32I-NOZBB-NEXT:    j .LBB7_4
+; RV32I-NOZBB-NEXT:  .LBB7_2:
+; RV32I-NOZBB-NEXT:    sltu a7, a0, a5
+; RV32I-NOZBB-NEXT:    bnez a7, .LBB7_4
+; RV32I-NOZBB-NEXT:...
[truncated]

@github-actions

github-actions bot commented Oct 21, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

Change-Id: I14c0493b53c643c96ee5cb0ce3a8531f9d33e207
case RISCV::PseudoCCAND: NewOpc = RISCV::AND; break;
case RISCV::PseudoCCOR: NewOpc = RISCV::OR; break;
case RISCV::PseudoCCXOR: NewOpc = RISCV::XOR; break;
case RISCV::PseudoCCMAX:
Contributor

It would be good to have a consistent coding style. Were the changes here made by clang-format?

Contributor Author

Yes, the changes were made by clang-format. When I kept the changes consistent with the existing coding style here, the clang-format PR check failed.

Collaborator

We can add a clang-format off/on directive around the switch.

case RISCV::AND: return RISCV::PseudoCCAND; break;
case RISCV::OR: return RISCV::PseudoCCOR; break;
case RISCV::XOR: return RISCV::PseudoCCXOR; break;
case RISCV::MAX:
Contributor

Same here. Were the changes made by clang-format?

; RUN: llc < %s -mtriple=riscv32 -mattr=+zbb,+short-forward-branch-opt | \
; RUN: FileCheck %s --check-prefixes=RV32I-SFB-ZBB
; RUN: llc < %s -mtriple=riscv64 -mattr=+zbb,+short-forward-branch-opt | \
; RUN: FileCheck %s --check-prefixes=RV64I-SFB-ZBB
Contributor

Can you add run lines with just zbb enabled so that we can see the difference in code when SFB is enabled?
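A sketch of what such Zbb-only RUN lines might look like; the RV32I-ZBB/RV64I-ZBB check prefixes here are illustrative, not taken from the patch:

```llvm
; Hypothetical additional RUN lines with only Zbb enabled, so the
; SFB and non-SFB output can be compared side by side:
; RUN: llc < %s -mtriple=riscv32 -mattr=+zbb | FileCheck %s --check-prefixes=RV32I-ZBB
; RUN: llc < %s -mtriple=riscv64 -mattr=+zbb | FileCheck %s --check-prefixes=RV64I-ZBB
```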

@topperc
Collaborator

topperc commented Oct 21, 2025

SiFive cores do not support short forward branch for MIN/MAX.

case RISCV::AND: return RISCV::PseudoCCAND; break;
case RISCV::OR: return RISCV::PseudoCCOR; break;
case RISCV::XOR: return RISCV::PseudoCCXOR; break;
case RISCV::MAX:
Collaborator

We really shouldn't have an unreachable break;. I guess that's a mistake I made originally?

Collaborator

Patch to remove them #164481

@lenary
Member

lenary commented Oct 22, 2025

SiFive cores do not support short forward branch for MIN/MAX.

We did it this way to avoid introducing tune features that depend on other tune features, but I see there are more instruction differences than I expected. I had hoped that SiFive cores without this fusion might also lack Zbb, but I should have checked more closely.

Our plan is to add another feature to enable these SFB cases, which would be required. I was originally thinking of naming it TuneShortForwardBranchOptZbb, but I see there are already Zbb instructions supported by SFB that you presumably don't want disabled if you don't support SFB for min/max (SFB for ANDN, ORN, XNOR).

The hypothetical TuneShortForwardBranchOptZbb would require/imply TuneShortForwardBranchOpt, and the new feature would be required in order to use SFB for min/max.

The idea behind this naming scheme is it should be obvious for extending to other extensions, as I think we will also want it for a TuneShortForwardBranchOptZmmul (or similar), as you've said you don't fuse branches with MUL.

What are your thoughts?

I've asked Harsh to prepare this patch update (as well as other updates), but it would be good to hear you like this direction before we upload the next version.
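A hypothetical sketch of what such a feature definition could look like in RISCVFeatures.td; the feature name and description are taken from the discussion above, not from an actual patch:

```tablegen
// Hypothetical tune feature implying the existing SFB feature; only a
// sketch of the proposal, not a definitive definition.
def TuneShortForwardBranchOptZbb
    : SubtargetFeature<"short-forward-branch-opt-zbb",
                       "HasShortForwardBranchOptZbb", "true",
                       "Enable short forward branch optimization for Zbb min/max",
                       [TuneShortForwardBranchOpt]>;
```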

@topperc
Collaborator

topperc commented Oct 22, 2025

> The hypothetical TuneShortForwardBranchOptZbb would require/imply TuneShortForwardBranchOpt, and the former would be required to SFB min/max. What are your thoughts?

I'm not sure what a good name is. The instructions that aren't supported are the ones that are only available on PipeB in RISCVSchedSiFive7.td. That includes div/rem, mul, clz, ctz, ctpop, rotate, shXadd, orc.b, bset(i), bclr(i), binv(i).

Maybe we should rename the existing flag to TuneShortForwardBranchOptIALU? IALU being the scheduler class that covers the supported instructions?
