[AIEX] Iterative feedback-driven post-pipeliner #359

gbossu · 2025-02-17T14:13:46Z

This uses feedback from the previous iteration of a strategy to tweak/tighten the [Earliest, Latest) range of instructions until a solution is found.

This is needed to reach the optimal II for Conv2D_bfp16 on AIE2p.
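
For readers unfamiliar with the idea, here is a minimal, self-contained sketch of a feedback-driven retry loop. It is not the code from this PR: `Range`, `findFirstConflict`, and `scheduleIteratively` are hypothetical names, the "strategy" is a toy that places each instruction exactly at its Earliest cycle on a single resource, and raising Earliest by one cycle is an assumed tightening rule.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <set>
#include <vector>

// Hypothetical stand-in for an instruction's scheduling window.
struct Range {
  int Earliest; // first allowed cycle (inclusive), assumed >= 0
  int Latest;   // first disallowed cycle (exclusive)
};

// Toy strategy (assumption): every instruction is placed at its Earliest
// cycle, and a single resource allows one instruction per modulo slot.
// Returns the index of the first instruction that collides, if any.
std::optional<std::size_t> findFirstConflict(const std::vector<Range> &Ranges,
                                             int II) {
  std::set<int> BusySlots;
  for (std::size_t I = 0; I < Ranges.size(); ++I) {
    int Slot = Ranges[I].Earliest % II;
    if (!BusySlots.insert(Slot).second)
      return I; // slot already taken: report the problematic instruction
  }
  return std::nullopt;
}

// Re-run the strategy, tightening the range of the problematic instruction
// after each failed attempt, until it converges or a range becomes empty.
bool scheduleIteratively(std::vector<Range> &Ranges, int II,
                         int MaxIterations) {
  assert(II > 0 && "initiation interval must be positive");
  for (int Iter = 0; Iter < MaxIterations; ++Iter) {
    std::optional<std::size_t> Bad = findFirstConflict(Ranges, II);
    if (!Bad)
      return true; // converged
    Range &R = Ranges[*Bad];
    ++R.Earliest; // assumed tightening rule: push Earliest by one cycle
    if (R.Earliest >= R.Latest)
      return false; // [Earliest, Latest) is empty: give up for this II
  }
  return false; // iteration budget exhausted
}
```

With two instructions in `[0, 4)` and `II = 2`, the second attempt succeeds after the second instruction is pushed to cycle 1. With three such instructions the loop gives up, since only two modulo slots exist.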

martien-de-jong · 2025-02-18T09:32:25Z

llvm/lib/Target/AIE/AIE2Subtarget.h

@@ -59,6 +59,9 @@ class AIE2Subtarget : public AIE2GenSubtargetInfo, public AIEBaseSubtarget {
                StringRef FS, StringRef ABIName, const TargetMachine &TM);

  bool enableMachineScheduler() const override { return true; }
+  bool enableMachinePipeliner() const override {
+    return AIEBaseSubtarget::enableMachinePipeliner();
+  }


CHECK: we just disable the pre-pipeliner, not the prescheduler. And 'forcing' assumes infinite willingness on the part of the postpipeliner.

... but the prescheduler follows the pre-pipeliner

Correct, I'll add a small comment

martien-de-jong · 2025-02-18T09:59:53Z

llvm/lib/Target/AIE/AIEPostPipeliner.cpp

+    auto [It, Inserted] = UniqueAncestors.insert(P);
+    if (Inserted) {
+      Slots += Pred.Slots;
+      Count++;


Nit: Count could be replaced with UniqueAncestors.size()
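
A small sketch of what the nit suggests, assuming a standard set in place of whatever container the PR uses. `Node`, `AncestorSummary`, and `summarizeUniqueAncestors` are hypothetical names for illustration only; the point is that the separate counter becomes redundant once the set itself tracks uniqueness.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical node type: only the slot usage matters for this sketch.
struct Node {
  int Id;
  int Slots;
};

struct AncestorSummary {
  int Slots = 0;
  std::size_t Count = 0;
};

// Accumulate slot usage over unique predecessors only. The same node may
// appear several times in Preds; the set filters out the repeats.
AncestorSummary summarizeUniqueAncestors(const std::vector<const Node *> &Preds) {
  std::set<const Node *> UniqueAncestors;
  AncestorSummary Summary;
  for (const Node *P : Preds) {
    assert(P && "predecessor must not be null");
    if (UniqueAncestors.insert(P).second)
      Summary.Slots += P->Slots;
  }
  // No separate counter needed: the set size is the number of unique nodes.
  Summary.Count = UniqueAncestors.size();
  return Summary;
}
```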

martien-de-jong · 2025-02-18T10:51:07Z

llvm/lib/Target/AIE/AIEPostPipeliner.cpp

@@ -633,6 +633,16 @@ bool PostPipeliner::scheduleOtherIterations(PostPipelinerStrategy &Strategy) {
  return true;
 }

+int getMinOutputLat(ArrayRef<SDep> Nodes) {


nit: More descriptive name for Nodes. SuccDeps?

I mean, this is a general helper that just returns the minimum output latency out of the given edges. So I think it makes sense to keep the parameter generic as well. I can maybe rename to Edges?

Ok, makes sense, Edges or Deps would probably not have triggered my comment.
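
To illustrate the kind of generic helper being discussed, here is a hedged sketch with the parameter named `Edges`. It does not use LLVM's `SDep`; `DepKind` and `Edge` are simplified stand-ins, and returning an empty optional when no output edge exists is an assumption, since the PR's behaviour for that case is not shown in this thread.

```cpp
#include <algorithm>
#include <cassert>
#include <optional>
#include <vector>

// Simplified stand-ins for dependence edges (not LLVM's SDep).
enum class DepKind { Data, Anti, Output, Order };

struct Edge {
  DepKind Kind;
  int Latency;
};

// Minimum latency over the output-dependence edges in Edges.
// Assumption: an empty optional signals that there is no output edge.
std::optional<int> getMinOutputLat(const std::vector<Edge> &Edges) {
  std::optional<int> Min;
  for (const Edge &E : Edges) {
    assert(E.Latency >= 0 && "latencies are non-negative in this sketch");
    if (E.Kind != DepKind::Output)
      continue; // only output (WAW) edges are of interest
    Min = Min ? std::min(*Min, E.Latency) : E.Latency;
  }
  return Min;
}
```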

martien-de-jong · 2025-02-18T11:04:42Z

llvm/lib/Target/AIE/AIEPostPipeliner.cpp

    {1, true, false, HeuristicRuns, {Prio::Critical, Prio::LCDLatest}},
+    {1, true, false, HeuristicRuns, {Prio::Liveness, Prio::Latest}},
+    {1, true, false, HeuristicRuns, {Prio::Latest, Prio::Liveness}},
    // Bottom-up strategies
    {0, false, false, 2, {Prio::Critical, Prio::LCDLatest}},
    {1, false, false, 2, {Prio::Critical, Prio::LCDLatest}},


TODO: we should probably weed and sort with respect to effectiveness at some point.
In particular, I hope that just NodeNum would be one of the less effective ones and should be moved down. Also, Critical + Latest might cover all of just Critical.

Yes, one of my plans would be to add "optimization remarks" for whatever strategy was picked. Then we can derive what works better.

I could also maybe have a mode that runs all of the heuristics for a given II, even after one has succeeded. The point would be to find the one that converges faster.

(In a future PR, maybe 😄)

Yeah, the latter would be nice. It would list all heuristics that found the best II. Totalling that over a number of representative benchmarks would give a good score.
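
A rough sketch of the scoring idea, under the assumption that each loop yields one result per heuristic. `StrategyResult` and `tallyBestStrategies` are hypothetical names, and the heuristics are identified by plain strings here rather than by the strategy table in the PR.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical record: the II a heuristic reached on one loop, if any.
struct StrategyResult {
  std::string Name;
  std::optional<int> II; // empty when the heuristic found no schedule
};

// For one loop, credit every heuristic that reached the best (lowest) II.
// Calling this for each loop of a benchmark set accumulates a score.
void tallyBestStrategies(const std::vector<StrategyResult> &Results,
                         std::map<std::string, int> &Score) {
  std::optional<int> BestII;
  for (const StrategyResult &R : Results) {
    assert((!R.II || *R.II > 0) && "a found II must be positive");
    if (R.II && (!BestII || *R.II < *BestII))
      BestII = R.II;
  }
  if (!BestII)
    return; // no heuristic succeeded on this loop
  for (const StrategyResult &R : Results)
    if (R.II == BestII)
      ++Score[R.Name];
}
```

Heuristics that never appear in `Score` after a representative run would be candidates for removal or for moving down the list.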

martien-de-jong

I see no real problems. Shout when you have finalized this and want a formal approval.

F-Stuckmann · 2025-02-18T14:08:41Z

llvm/lib/Target/AIE/AIEMachineScheduler.cpp

-           AIELoopUtils::getPipelinerDisabled(*Block);
+    if (!Block)
+      return false;
+    bool PrePipelinerDisabled =


nit: const bool PrePipelinerDisabled

F-Stuckmann · 2025-02-18T14:14:34Z

llvm/test/CodeGen/AIE/aie2p/end-to-end/conv2d_bfp16_convert.ll

+!1 = !{i32 2, !"Debug Info Version", i32 3}
+!2 = !{i32 1, !"wchar_size", i32 4}
+!4 = !{!5, !6, i64 4}
+!5 = !{!"_ZTS13BfToBfpParams", !6, i64 0, !6, i64 4, !6, i64 8, !6, i64 12}


nit: update function name

F-Stuckmann · 2025-02-18T14:15:33Z

llvm/test/CodeGen/AIE/aie2p/end-to-end/conv2d_bfp16_kernel_red.ll

+!1 = !{i32 2, !"Debug Info Version", i32 3}
+!2 = !{i32 1, !"wchar_size", i32 4}
+!4 = !{!5}
+!5 = distinct !{!5, !6, !"_Z14conv2d_genericILh1EL5act_t0ELb0ELb0EEvPu6__bf16S1_S1_S1_R27conv2d_bf16_internal_params10out_mode_t: %input"}


nit: update function name

F-Stuckmann · 2025-02-18T14:21:22Z

llvm/test/CodeGen/AIE/aie2/schedule/postpipeliner/conv2d_bf16-2.mir

+
+
+# derived from conv2d_bf16_0
+# Same allocation


CHECK: the difference between conv2d_f16.mir and this file seems to be that the WAW dependencies now have "renamed" and "killed" attributes. Thus we don't have to cycle through our pointers, correct?
Maybe add this to the comment

I think this example represents whatever the LRU register re-allocator gave us. I'll update the comment. I don't even think we can reach the optimal II here.

F-Stuckmann · 2025-02-18T14:41:29Z

llvm/test/CodeGen/AIE/aie2/schedule/postpipeliner/round-memdep.mir

  ; CHECK-NEXT:    nop
  ; CHECK-NEXT:    nop
  ; CHECK-NEXT:    vlda.ups.s32.s8 cm0, s0, [p0], #32
  ; CHECK-NEXT:    vlda.ups.s32.s8 cm1, s0, [p0], #32
  ; CHECK-NEXT:    nop
-  ; CHECK-NEXT:    add.nc lc, r0, #-4
+  ; CHECK-NEXT:    add.nc lc, r0, #-5


CHECK: we reduced the number of pipeline stages here, correct?

We now have one more stage, and I am still trying to chase down why that happens.

andcarminati · 2025-02-19T11:28:26Z

llvm/lib/Target/AIE/AIE2Subtarget.h

@@ -59,6 +59,9 @@ class AIE2Subtarget : public AIE2GenSubtargetInfo, public AIEBaseSubtarget {
                StringRef FS, StringRef ABIName, const TargetMachine &TM);

  bool enableMachineScheduler() const override { return true; }
+  bool enableMachinePipeliner() const override {


Can we keep just the base implementation?

Unfortunately, AIEBaseSubtarget isn't actually a base class of AIE2Subtarget.

andcarminati · 2025-02-19T11:54:44Z

llvm/lib/Target/AIE/AIEPostPipeliner.cpp

@@ -196,44 +196,76 @@ int PostPipeliner::fit(MachineInstr *MI, int First, int Last, int II) {
  return -1;
 }

+void PostPipeliner::biasForLocalResourceContention(NodeInfo &NI,


It could be nice to have a description here, as the post pipeliner is getting bigger.

martien-de-jong

Yes, let's get experience with this.

gbossu requested review from abhinay-anubola, abnikant, andcarminati, F-Stuckmann, katerynamuts, khallouh, konstantinschwarz, martien-de-jong, niwinanto, SagarMaheshwari99 and stephenneuendorffer as code owners February 17, 2025 14:13

gbossu force-pushed the gaetan.postpipeliner.iterative branch from 165bb63 to 6af25cb on February 17, 2025 14:19

F-Stuckmann mentioned this pull request Feb 17, 2025

Stuckmann.multi.slot.pseudos #304

Closed

gbossu changed the title ~~[AIEX] Iterative feedback-drive post-pipeliner~~ [AIEX] Iterative feedback-driven post-pipeliner Feb 18, 2025

martien-de-jong reviewed Feb 18, 2025

martien-de-jong reviewed Feb 18, 2025

martien-de-jong reviewed Feb 18, 2025

F-Stuckmann reviewed Feb 18, 2025

gbossu force-pushed the gaetan.postpipeliner.iterative branch from 6af25cb to 584bec1 on February 19, 2025 11:01

andcarminati reviewed Feb 19, 2025

[AIEX] Option to force post-pipelining

d90fec7

gbossu and others added 8 commits February 19, 2025 15:51

[AIE2p] End-to-end conv2d_bfp16 tests

0eb7770

[AIEX] PostPipeliner: more Conv2D tests

d8c1dc7

These are bf16/bfp16 variants for AIE2 or AIE2p

[AIE] Pack NodeInfo in a class with additional information

4eb9d7d

This is to give full access to the Info array and its associated parameters.

Co-authored-by: Martien de Jong

[AIE] Push Earliest with local resource contention

21753fa

Dump intervals in ASCII art

[AIEX] PostPipeliner: Look at unique preds for resource contention

3750708

An SU can appear multiple times in the list of preds/succs.

[AIEX][NFC] PostPipeliner: more logging to identify difficult MIs

bafc7be

[AIEX] PostPipeliner: feedback-driven iterative scheduling

fea2cc9

When an iteration does not converge, a problematic instruction will be identified, and its [Earliest, Latest) range will be tightened.

[AIEX] PostPipeliner: More heuristics

d10386e

gbossu force-pushed the gaetan.postpipeliner.iterative branch from 584bec1 to d10386e on February 19, 2025 16:22

martien-de-jong approved these changes Feb 20, 2025

gbossu merged commit 0f7b86b into aie-public Feb 20, 2025
6 checks passed

gbossu deleted the gaetan.postpipeliner.iterative branch February 20, 2025 10:24
