Skip to content

[Codegen] Add pass for interpreting transform specs on lowering configs #20408

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Apr 14, 2025

Conversation

qedawkins
Copy link
Contributor

The existing mechanism for externally authored codegen strategies via transform dialect overrides almost the entire codegen pipeline with the transform dialect strategy. This does not scale as it typically either
a) Is no better than a standard pass pipeline, in which case there is no
reason to use transform dialect.
or b) Is specialized for a single case which railroads future graph
level/fusion efforts.

A more scalable middle ground is to thread through transform strategies that apply specifically to certain operations and let the surrounding pipeline handle the rest. This patch adds a pass that walks a module and looks up + applies a transform strategy on any annotated op.

One concern with transform dialect passes has been dialect registration. The only dialect this pass requires is iree_codegen to load the strategy library. Otherwise the current assumption is that this will be used within a TargetBackend's pass pipeline, meaning each target backend adds the dialect that needs to be registered as a part of the surrounding pipeline. I'm not 100% how this will pan out with reproducers but there aren't any good alternatives (registering everything is a no-go).

Copy link
Contributor

@hanhanW hanhanW left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this kind of external pluggable support, I always find it useful if there is a sample in https://github.com/iree-org/iree/tree/main/samples/transform_dialect.

It'd be good if we can have an example in-tree, so we don't break it when we bring up new features. E.g., I hit few failures in custom_dispatch samples when I developed the encoding specialization pass.

Comment on lines +95 to +132
};
} // namespace
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: balanced whitespace

Suggested change
};
} // namespace
};
} // namespace

I think we want either

namespace {
class ... {
};
} // namespace

or

namespace {

class ... {
};

} // namespace

The other style nit is that we want to keep unnamed namespace as small as possible. If it were me, I'd move the implementation out of the unnamed namespace. I'm an advocate that we should follow the style in the codebase, and I recently start following the style. The LLVM style guide also recommends it.

My rule is -- if the code snippet is small, e.g., simple constructors, keep it in the class. If it is a reasonable chunk, implement the method outside the definition.

Honestly, there are many codes that do not follow the style, so I think it is optional. If you agree with the point, it's better to make the change.

https://llvm.org/docs/CodingStandards.html#restrict-visibility

The problem with anonymous namespaces is that they naturally want to encourage indentation of their body, and they reduce locality of reference: if you see a random function definition in a C++ file, it is easy to see if it is marked static, but seeing if it is in an anonymous namespace requires scanning a big chunk of the file.

So it can be something like:

namespace {
class LoweringConfigInterpreterPass final
    : public impl::LoweringConfigInterpreterPassBase<
          LoweringConfigInterpreterPass> {
public:
  using impl::LoweringConfigInterpreterPassBase<
      LoweringConfigInterpreterPass>::LoweringConfigInterpreterPassBase;
  void runOnOperation() override ;
};
} // namespace

void LoweringConfigInterpreterPass::runOnOperation {
  // ...
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been doing this most of the time too, but I got lazy when copying the pass implementation from TransformDialectInterpreter :P

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get the benefit of this:

The other style nit is that we want to keep unnamed namespace as small as possible. If it were me, I'd move the implementation out of the unnamed namespace. I'm an advocate that we should follow the style in the codebase, and I recently start following the style. The LLVM style guide also recommends it.

IMO the better option is to have the non-anonymous portion as small as possible, despite what the llvm guide says. I actually wanted to open an RFC on this for a while but haven't found the time.

Copy link
Member

@kuhar kuhar Mar 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me having the runOnOperation out of line seems completely unnecessary and only makes me scroll more.
-2 on this

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with anonymous namespaces is that they naturally want to encourage indentation of their body, and they reduce locality of reference: if you see a random function definition in a C++ file, it is easy to see if it is marked static, but seeing if it is in an anonymous namespace requires scanning a big chunk of the file.

This is trivially solvable by marking functions inside anonymous namespace static

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Id suggest not going back and forth on such style issues (for example, I follow what Hanhan suggest cause I hate the extra tabbing that making the definition inline is and also its just nice to have declaration and definition separate), but thats a preference.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(This really needs to be discussed in other thread/post, though. However, I still want to put a data-point here because discord discussion could be mis-tracked easily (to me).)

@kuhar sent a https://abseil.io/tips/186 link on discord, and I think it is mostly talking about how we make decision between static methods and private methods.

I want to provide another data point that I did not see such discussion before. I think moving static local methods to an unnamed namespace makes review harder when people refactor code. Because the definition can not be in an anonymous namespace.

E.g.,

Header:

// myfunc.h
#ifndef MYFUNC_H
#define MYFUNC_H

void foo();  // Declaration of the function

#endif

Implementation:

// myfunc.cpp
#include "myfunc.h"
#include <iostream>

namespace {
void bar() {
    std::cout << "bar" << std::endl;
}
}  // namespace

void foo() {
    bar();
}

If we need to expose bar() for new needs, you'll need to move the whole bar() implementation out of namespace. People could move it anywhere or split it into two unnamed namespaces. They both look bad to me.

If we follow the static approach, the diff is just removing the static keyword in the cpp file.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would still put static in front of functions even in anonymous namespaces to make it immediately obvious it's private. And arguably, having all public functions at the bottom makes it easy to tell apart the private implementation (in the anonymous namespace) vs. public API at the bottom.

With you typical mlir code, you will find that the vast majority of the implementation is private with a handful of public functions to populate rewrite patterns etc.

My issue with reducing the scope of anonymous namespaces is that it introduces duplication and noise by opening/closing the anonymous namespace.

For example, what I suggest (and what this PR does):

#include "myfunc.h"
#include <iostream>

namespace {
static void bar() {
    std::cout << "bar" << std::endl;
}

struct MyRewritePattern1 {
  void matchAndRewrite() {
   ...
  }
}

static void baz() {
    std::cout << "baz" << std::endl;
}

struct MyRewritePattern2 {
  void matchAndRewrite() {
   ...
  }
}
}  // namespace

// --- PUBLIC API --- //

void foo() {
    // use private impl
}

What you/llvm suggest:

#include "myfunc.h"
#include <iostream>

static void bar() {
    std::cout << "bar" << std::endl;
}

namespace {
struct MyRewritePattern1 {
  void matchAndRewrite();
}
} // namespace

void MyRewritePattern1::matchAndRewrite() {

}

static void baz() {
    std::cout << "baz" << std::endl;
}

namespace {
struct MyRewritePattern2 {
  void matchAndRewrite();
}
}  // namespace

void MyRewritePattern2::matchAndRewrite() {
 ...
}

void foo() {
    // use private impl
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that with this single large anonymous namespace you can still put the definition out-of-line to reduce indentation if you choose to.

@qedawkins
Copy link
Contributor Author

For this kind of external pluggable support, I always find it useful if there is a sample in https://github.com/iree-org/iree/tree/main/samples/transform_dialect.

I was thinking of sending that as a separate PR once I'm done iterating on pingpong. That might be a little heavyweight for a sample though so I can add something simpler, sounds like a good idea I'll do that.

Copy link
Member

@kuhar kuhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall

Comment on lines +95 to +132
};
} // namespace
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get the benefit of this:

The other style nit is that we want to keep unnamed namespace as small as possible. If it were me, I'd move the implementation out of the unnamed namespace. I'm an advocate that we should follow the style in the codebase, and I recently start following the style. The LLVM style guide also recommends it.

IMO the better option is to have the non-anonymous portion as small as possible, despite what the llvm guide says. I actually wanted to open an RFC on this for a while but haven't found the time.

Copy link
Contributor

@MaheshRavishankar MaheshRavishankar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting approach. Have a few questions. Overall looks like a direction to start with but there are still some open questions.

@qedawkins qedawkins force-pushed the lowering_config_interp branch from f0bd6e9 to b53e738 Compare April 3, 2025 19:25
@qedawkins qedawkins requested a review from benvanik as a code owner April 3, 2025 19:25
@qedawkins qedawkins force-pushed the lowering_config_interp branch 3 times, most recently from 73dacd4 to 987927a Compare April 7, 2025 14:29
@qedawkins
Copy link
Contributor Author

I went through a few iterations for how to import external symbols and ended up requiring the user to do it. In the future we're going to want to be able to support multiple functions in codegen but that's not close to working. For now this is the best I can come up with.

I addressed all comments except the namespace one because I'm unsure what to do there. This is ready for another round of reviews!

Copy link
Contributor

@Groverkss Groverkss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, this is cool! I'm going to let others approve since they had commenst before

Copy link
Member

@kuhar kuhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, forgot to hit 'submit review' yesterday.
Looks good overall.

@qedawkins qedawkins requested review from Groverkss and kuhar April 14, 2025 14:21
@qedawkins qedawkins enabled auto-merge (squash) April 14, 2025 22:13
The existing mechanism for externally authored codegen strategies via
transform dialect overrides almost the entire codegen pipeline with the
transform dialect strategy. This does not scale as it typically either
   a) Is no better than a standard pass pipeline, in which case there is no
      reason to use transform dialect.
or b) Is specialized for a single case which railroads future graph
      level/fusion efforts.

A more scalable middle ground is to thread through transform strategies
that apply specifically to certain operations and let the surrounding
pipeline handle the rest. This patch adds a pass that walks a module and
looks up + applies a transform strategy on any annotated op.

One concern with transform dialect passes has been dialect registration.
The only dialect this pass *requires* is iree_codegen to load the
strategy library. Otherwise the current assumption is that this will be
used within a TargetBackend's pass pipeline, meaning each target backend
adds the dialect that needs to be registered as a part of the
surrounding pipeline. I'm not 100% how this will pan out with
reproducers but there aren't any good alternatives (registering
everything is a no-go).
@qedawkins qedawkins force-pushed the lowering_config_interp branch from b78f12b to 8680982 Compare April 14, 2025 22:30
@qedawkins qedawkins merged commit 53e15a4 into iree-org:main Apr 14, 2025
41 checks passed
@qedawkins qedawkins deleted the lowering_config_interp branch April 14, 2025 23:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants