[CIR] Upstream support for operator assign #145979


Merged: 2 commits, Jun 27, 2025
1 change: 1 addition & 0 deletions clang/include/clang/CIR/MissingFeatures.h
@@ -170,6 +170,7 @@ struct MissingFeatures {
static bool alignCXXRecordDecl() { return false; }
static bool armComputeVolatileBitfields() { return false; }
static bool asmLabelAttr() { return false; }
static bool assignMemcpyizer() { return false; }
static bool astVarDeclInterface() { return false; }
static bool attributeNoBuiltin() { return false; }
static bool bitfields() { return false; }
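For context, entries in MissingFeatures.h follow a simple convention: each is a constant-false predicate that is asserted at the code paths where the feature would matter, so unimplemented work stays visible. A minimal standalone sketch of the idiom, with MissingFeaturesSketch as a hypothetical stand-in for the real header:

```cpp
#include <cassert>

// Hypothetical reduction of the MissingFeatures idiom; the real class lives
// in clang/include/clang/CIR/MissingFeatures.h.
struct MissingFeaturesSketch {
  // Returns false until memcpy-based member copying is implemented.
  static bool assignMemcpyizer() { return false; }
};

int main() {
  // Mirrors the guard this patch adds in emitImplicitAssignmentOperatorBody.
  // The assert is a no-op while the method returns false; flipping the flag
  // to true makes every guarded site fail, pointing at code to update.
  assert(!MissingFeaturesSketch::assignMemcpyizer());
  return 0;
}
```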
37 changes: 14 additions & 23 deletions clang/lib/CIR/CodeGen/CIRGenCXXExpr.cpp
@@ -85,23 +85,24 @@ RValue CIRGenFunction::emitCXXMemberOrOperatorMemberCallExpr(
return RValue::get(nullptr);
}

bool trivialForCodegen =
md->isTrivial() || (md->isDefaulted() && md->getParent()->isUnion());
bool trivialAssignment =
trivialForCodegen &&
(md->isCopyAssignmentOperator() || md->isMoveAssignmentOperator()) &&
!md->getParent()->mayInsertExtraPadding();
(void)trivialAssignment;
// Note on trivial assignment
// --------------------------
// Classic codegen avoids generating the trivial copy/move assignment operator
// when it isn't necessary, choosing instead to just produce IR with an
// equivalent effect. We have chosen not to do that in CIR, instead emitting
// trivial copy/move assignment operators and allowing later transformations
// to optimize them away if appropriate.
Collaborator:

I like this FWIW, and I think we should do MORE of this if we can get away with it. The 'premature optimizations' in the FE are only there because LLVM-IR isn't expressive enough to make it clear it is legal. CIR should make sure it is expressive enough for these situations.


// C++17 demands that we evaluate the RHS of a (possibly-compound) assignment
// operator before the LHS.
CallArgList rtlArgStorage;
CallArgList *rtlArgs = nullptr;
if (auto *oce = dyn_cast<CXXOperatorCallExpr>(ce)) {
if (oce->isAssignmentOp()) {
cgm.errorNYI(
oce->getSourceRange(),
"emitCXXMemberOrOperatorMemberCallExpr: assignment operator");
rtlArgs = &rtlArgStorage;
emitCallArgs(*rtlArgs, md->getType()->castAs<FunctionProtoType>(),
drop_begin(ce->arguments(), 1), ce->getDirectCallee(),
/*ParamsToSkip*/ 0);
}
}

@@ -121,19 +122,9 @@ RValue CIRGenFunction::emitCXXMemberOrOperatorMemberCallExpr(
return RValue::get(nullptr);
}

if (trivialForCodegen) {
if (isa<CXXDestructorDecl>(md))
return RValue::get(nullptr);

if (trivialAssignment) {
cgm.errorNYI(ce->getSourceRange(),
"emitCXXMemberOrOperatorMemberCallExpr: trivial assignment");
return RValue::get(nullptr);
}

assert(md->getParent()->mayInsertExtraPadding() &&
"unknown trivial member function");
}
if ((md->isTrivial() || (md->isDefaulted() && md->getParent()->isUnion())) &&
isa<CXXDestructorDecl>(md))
return RValue::get(nullptr);

// Compute the function type we're calling
const CXXMethodDecl *calleeDecl = md;
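As an aside on the C++17 sequencing rule cited in the hunk above, here is a small self-contained illustration (not part of the patch) of why the RHS arguments have to be emitted before the LHS:

```cpp
#include <iostream>

struct Tracer {
  Tracer &operator=(int) {
    std::cout << "assign\n";
    return *this;
  }
};

Tracer &lhs() {
  static Tracer t;
  std::cout << "lhs\n";
  return t;
}

int rhs() {
  std::cout << "rhs\n";
  return 42;
}

int main() {
  // Since C++17 the right operand of an assignment is sequenced before the
  // left operand, even for a user-provided operator=, so this must print
  // rhs, then lhs, then assign.
  lhs() = rhs();
}
```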
25 changes: 25 additions & 0 deletions clang/lib/CIR/CodeGen/CIRGenClass.cpp
@@ -258,6 +258,31 @@ void CIRGenFunction::emitDelegateCXXConstructorCall(
/*Delegating=*/true, thisAddr, delegateArgs, loc);
}

void CIRGenFunction::emitImplicitAssignmentOperatorBody(FunctionArgList &args) {
const auto *assignOp = cast<CXXMethodDecl>(curGD.getDecl());
assert(assignOp->isCopyAssignmentOperator() ||
assignOp->isMoveAssignmentOperator());
const Stmt *rootS = assignOp->getBody();
assert(isa<CompoundStmt>(rootS) &&
"Body of an implicit assignment operator should be compound stmt.");
const auto *rootCS = cast<CompoundStmt>(rootS);

assert(!cir::MissingFeatures::incrementProfileCounter());
assert(!cir::MissingFeatures::runCleanupsScope());

// Classic codegen uses a special class to attempt to replace member
// initializers with memcpy. We could possibly defer that to the
// lowering or optimization phases to keep the memory accesses more
// explicit. For now, we don't insert memcpy at all, though in some
// cases the AST contains a call to memcpy.
Collaborator:

FWIW, this is another case where CIR should be more expressive, and the FE shouldn't be doing this calculation.

Contributor (author):

Do you mean that the front end shouldn't be putting the memcpy call in the AST? That's happening for this case, which was added to the classic codegen version of the assign-operator.cpp test when the Sema code to generate the memcpy call was added:

https://godbolt.org/z/xMYPevMMf

There's a comment (which predates the change where this test was added) saying "This optimization only applies for arrays of scalars, and for arrays of class type where the selected copy/move-assignment operator is trivial." So I guess the presence of B b[16] in the test is what triggers it. I can see why memcpy is used there.

Collaborator:

IMO, it shouldn't. CIR should be the one doing the memcpy translation, not the FE. Classic codegen does it because it has to; CIR codegen shouldn't.

Note: the above is a general comment, not about this patch, and shouldn't block it. The point is that, in general, the premature optimization of 'implement as memcpy' will inhibit useful optimization opportunities and reduce the expressiveness of the IR. We should avoid that if possible.

Contributor (author):

I'm not even convinced clang is always doing things like this because they couldn't be done in LLVM IR. I think sometimes it just seemed like a good idea at the time. For example, here if we just emitted a loop with calls to a B copy assignment operator that copied the single data member, the optimizer would inline the calls and turn it into a memcpy (https://godbolt.org/z/Yj4M4doYv).

assert(!cir::MissingFeatures::assignMemcpyizer());
for (Stmt *s : rootCS->body())
if (emitStmt(s, /*useCurrentScope=*/true).failed())
cgm.errorNYI(s->getSourceRange(),
std::string("emitImplicitAssignmentOperatorBody: ") +
s->getStmtClassName());
}

void CIRGenFunction::emitDelegatingCXXConstructorCall(
const CXXConstructorDecl *ctor, const FunctionArgList &args) {
assert(ctor->isDelegatingConstructor());
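To make the alternative discussed in the thread above concrete, here is a rough sketch of what emitting a plain per-element copy looks like; copy_array is a hypothetical helper written for illustration, not code from this patch, and B mirrors the struct in the test below:

```cpp
struct B {
  B(const B &);
  B &operator=(const B &) = default;
  int n;
};

// A naive element-wise copy. Each assignment is trivial (it copies one int),
// so at -O2 LLVM typically collapses the loop into a single 64-byte memcpy,
// which is the point made via the godbolt link above.
void copy_array(B (&dst)[16], const B (&src)[16]) {
  for (int i = 0; i != 16; ++i)
    dst[i] = src[i];
}
```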
24 changes: 13 additions & 11 deletions clang/lib/CIR/CodeGen/CIRGenFunction.cpp
@@ -462,21 +462,23 @@ cir::FuncOp CIRGenFunction::generateCode(clang::GlobalDecl gd, cir::FuncOp fn,

startFunction(gd, retTy, fn, funcType, args, loc, bodyRange.getBegin());

if (isa<CXXDestructorDecl>(funcDecl))
if (isa<CXXDestructorDecl>(funcDecl)) {
getCIRGenModule().errorNYI(bodyRange, "C++ destructor definition");
else if (isa<CXXConstructorDecl>(funcDecl))
} else if (isa<CXXConstructorDecl>(funcDecl)) {
emitConstructorBody(args);
else if (getLangOpts().CUDA && !getLangOpts().CUDAIsDevice &&
funcDecl->hasAttr<CUDAGlobalAttr>())
} else if (getLangOpts().CUDA && !getLangOpts().CUDAIsDevice &&
funcDecl->hasAttr<CUDAGlobalAttr>()) {
getCIRGenModule().errorNYI(bodyRange, "CUDA kernel");
else if (isa<CXXMethodDecl>(funcDecl) &&
cast<CXXMethodDecl>(funcDecl)->isLambdaStaticInvoker())
} else if (isa<CXXMethodDecl>(funcDecl) &&
cast<CXXMethodDecl>(funcDecl)->isLambdaStaticInvoker()) {
getCIRGenModule().errorNYI(bodyRange, "Lambda static invoker");
else if (funcDecl->isDefaulted() && isa<CXXMethodDecl>(funcDecl) &&
(cast<CXXMethodDecl>(funcDecl)->isCopyAssignmentOperator() ||
cast<CXXMethodDecl>(funcDecl)->isMoveAssignmentOperator()))
getCIRGenModule().errorNYI(bodyRange, "Default assignment operator");
else if (body) {
} else if (funcDecl->isDefaulted() && isa<CXXMethodDecl>(funcDecl) &&
(cast<CXXMethodDecl>(funcDecl)->isCopyAssignmentOperator() ||
cast<CXXMethodDecl>(funcDecl)->isMoveAssignmentOperator())) {
// Implicit copy-assignment gets the same special treatment as implicit
// copy-constructors.
emitImplicitAssignmentOperatorBody(args);
} else if (body) {
if (mlir::failed(emitFunctionBody(body))) {
fn.erase();
return nullptr;
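For clarity, a few declarations and where they now land in the dispatch above; this snippet is illustrative rather than taken from the patch:

```cpp
struct S {
  // Defaulted (or implicitly defined) copy/move assignment operators now
  // reach emitImplicitAssignmentOperatorBody instead of errorNYI.
  S &operator=(const S &) = default;
  S &operator=(S &&) = default;
};

struct U {
  // A user-written operator= is not isDefaulted(), so it still goes through
  // the regular emitFunctionBody path.
  U &operator=(const U &other) { n = other.n; return *this; }
  int n;
};
```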
2 changes: 2 additions & 0 deletions clang/lib/CIR/CodeGen/CIRGenFunction.h
@@ -867,6 +867,8 @@ class CIRGenFunction : public CIRGenTypeCache {

mlir::LogicalResult emitFunctionBody(const clang::Stmt *body);

void emitImplicitAssignmentOperatorBody(FunctionArgList &args);

void emitInitializerForField(clang::FieldDecl *field, LValue lhs,
clang::Expr *init);

6 changes: 0 additions & 6 deletions clang/lib/CIR/CodeGen/CIRGenModule.cpp
@@ -392,12 +392,6 @@ void CIRGenModule::emitGlobal(clang::GlobalDecl gd) {
void CIRGenModule::emitGlobalFunctionDefinition(clang::GlobalDecl gd,
mlir::Operation *op) {
auto const *funcDecl = cast<FunctionDecl>(gd.getDecl());
if (funcDecl->getIdentifier() == nullptr) {
errorNYI(funcDecl->getSourceRange().getBegin(),
"function definition with a non-identifier for a name");
return;
}

const CIRGenFunctionInfo &fi = getTypes().arrangeGlobalDeclaration(gd);
cir::FuncType funcType = getTypes().getFunctionType(fi);
cir::FuncOp funcOp = dyn_cast_if_present<cir::FuncOp>(op);
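The deletion above is what lets this patch emit operators at all: a C++ operator is named by a special DeclarationName (the CXXOperatorName for OO_Equal in this case), not by an IdentifierInfo, so getIdentifier() returns null for it. A small illustration; the snippet is hypothetical, not from the patch:

```cpp
struct T {
  // For this declaration, funcDecl->getIdentifier() == nullptr, so the old
  // guard rejected its definition with errorNYI. With the guard removed,
  // the definition below can be emitted.
  T &operator=(const T &) { return *this; }
};
```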
144 changes: 144 additions & 0 deletions clang/test/CIR/CodeGen/assign-operator.cpp
@@ -0,0 +1,144 @@
// RUN: %clang_cc1 -std=c++11 -triple aarch64-none-linux-android21 -fclangir -emit-cir %s -o %t.cir
// RUN: FileCheck --check-prefix=CIR --input-file=%t.cir %s
// RUN: %clang_cc1 -std=c++11 -triple aarch64-none-linux-android21 -fclangir -emit-llvm %s -o %t-cir.ll
// RUN: FileCheck --check-prefix=LLVM --input-file=%t-cir.ll %s
// RUN: %clang_cc1 -std=c++11 -triple aarch64-none-linux-android21 -emit-llvm %s -o %t.ll
// RUN: FileCheck --check-prefix=OGCG --input-file=%t.ll %s

class x {
public: int operator=(int);
};
void a() {
  x a;
  a = 1u;
}

// CIR: cir.func private @_ZN1xaSEi(!cir.ptr<!rec_x>, !s32i)
// CIR: cir.func{{.*}} @_Z1av()
// CIR: %[[A_ADDR:.*]] = cir.alloca !rec_x, !cir.ptr<!rec_x>, ["a"]
// CIR: %[[ONE:.*]] = cir.const #cir.int<1> : !u32i
// CIR: %[[ONE_CAST:.*]] = cir.cast(integral, %[[ONE]] : !u32i), !s32i
// CIR: %[[RET:.*]] = cir.call @_ZN1xaSEi(%[[A_ADDR]], %[[ONE_CAST]]) : (!cir.ptr<!rec_x>, !s32i) -> !s32i

// LLVM: define{{.*}} @_Z1av()
// OGCG: define{{.*}} @_Z1av()

void f(int i, int j) {
  (i += j) = 17;
}

// CIR: cir.func{{.*}} @_Z1fii(%arg0: !s32i {{.*}}, %arg1: !s32i {{.*}})
// CIR: %[[I_ADDR:.*]] = cir.alloca !s32i, !cir.ptr<!s32i>, ["i", init]
// CIR: %[[J_ADDR:.*]] = cir.alloca !s32i, !cir.ptr<!s32i>, ["j", init]
// CIR: cir.store %arg0, %[[I_ADDR]] : !s32i, !cir.ptr<!s32i>
// CIR: cir.store %arg1, %[[J_ADDR]] : !s32i, !cir.ptr<!s32i>
// CIR: %[[SEVENTEEN:.*]] = cir.const #cir.int<17> : !s32i
// CIR: %[[J_LOAD:.*]] = cir.load align(4) %[[J_ADDR]] : !cir.ptr<!s32i>, !s32i
// CIR: %[[I_LOAD:.*]] = cir.load align(4) %[[I_ADDR]] : !cir.ptr<!s32i>, !s32i
// CIR: %[[ADD:.*]] = cir.binop(add, %[[I_LOAD]], %[[J_LOAD]]) nsw : !s32i
// CIR: cir.store align(4) %[[ADD]], %[[I_ADDR]] : !s32i, !cir.ptr<!s32i>
// CIR: cir.store align(4) %[[SEVENTEEN]], %[[I_ADDR]] : !s32i, !cir.ptr<!s32i>
// CIR: cir.return

// Ensure that we use memcpy when we would have selected a trivial assignment
// operator, even for a non-trivially-copyable type.
struct A {
  A &operator=(const A&);
};
struct B {
  B(const B&);
  B &operator=(const B&) = default;
  int n;
};
struct C {
  A a;
  B b[16];
};
void copy_c(C &c1, C &c2) {
  c1 = c2;
}

// CIR: cir.func private @_ZN1AaSERKS_(!cir.ptr<!rec_A>, !cir.ptr<!rec_A>) -> !cir.ptr<!rec_A>
// CIR: cir.func private @memcpy(!cir.ptr<!void>, !cir.ptr<!void>, !u64i) -> !cir.ptr<!void>

// Implicit assignment operator for C.

// CIR: cir.func comdat linkonce_odr @_ZN1CaSERKS_(%arg0: !cir.ptr<!rec_C> {{.*}}, %arg1: !cir.ptr<!rec_C> {{.*}}) -> !cir.ptr<!rec_C>
// CIR: %[[THIS_ADDR:.*]] = cir.alloca !cir.ptr<!rec_C>, !cir.ptr<!cir.ptr<!rec_C>>, ["this", init]
// CIR: %[[ARG1_ADDR:.*]] = cir.alloca !cir.ptr<!rec_C>, !cir.ptr<!cir.ptr<!rec_C>>, ["", init, const]
// CIR: %[[RET_ADDR:.*]] = cir.alloca !cir.ptr<!rec_C>, !cir.ptr<!cir.ptr<!rec_C>>, ["__retval"]
// CIR: cir.store %arg0, %[[THIS_ADDR]]
// CIR: cir.store %arg1, %[[ARG1_ADDR]]
// CIR: %[[THIS:.*]] = cir.load{{.*}} %[[THIS_ADDR]]
// CIR: %[[A_MEMBER:.*]] = cir.get_member %[[THIS]][0] {name = "a"}
// CIR: %[[ARG1_LOAD:.*]] = cir.load{{.*}} %[[ARG1_ADDR]]
// CIR: %[[A_MEMBER_2:.*]] = cir.get_member %[[ARG1_LOAD]][0] {name = "a"}
// CIR: %[[C_A:.*]] = cir.call @_ZN1AaSERKS_(%[[A_MEMBER]], %[[A_MEMBER_2]])
// CIR: %[[B_MEMBER:.*]] = cir.get_member %[[THIS]][1] {name = "b"}
// CIR: %[[B_VOID_PTR:.*]] = cir.cast(bitcast, %[[B_MEMBER]] : !cir.ptr<!cir.array<!rec_B x 16>>), !cir.ptr<!void>
// CIR: %[[RET_LOAD:.*]] = cir.load %[[ARG1_ADDR]]
// CIR: %[[B_MEMBER_2:.*]] = cir.get_member %[[RET_LOAD]][1] {name = "b"}
// CIR: %[[B_VOID_PTR_2:.*]] = cir.cast(bitcast, %[[B_MEMBER_2]] : !cir.ptr<!cir.array<!rec_B x 16>>), !cir.ptr<!void>
// CIR: %[[SIZE:.*]] = cir.const #cir.int<64> : !u64i
// CIR: %[[COUNT:.*]] = cir.call @memcpy(%[[B_VOID_PTR]], %[[B_VOID_PTR_2]], %[[SIZE]])
// CIR: cir.store %[[THIS]], %[[RET_ADDR]]
// CIR: %[[RET_VAL:.*]] = cir.load{{.*}} %[[RET_ADDR]]
// CIR: cir.return %[[RET_VAL]]

// CIR: cir.func{{.*}} @_Z6copy_cR1CS0_(%arg0: !cir.ptr<!rec_C> {{.*}}, %arg1: !cir.ptr<!rec_C> {{.*}})
// CIR: %[[C1_ADDR:.*]] = cir.alloca !cir.ptr<!rec_C>, !cir.ptr<!cir.ptr<!rec_C>>, ["c1", init, const]
// CIR: %[[C2_ADDR:.*]] = cir.alloca !cir.ptr<!rec_C>, !cir.ptr<!cir.ptr<!rec_C>>, ["c2", init, const]
// CIR: cir.store %arg0, %[[C1_ADDR]]
// CIR: cir.store %arg1, %[[C2_ADDR]]
// CIR: %[[C2_LOAD:.*]] = cir.load{{.*}} %[[C2_ADDR]]
// CIR: %[[C1_LOAD:.*]] = cir.load{{.*}} %[[C1_ADDR]]
// CIR: %[[RET:.*]] = cir.call @_ZN1CaSERKS_(%[[C1_LOAD]], %[[C2_LOAD]])

struct D {
  D &operator=(const D&);
};
struct E {
  D &get_d_ref() { return d; }
private:
  D d;
};

void copy_ref_to_ref(E &e1, E &e2) {
  e1.get_d_ref() = e2.get_d_ref();
}

// The call to e2.get_d_ref() must occur before the call to e1.get_d_ref().

// CIR: cir.func{{.*}} @_Z15copy_ref_to_refR1ES0_(%arg0: !cir.ptr<!rec_E> {{.*}}, %arg1: !cir.ptr<!rec_E> {{.*}})
// CIR: %[[E1_ADDR:.*]] = cir.alloca !cir.ptr<!rec_E>, !cir.ptr<!cir.ptr<!rec_E>>, ["e1", init, const]
// CIR: %[[E2_ADDR:.*]] = cir.alloca !cir.ptr<!rec_E>, !cir.ptr<!cir.ptr<!rec_E>>, ["e2", init, const]
// CIR: cir.store %arg0, %[[E1_ADDR]] : !cir.ptr<!rec_E>, !cir.ptr<!cir.ptr<!rec_E>>
// CIR: cir.store %arg1, %[[E2_ADDR]] : !cir.ptr<!rec_E>, !cir.ptr<!cir.ptr<!rec_E>>
// CIR: %[[E2:.*]] = cir.load %[[E2_ADDR]]
// CIR: %[[D2_REF:.*]] = cir.call @_ZN1E9get_d_refEv(%[[E2]])
// CIR: %[[E1:.*]] = cir.load %[[E1_ADDR]]
// CIR: %[[D1_REF:.*]] = cir.call @_ZN1E9get_d_refEv(%[[E1]])
// CIR: %[[D1_REF_2:.*]] = cir.call @_ZN1DaSERKS_(%[[D1_REF]], %[[D2_REF]])
// CIR: cir.return

// LLVM: define{{.*}} void @_Z15copy_ref_to_refR1ES0_(ptr %[[ARG0:.*]], ptr %[[ARG1:.*]]) {
// LLVM: %[[E1_ADDR:.*]] = alloca ptr
// LLVM: %[[E2_ADDR:.*]] = alloca ptr
// LLVM: store ptr %[[ARG0]], ptr %[[E1_ADDR]]
// LLVM: store ptr %[[ARG1]], ptr %[[E2_ADDR]]
// LLVM: %[[E2:.*]] = load ptr, ptr %[[E2_ADDR]]
// LLVM: %[[D2_REF:.*]] = call ptr @_ZN1E9get_d_refEv(ptr %[[E2]])
// LLVM: %[[E1:.*]] = load ptr, ptr %[[E1_ADDR]]
// LLVM: %[[D1_REF:.*]] = call ptr @_ZN1E9get_d_refEv(ptr %[[E1]])
// LLVM: %[[D1_REF_2:.*]] = call ptr @_ZN1DaSERKS_(ptr %[[D1_REF]], ptr %[[D2_REF]])

// OGCG: define{{.*}} void @_Z15copy_ref_to_refR1ES0_(ptr{{.*}} %[[ARG0:.*]], ptr{{.*}} %[[ARG1:.*]])
// OGCG: %[[E1_ADDR:.*]] = alloca ptr
// OGCG: %[[E2_ADDR:.*]] = alloca ptr
// OGCG: store ptr %[[ARG0]], ptr %[[E1_ADDR]]
// OGCG: store ptr %[[ARG1]], ptr %[[E2_ADDR]]
// OGCG: %[[E2:.*]] = load ptr, ptr %[[E2_ADDR]]
// OGCG: %[[D2_REF:.*]] = call{{.*}} ptr @_ZN1E9get_d_refEv(ptr{{.*}} %[[E2]])
// OGCG: %[[E1:.*]] = load ptr, ptr %[[E1_ADDR]]
// OGCG: %[[D1_REF:.*]] = call{{.*}} ptr @_ZN1E9get_d_refEv(ptr{{.*}} %[[E1]])
// OGCG: %[[D1_REF_2:.*]] = call{{.*}} ptr @_ZN1DaSERKS_(ptr{{.*}} %[[D1_REF]], ptr{{.*}} %[[D2_REF]])