247 changes: 247 additions & 0 deletions proposals/p6254.md
# Calling C++ Functions

<!--
Part of the Carbon Language project, under the Apache License v2.0 with LLVM
Exceptions. See /LICENSE for license information.
SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-->

[Pull request](https://github.com/carbon-language/carbon-lang/pull/6254)

<!-- toc -->

## Table of contents

- [Abstract](#abstract)
- [Problem](#problem)
- [Background](#background)
- [Proposal](#proposal)
- [Details](#details)
    - [Importing C++ Functions](#importing-c-functions)
    - [Overload Resolution](#overload-resolution)
    - [Direct Calls versus Thunks](#direct-calls-versus-thunks)
    - [Thunk Generation](#thunk-generation)
    - [Parameter and Return Value Handling](#parameter-and-return-value-handling)
    - [Member Function Calls](#member-function-calls)
    - [Operator Calls](#operator-calls)
- [Rationale](#rationale)
- [Alternatives considered](#alternatives-considered)
    - [Require Manual C++ Wrappers](#require-manual-c-wrappers)
    - [Mandate Carbon ABI Compatibility with C++](#mandate-carbon-abi-compatibility-with-c)

<!-- tocstop -->

## Abstract

This proposal details the mechanism for calling imported C++ functions from
Carbon code. It covers how C++ overload sets are handled, the process of
overload resolution leveraging Clang, and the generation of "thunks" –
intermediate functions – when necessary to bridge Application Binary Interface
(ABI) differences between Carbon and C++.

## Problem

Seamless, high-performance interoperability with C++
[is a fundamental goal of Carbon](https://github.com/carbon-language/carbon-lang/blob/f9bd01536b97961039257cc10fb20b495f7a9b33/docs/project/goals.md#interoperability-with-and-migration-from-existing-c-code).
To achieve this, Carbon code must be able to call C++ functions naturally.
Several challenges arise:

- C++ supports function overloading, requiring Carbon to resolve calls to the
correct C++ function within an overload set.
- C++ types do not always have identical representations or ABIs to their
Carbon counterparts (see
[Carbon <-> C++ Interop: Primitive Types](https://github.com/carbon-language/carbon-lang/blob/44b2f60c90df5c1b0ce86f97bb0ece2a94eb50ea/proposals/p5448.md)).
For example, parameter passing conventions (by value, by pointer) or return
value handling (direct return versus return slot) might differ.
- C++ member functions require handling of the `this` pointer.
- C++ supports features such as default arguments, which need a defined
mapping.

A clear, robust mechanism is needed to handle these complexities, ensuring both
correctness and performance while providing a good developer experience.

## Background

[Carbon's C++ interoperability philosophy](https://github.com/carbon-language/carbon-lang/blob/01e12111a8a685694ccd2c9deb2779f907917543/docs/design/interoperability/philosophy_and_goals.md)
aims to minimize bridge code and provide unsurprising mappings. When Carbon code
imports a C++ header, the functions declared within become potentially callable
entities. C++ overload resolution rules are complex, and replicating them
perfectly within Carbon would be difficult and likely divergent over time.
Furthermore, direct calls are only possible when the ABI conventions of the
Carbon call site precisely match the expectations of the C++ callee.

## Proposal

1. **Import:** C++ functions and methods, including overload sets, are imported
    into Carbon and represented internally (conceptually, as specific overload
    set instructions in SemIR).
2. **Overload Resolution:** When a call to an imported C++ function or overload
    set occurs in Carbon, Carbon leverages Clang's overload resolution
    mechanism. Carbon argument types are mapped to hypothetical C++ types /
    expressions, and Clang's `Sema` determines the best viable function.
3. **ABI Bridging (Thunks):**
    - If the selected C++ function's ABI (parameter types, return type
      handling, calling convention) matches the Carbon call site's ABI based
      on defined type mappings, a direct call is generated.
    - If the ABIs mismatch, Carbon generates an intermediate function, called
      a **C++ thunk**. This thunk has a "simple" ABI callable directly from
      Carbon (typically using only pointers and basic integer types like
      `i32`/`i64`). The thunk internally calls the actual C++ function,
      performing necessary argument conversions (for example, loading a value
      from a pointer) and handling return value conventions (for example,
      managing a return slot).
4. **Call Execution:** The Carbon code either calls the C++ function directly
    or calls the generated C++ thunk.

## Details

### Importing C++ Functions

When a C++ header is imported using `import Cpp`, declarations within that
header are made available. Function declarations, including member functions and
overloaded functions, are represented internally within Carbon's SemIR. An
overload set from C++ is represented as a single callable entity in Carbon,
associated with the set of C++ candidate functions.

### Overload Resolution

To resolve a call like `Cpp.MyNamespace.MyFunc(arg1, arg2)` where `MyFunc` might
be an overload set imported from C++:

1. **Map Arguments:** Carbon argument instructions (`arg1`, `arg2`) are mapped
to placeholder C++ expressions (conceptually similar to
[`clang::OpaqueValueExpr`](https://github.com/llvm/llvm-project/blob/1e99026b45b048a52f8372399ab83d488132842e/clang/include/clang/AST/Expr.h#L1178)).
The types of these expressions are determined by mapping the Carbon argument
types to corresponding C++ types
([Carbon <-> C++ Interop: Primitive Types](https://github.com/carbon-language/carbon-lang/blob/44b2f60c90df5c1b0ce86f97bb0ece2a94eb50ea/proposals/p5448.md)).
2. **Invoke Clang Sema:** Carbon invokes Clang's overload resolution logic
([`clang::OverloadCandidateSet::BestViableFunction()`](https://github.com/llvm/llvm-project/blob/1e99026b45b048a52f8372399ab83d488132842e/clang/include/clang/Sema/Overload.h#L1456))
with the mapped C++ name, the candidate functions from the imported overload
set, and the placeholder argument expressions.
3. **Select Candidate:** Clang determines the best viable C++ function based on
C++ rules (implicit conversions and, where applicable in the future, template
argument deduction). If resolution fails because no function is viable or the
call is ambiguous, Clang's diagnostics are surfaced as Carbon diagnostics.
4. **Access Check:** After selecting a function, Carbon checks if the function
is accessible based on C++ access specifiers (`public`, `protected`,
`private`) in the context of the call.
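
The type-only nature of this resolution can be approximated in plain C++17. In the sketch below, `std::declval<T>()` stands in for the placeholder expressions: it has a type but no value, and the compiler still performs ordinary overload resolution on it. The `Describe` overload set, the tag types, `Resolved`, and `kIsCallable` are hypothetical names introduced for illustration, and the mapping of Carbon `i32`/`i64` to `std::int32_t`/`std::int64_t` is an assumption made here. This is not how Carbon drives Clang's `Sema` internally.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>
#include <utility>

// Hypothetical imported overload set. Each overload returns a distinct tag
// type so that the selected candidate is visible in the type system.
struct PickedInt {};
struct PickedDouble {};
struct PickedCString {};

PickedInt Describe(int);
PickedDouble Describe(double);
PickedCString Describe(const char*);

// std::declval<T>() acts as a type-only placeholder argument; the call is
// never evaluated, so the overloads need no definitions.
template <typename... Args>
using Resolved = decltype(Describe(std::declval<Args>()...));

// Detects a failed resolution (no viable function or ambiguity) without a
// hard compile error.
template <typename Void, typename... Args>
struct IsCallableImpl : std::false_type {};
template <typename... Args>
struct IsCallableImpl<std::void_t<Resolved<Args...>>, Args...>
    : std::true_type {};
template <typename... Args>
constexpr bool kIsCallable = IsCallableImpl<void, Args...>::value;

// Assumed mapping: a Carbon `i32` argument appears as std::int32_t.
static_assert(std::is_same_v<Resolved<std::int32_t>, PickedInt>);
// `float` promotes to `double`, which beats the conversion to `int`.
static_assert(std::is_same_v<Resolved<float>, PickedDouble>);
// A 64-bit integer converts equally well to `int` and `double`: ambiguous.
static_assert(!kIsCallable<std::int64_t>);
```

The ambiguous case is the kind of failure that, in the mechanism described above, would be reported to the Carbon developer through Clang's diagnostics.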

### Direct Calls versus Thunks

A direct call from Carbon to C++ is possible only if the ABI matches exactly. A
**C++ thunk** is required if:

- **Type Representation Mismatch:** A parameter or the return type has a
different representation in Carbon than the C++ ABI expects, requiring
conversion. Examples include a Carbon `bool` (`i1`) passed where C++ expects
`bool` (often `i8`), and complex struct types.
- **Return Convention Mismatch:** The C++ function returns a non-trivial type
by value, which typically requires a hidden return slot parameter in the
ABI, whereas Carbon might expect a direct return value.
- **Parameter Convention Mismatch:** C++ expects a parameter by pointer or
reference where Carbon provides a value, or vice versa.
- **Default Arguments:** The Carbon call omits arguments that have default
values in C++. The thunk provides the default values.
- **Variadic Arguments:** (Future work) Calling C++ functions that take
[variadic arguments](https://en.cppreference.com/w/cpp/language/variadic_arguments.html).

If a thunk is _not_ required, Carbon emits a direct call instruction targeting
the mangled name of the C++ function.
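
As an illustration of the default-argument case, the following hand-written C++ shows roughly what a thunk could look like if it were written as source. `Scale`, `Scale__carbon_thunk`, and `CallScaleWithDefault` are hypothetical names; the `__carbon_thunk` suffix follows the conceptual naming used in this proposal, and `[[gnu::always_inline]]` is a GCC/Clang attribute chosen here as an assumption about how inlining might be requested.

```cpp
#include <cassert>

// Hypothetical imported C++ function with a default argument.
inline int Scale(int value, int factor = 2) { return value * factor; }

// Hand-written stand-in for a thunk that serves a Carbon call passing only
// `value`. Omitting `factor` lets the C++ compiler supply the default.
[[gnu::always_inline]] inline int Scale__carbon_thunk(int value) {
  return Scale(value);
}

// Models a Carbon call site such as `Cpp.Scale(21)`.
inline int CallScaleWithDefault() { return Scale__carbon_thunk(21); }
```

A call that supplies both arguments, such as `Scale(21, 3)`, has no omitted argument and would not need a thunk for this reason.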

### Thunk Generation

If a thunk is required for a C++ function `CppOriginalFunc()`, Carbon generates
a new internal function, conceptually `CppOriginalFunc__carbon_thunk()`:

1. **Signature:** The thunk has an ABI that is simple and directly callable
    from Carbon.
    - Parameters corresponding to C++ parameters with complex ABIs are passed
      by pointer (`T*`).
    - Parameters with simple ABIs (like `i32`, `i64`, raw pointers) are passed
      directly.
    - If `CppOriginalFunc` uses a return slot, the thunk takes a pointer
      parameter for the return slot. Its LLVM return type becomes `void`.
    - If `CppOriginalFunc` returns a simple type directly, the thunk returns
      the same simple type directly.
2. **Body:** The thunk body performs the following:
    - Loads values from pointer arguments passed by Carbon where necessary.
    - Performs necessary type conversions between Carbon simple ABI types and
      C++ expected types (for example, `i1` to `i8` for `bool`).
    - Calls `CppOriginalFunc` with the converted arguments, potentially
      passing the return slot address.
    - If `CppOriginalFunc` returned directly, the thunk returns that value. If
      it used a return slot, the thunk returns `void`.
3. **Attributes:** The thunk is typically marked `always_inline` to encourage
    the optimizer to remove the indirection. It is given a predictable mangled
    name based on the original function's mangled name plus a suffix.

The Carbon call site then calls the thunk instead of the original C++ function.
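
A minimal C++17 sketch of this shape, written by hand rather than generated, might look as follows. It assumes a hypothetical function `Greet` that takes and returns `std::string` by value, so both the parameter and the return value have complex ABIs. `Greet__carbon_thunk` and `CallGreet` are illustrative names, and copying the argument out of the pointer is one possible choice, not necessarily what Carbon emits.

```cpp
#include <cassert>
#include <memory>
#include <new>
#include <string>
#include <utility>

// Hypothetical imported C++ function: non-trivial parameter and return type.
inline std::string Greet(std::string name, bool excited) {
  return "Hello, " + name + (excited ? "!" : ".");
}

// Hand-written stand-in for the thunk. The complex parameter arrives by
// pointer, the simple `bool` is passed directly, and the result is
// constructed into caller-provided storage, so the thunk returns `void`.
[[gnu::always_inline]] inline void Greet__carbon_thunk(
    std::string* name, bool excited, std::string* return_slot) {
  ::new (static_cast<void*>(return_slot)) std::string(Greet(*name, excited));
}

// Models the call site: take the argument's address, provide temporary
// storage as the return slot, then consume the initialized value.
inline std::string CallGreet(std::string name, bool excited) {
  alignas(std::string) unsigned char storage[sizeof(std::string)];
  auto* slot = reinterpret_cast<std::string*>(storage);
  Greet__carbon_thunk(&name, excited, slot);
  std::string* result = std::launder(slot);
  std::string value = std::move(*result);
  std::destroy_at(result);  // The call site owns the temporary's lifetime.
  return value;
}
```

Because the thunk is a thin forwarding function, an optimizer that inlines it leaves only the call to `Greet` and the construction into the return slot.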

### Parameter and Return Value Handling

- **Arguments:** When calling a C++ function (directly or by way of a thunk),
Carbon arguments undergo implicit conversions as needed to match the
parameter types determined by overload resolution. For calls requiring a
thunk, additional conversions might occur at the call site (for example,
taking the address of an object to pass by pointer to the thunk) and within
the thunk (for example, loading the object from the pointer).
- **Return Values:** If the C++ function returns `void`, the Carbon call
expression has type `()`. If it returns a simple type directly, the Carbon
call has the corresponding mapped Carbon type. If the C++ function uses a
return slot, the Carbon call is modeled as initializing the storage
designated by the return slot argument (often a temporary created at the
call site), and the overall call expression typically results in the
initialized value.

### Member Function Calls

- **Instance Methods:** When `object.CppMethod()` is called, `object` becomes
the implicit `this` argument. Clang's overload resolution handles the
qualification (for example, `const`). The `this` pointer is passed as the
first argument, either directly or to the thunk.
- **Static Methods:** Calls like `Cpp.CppClass.StaticMethod()` are treated like
free function calls; no `this` pointer is involved.
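
The following C++ sketch illustrates both cases with a hypothetical class `Buffer`. The thunk-style functions are hand-written stand-ins whose names are invented here; whether a real call to such simple methods would need a thunk at all depends on the ABI rules above. The point is only that the object becomes an explicit first pointer argument and that its `const` qualification selects the overload.

```cpp
#include <cassert>

// Hypothetical imported C++ class with const-overloaded instance methods and
// a static method.
class Buffer {
 public:
  int& Front() { return data_[0]; }
  const int& Front() const { return data_[0]; }
  static int Capacity() { return 4; }

 private:
  int data_[4] = {1, 2, 3, 4};
};

// `this` is passed explicitly as the first argument. The pointee's const
// qualification decides which `Front` overload is called. References are
// modeled as pointers in this sketch's simple ABI.
inline int* Buffer_Front__carbon_thunk(Buffer* self) {
  return &self->Front();
}
inline const int* Buffer_FrontConst__carbon_thunk(const Buffer* self) {
  return &self->Front();
}

// A static method takes no `this` argument, like a free function.
inline int Buffer_Capacity__carbon_thunk() { return Buffer::Capacity(); }
```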

### Operator Calls

Calls to overloaded C++ operators are handled similarly to function calls.
Carbon identifies the operator call, looks up potential C++ operator functions
(both member and non-member), and uses Clang's overload resolution to select the
best candidate. Thunks may be generated if required by the selected operator
function's ABI.
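
As a sketch of the two kinds of candidates, the hypothetical type `Meters` below defines a member `operator+=` and a non-member `operator+`. The thunk-style functions are hand-written illustrations with invented names; they pass operands by pointer and, for `operator+`, construct the result into a return slot, mirroring the thunk shape described earlier.

```cpp
#include <cassert>
#include <new>

// Hypothetical imported C++ type.
struct Meters {
  double value;

  // Member operator candidate.
  Meters& operator+=(const Meters& other) {
    value += other.value;
    return *this;
  }
};

// Non-member operator candidate.
inline Meters operator+(Meters lhs, const Meters& rhs) {
  lhs += rhs;
  return lhs;
}

// Stand-in thunk for `lhs + rhs`: C++ overload resolution picks the
// non-member operator, and the result is built in the return slot.
[[gnu::always_inline]] inline void Meters_Add__carbon_thunk(
    const Meters* lhs, const Meters* rhs, Meters* return_slot) {
  ::new (static_cast<void*>(return_slot)) Meters(*lhs + *rhs);
}

// Stand-in thunk for `self += rhs`: the member operator is selected and the
// left operand plays the role of `this`.
[[gnu::always_inline]] inline void Meters_AddAssign__carbon_thunk(
    Meters* self, const Meters* rhs) {
  *self += *rhs;
}
```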

## Rationale

- **Leverages Clang:** Reusing Clang's overload resolution avoids
reimplementing complex C++ rules and ensures consistency.
- **Performance:** Direct calls are used when possible. Thunks are designed to
be minimal and aggressively inlined, minimizing overhead.
- **Correctness:** Thunks handle ABI mismatches systematically, ensuring
correct data marshalling between Carbon and C++.
- **Developer Experience:** Aims for C++ calls to feel natural in Carbon,
hiding much of the complexity of ABI bridging.
- **Interop Goal:** Directly supports the core goal of seamless C++
interoperability.

## Alternatives considered

### Require Manual C++ Wrappers

Instead of generating thunks automatically, Carbon could require developers to
write C++ wrapper functions with simple C-like ABIs for any C++ function whose
ABI doesn't directly match Carbon's expectations.

- **Rejected because:** This places a significant burden on the developer,
increases boilerplate, hinders rapid iteration, and makes C++ libraries feel
less integrated. It violates the goal of minimizing bridge code.

### Mandate Carbon ABI Compatibility with C++

Carbon could define its types and calling conventions to always match a specific
C++ ABI (for example, Itanium).

- **Rejected because:** This would heavily constrain Carbon's own evolution
and design choices. It wouldn't solve the problem entirely, as C++ ABIs
themselves vary (for example, between platforms, compilers, or even
libraries like libc++ vs libstdc++ for `string_view`). It conflicts with the
goal of software and language evolution.