Commit 05931f3

Add clang / LLVM implementation details + address review comments
1 parent 4d0cd52 commit 05931f3

1 file changed

+53
-15
lines changed

clang/docs/DebuggingCoroutines.rst

Lines changed: 53 additions & 15 deletions
@@ -17,7 +17,7 @@ Coroutines are generally used either as generators or for asynchronous
1717
programming. In this document, we will discuss both use cases. Even if you are
1818
using coroutines for asynchronous programming, you should still read the
1919
generators section, as it will introduce foundational debugging techniques also
20-
applicable to the debugging of asynchronous programming.
20+
applicable to the debugging of asynchronous programs.
2121

2222
Both compilers (clang, gcc, ...) and debuggers (lldb, gdb, ...) are
2323
still improving their support for coroutines. As such, we recommend using the
@@ -42,11 +42,11 @@ earlier.
4242
Debugging generators
4343
====================
4444

45-
The first major use case for coroutines in C++ are generators, i.e., functions
46-
which can produce values via ``co_yield``. Values are produced lazily,
47-
on-demand. For that purpose, every time a new value is requested the coroutine
48-
gets resumed. As soon as it reaches a ``co_yield`` and thereby returns the
49-
requested value, the coroutine is suspended again.
45+
One of the two major use cases for coroutines in C++ is generators, i.e.,
46+
functions which can produce values via ``co_yield``. Values are produced
47+
lazily, on demand. For that purpose, every time a new value is requested the
48+
coroutine gets resumed. As soon as it reaches a ``co_yield`` and thereby
49+
returns the requested value, the coroutine is suspended again.
5050

5151
This logic is encapsulated in a ``generator`` type similar to this one:
5252

@@ -590,6 +590,42 @@ the promise as follows:
590590
591591
print (task::promise_type)*(0x416eb0+16)
592592
593+
Implementation in clang / LLVM
594+
------------------------------
595+
596+
The C++ Coroutines feature in the Clang compiler is implemented in two parts of
597+
the compiler. Semantic analysis is performed in Clang, and coroutine
598+
construction and optimization take place in the LLVM middle end.
599+
600+
For each coroutine function, the frontend generates a single corresponding
601+
LLVM-IR function. This function uses special ``llvm.coro.suspend`` intrinsics
602+
to mark the suspension points of the coroutine. The middle end first optimizes
603+
this function and applies, e.g., constant propagation across the whole,
604+
non-split coroutine.
605+
606+
CoroSplit then splits the function into ramp, resume and destroy functions.
607+
This pass also moves stack-local variables which are live across suspension
608+
points into the coroutine frame. Most of the heavy lifting to preserve debugging
609+
information is done in this pass. This pass needs to rewrite all variable
610+
locations to point into the coroutine frame.
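
To make the split more concrete, the following is a hand-written sketch of
what a split coroutine could look like if it were expressed in plain C++. It is
not the code that ``CoroSplit`` generates. All names (``CountFrame``,
``count_ramp``, ``count_resume``, ``count_destroy``, ``collect_values``) are
hypothetical, and the frame layout is simplified. The sketch assumes a
generator that yields ``0`` to ``n - 1``, and it assumes that a null resume
pointer marks the coroutine as finished.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified coroutine frame. The local `i` and the parameter
// `n` are live across the suspension point, so they are stored in the frame
// instead of on the stack.
struct CountFrame {
    void (*resume_fn)(CountFrame*);   // set to nullptr once the coroutine is done
    void (*destroy_fn)(CountFrame*);
    int current;        // stands in for the promise: the last yielded value
    int suspend_index;  // identifies the suspension point to continue from
    int n;              // parameter copied into the frame
    int i;              // local variable moved into the frame
};

// Resume function: continues from the recorded suspension point.
void count_resume(CountFrame* f) {
    switch (f->suspend_index) {
    case 0:  // first resumption after the initial suspend
        f->i = 0;
        break;
    case 1:  // resumption after a yield
        ++f->i;
        break;
    }
    if (f->i < f->n) {
        f->current = f->i;     // publish the yielded value
        f->suspend_index = 1;  // remember where to continue
        return;                // suspend
    }
    f->resume_fn = nullptr;    // final suspend: mark the coroutine as done
}

// Destroy function: releases the frame.
void count_destroy(CountFrame* f) { delete f; }

// Ramp function: allocates and initializes the frame, then returns to the
// caller at the initial suspension point.
CountFrame* count_ramp(int n) {
    return new CountFrame{&count_resume, &count_destroy, 0, 0, n, 0};
}

// Drives the hand-written coroutine to completion and collects all values.
std::vector<int> collect_values(int n) {
    CountFrame* f = count_ramp(n);
    assert(f != nullptr);
    std::vector<int> values;
    for (f->resume_fn(f); f->resume_fn != nullptr; f->resume_fn(f)) {
        values.push_back(f->current);
    }
    f->destroy_fn(f);
    return values;
}
```

In this sketch, ``i`` exists only as a member of the frame. This is why the
variable locations in the debug information have to be rewritten to point into
the coroutine frame once the function has been split.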
611+
612+
Afterwards, a few additional optimizations are applied before code is
613+
emitted, but none of them are particularly relevant to debugging
614+
information.
615+
616+
For more details on the IR representation of coroutines and the relevant
617+
optimization passes, see `Coroutines in LLVM <https://llvm.org/docs/Coroutines.html>`_.
618+
619+
Emitting debug information inside ``CoroSplit`` forces us to generate
620+
insufficient debugging information. Usually, the compiler generates debug
621+
information in the frontend, as debug information is highly language-specific.
622+
However, this is not possible for coroutine frames because the frames are
623+
constructed in the LLVM middle end.
624+
625+
To mitigate this problem, the LLVM middle end attempts to generate some debug
626+
information, which is unfortunately incomplete, since much of the
627+
language-specific information is missing in the middle end.
628+
593629
Devirtualization of coroutine handles
594630
-------------------------------------
595631

@@ -651,11 +687,7 @@ clang / LLVM usually use variables like ``__int_32_0`` to represent this
651687
optimized storage. Those values usually do not directly correspond to variables
652688
in the source code.
653689

654-
For example, when compiling the following program, the compiler creates a
655-
single entry ``__int_32_0`` in the coroutine state. Intuitively, one might
656-
assume that ``__int_32_0`` represents the value of the local variable ``a``.
657-
However, inspecting ``__int_32_0`` in the debugger while single-stepping will
658-
show the following values:
690+
When compiling the program
659691

660692
.. code-block:: c++
661693

@@ -675,10 +707,16 @@ show the following values:
675707
std::cout << a << "\n";
676708
}
677709

678-
The value of ``__int_32_0`` seemingly does not change, despite being frequently
679-
incremented. While this might be surprising, this is a result of the optimizer
680-
recognizing that it can eliminate most of the load/store operations. The above
681-
code gets optimized to the equivalent of:
710+
clang creates a single entry ``__int_32_0`` in the coroutine state.
711+
712+
Intuitively, one might assume that ``__int_32_0`` represents the value of the
713+
local variable ``a``. However, inspecting ``__int_32_0`` in the debugger while
714+
single-stepping will reveal that the value of ``__int_32_0`` stays constant,
715+
despite ``a`` being frequently incremented.
716+
717+
While this might be surprising, this is a result of the optimizer recognizing
718+
that it can eliminate most of the load/store operations.
719+
The above code gets optimized to the equivalent of:
682720

683721
.. code-block:: c++
684722
