Commit 41e2909

Add clang / LLVM implementation details + address review comments
1 parent ff3a3df commit 41e2909

1 file changed

+53
-15
lines changed

clang/docs/DebuggingCoroutines.rst

Lines changed: 53 additions & 15 deletions
@@ -17,7 +17,7 @@ Coroutines are generally used either as generators or for asynchronous
1717
programming. In this document, we will discuss both use cases. Even if you are
1818
using coroutines for asynchronous programming, you should still read the
1919
generators section, as it will introduce foundational debugging techniques also
20-
applicable to the debugging of asynchronous programming.
20+
applicable to the debugging of asynchronous programs.
2121

2222
Both compilers (clang, gcc, ...) and debuggers (lldb, gdb, ...) are
2323
still improving their support for coroutines. As such, we recommend using the
@@ -42,11 +42,11 @@ earlier.
4242
Debugging generators
4343
====================
4444

45-
The first major use case for coroutines in C++ are generators, i.e., functions
46-
which can produce values via ``co_yield``. Values are produced lazily,
47-
on-demand. For that purpose, every time a new value is requested the coroutine
48-
gets resumed. As soon as it reaches a ``co_yield`` and thereby returns the
49-
requested value, the coroutine is suspended again.
45+
One of the two major use cases for coroutines in C++ is generators, i.e.,
46+
functions which can produce values via ``co_yield``. Values are produced
47+
lazily, on demand. For that purpose, every time a new value is requested the
48+
coroutine gets resumed. As soon as it reaches a ``co_yield`` and thereby
49+
returns the requested value, the coroutine is suspended again.
5050

5151
This logic is encapsulated in a ``generator`` type similar to this one:
5252

@@ -589,6 +589,42 @@ the promise as follows:
589589
.. code-block::
590590
print (task::promise_type)*(0x416eb0+16)
591591
592+
Implementation in clang / LLVM
593+
------------------------------
594+
595+
The C++ Coroutines feature in the Clang compiler is implemented in two parts of
596+
the compiler. Semantic analysis is performed in Clang, and coroutine
597+
construction and optimization take place in the LLVM middle-end.
598+
599+
For each coroutine function, the frontend generates a single corresponding
600+
LLVM-IR function. This function uses special ``llvm.coro.suspend`` intrinsics
601+
to mark the suspension points of the coroutine. The middle end first optimizes
602+
this function and applies, e.g., constant propagation across the whole,
603+
non-split coroutine.
604+
605+
CoroSplit then splits the function into ramp, resume and destroy functions.
606+
This pass also moves stack-local variables that are live across suspension
607+
points into the coroutine frame. Most of the heavy lifting to preserve debugging
608+
information is done in this pass. This pass needs to rewrite all variable
609+
locations to point into the coroutine frame.
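
To make the effect of the split more tangible, the following is a minimal, hand-written sketch of what a split coroutine roughly looks like when modelled in plain C++. It is not what clang / LLVM actually emits: all names (``CounterFrame``, ``counter_ramp``, ``counter_resume``, ``counter_destroy``, ``run_counter_demo``) are hypothetical, and a plain ``int`` stands in for the promise object. The sketch only assumes the layout implied by the ``+16`` offset used above, i.e., that the frame starts with the resume and destroy function pointers, followed by the promise.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model of a coroutine frame. None of these names are symbols
// produced by clang / LLVM; they only illustrate the general idea.
struct CounterFrame {
    void (*resume_fn)(CounterFrame*);   // called to continue the coroutine
    void (*destroy_fn)(CounterFrame*);  // called to free the frame
    int promise_current;                // stand-in for the promise object
    int suspend_index;                  // which suspension point we stopped at
    int a;                              // local variable live across suspensions
};

// The promise stand-in directly follows the two function pointers. On a
// typical 64-bit target this is offset 16.
static_assert(offsetof(CounterFrame, promise_current) ==
                  2 * sizeof(void (*)(CounterFrame*)),
              "promise is expected right after the two function pointers");

// "Resume" part: continues after a suspension point, using only state that
// was stored in the frame, then suspends again by returning.
void counter_resume(CounterFrame* f) {
    ++f->a;                     // the local lives in the frame, not on the stack
    f->promise_current = f->a;  // models `co_yield a`
    ++f->suspend_index;
}

// "Destroy" part: releases the frame.
void counter_destroy(CounterFrame* f) { delete f; }

// "Ramp" part: allocates and initializes the frame, runs up to the first
// suspension point, and hands the frame back to the caller.
CounterFrame* counter_ramp(int start) {
    return new CounterFrame{&counter_resume, &counter_destroy, start, 0, start};
}

// Drives the model: reads the first value, resumes twice, then destroys.
std::vector<int> run_counter_demo() {
    std::vector<int> seen;
    CounterFrame* f = counter_ramp(10);
    seen.push_back(f->promise_current);
    for (int i = 0; i < 2; ++i) {
        f->resume_fn(f);
        seen.push_back(f->promise_current);
    }
    f->destroy_fn(f);
    return seen;
}
```

In this model, ``a`` no longer exists as a stack variable once the function is split, which is why a debugger can only find it if the debug information points into the frame.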
610+
611+
Afterwards, a couple of additional optimizations are applied before code
612+
gets emitted, but none of them are particularly relevant to debugging
613+
information.
614+
615+
For more details on the IR representation of coroutines and the relevant
616+
optimization passes, see `Coroutines in LLVM <https://llvm.org/docs/Coroutines.html>`_.
617+
618+
Emitting debug information inside ``CoroSplit`` forces us to generate
619+
incomplete debugging information. Usually, the compiler generates debug
620+
information in the frontend, as debug information is highly language-specific.
621+
However, this is not possible for coroutine frames because the frames are
622+
constructed in the LLVM middle-end.
623+
624+
To mitigate this problem, the LLVM middle end attempts to generate some debug
625+
information, which is unfortunately incomplete, since much of the
626+
language-specific information is missing in the middle end.
627+
592628
Devirtualization of coroutine handles
593629
-------------------------------------
594630

@@ -650,11 +686,7 @@ clang / LLVM usually use variables like ``__int_32_0`` to represent this
650686
optimized storage. Those values usually do not directly correspond to variables
651687
in the source code.
652688

653-
For example, when compiling the following program, the compiler creates a
654-
single entry ``__int_32_0`` in the coroutine state. Intuitively, one might
655-
assume that ``__int_32_0`` represents the value of the local variable ``a``.
656-
However, inspecting ``__int_32_0`` in the debugger while single-stepping will
657-
show the following values:
689+
When compiling the program
658690

659691
.. code-block:: c++
660692

@@ -674,10 +706,16 @@ show the following values:
674706
std::cout << a << "\n";
675707
}
676708

677-
The value of ``__int_32_0`` seemingly does not change, despite being frequently
678-
incremented. While this might be surprising, this is a result of the optimizer
679-
recognizing that it can eliminate most of the load/store operations. The above
680-
code gets optimized to the equivalent of:
709+
clang creates a single entry ``__int_32_0`` in the coroutine state.
710+
711+
Intuitively, one might assume that ``__int_32_0`` represents the value of the
712+
local variable ``a``. However, inspecting ``__int_32_0`` in the debugger while
713+
single-stepping will reveal that the value of ``__int_32_0`` stays constant,
714+
despite ``a`` being frequently incremented.
715+
716+
While this might be surprising, it is a result of the optimizer recognizing
717+
that it can eliminate most of the load/store operations.
718+
The above code gets optimized to the equivalent of:
681719

682720
.. code-block:: c++
683721
