Backtrace through signal handlers on alpine #698

Open
r1viollet opened this issue Jan 24, 2025 · 13 comments

Comments

@r1viollet

Description

While using the backtrace::resolve_frame_unsynchronized API, it seems we do not unwind through signal handlers.
Here is what I get when using the API:

Starting backtrace-rs unwinding...
Frame: IP: 0x58951a886bc6, Function: backtrace::backtrace::libunwind::trace::hd9a0af93696308ae, File: "/root/.cargo/git/checkouts/backtrace-rs-fb1f822361417489/f8cc6ac/src/backtrace/libunwind.rs", Line: 116
Frame: IP: 0x58951a886bc6, Function: backtrace::backtrace::trace_unsynchronized::hac1bf9acb19bced3, File: "/root/.cargo/git/checkouts/backtrace-rs-fb1f822361417489/f8cc6ac/src/backtrace/mod.rs", Line: 66
Frame: IP: 0x58951a887eb6, Function: unwind_example::unwind_with_backtrace::h8ed6b6267026e92f, File: "/opt/libdatadog/examples/rust/src/main.rs", Line: 101
Frame: IP: 0x58951a888034, Function: unwind_example::crash_handler::hd3592dcebb8db214, File: "/opt/libdatadog/examples/rust/src/main.rs", Line: 145
Frame: IP: 0x754eafdcf5a4, Function: sigwaitinfo, File: "/home/buildozer/aports/main/musl/src/musl-1.2.5/src/signal/x86_64/restore.s", Line: 1

As you can see, I get stuck on sigwaitinfo and do not go up to the main function.
I wrote a small comparison with libunwind, where I successfully unwind through the signal handler and reach the main function.

The issue could be within the miri API. I did not dive into what was done within this function.

Reproducer

Here is the source code for my example. I ran this on an Alpine image on which I installed libunwind. It might be difficult to run. Please reach out if you want more precise instructions.

LD_PRELOAD=/usr/lib/libgcc_s.so ./target/debug/unwind_example

Thanks for considering 🙇

@workingjubilee
Member

Please describe exactly how you built the code. Exactly which rustc, installed from where, using what build configuration?

@r1viollet
Author

r1viollet commented Jan 27, 2025

Here is a dockerfile using a recent alpine.

ARG BASE_IMAGE="alpine:3.21.2"
FROM ${BASE_IMAGE} AS base

RUN apk update \
  && apk add --no-cache \
    build-base \
    cargo \
    cmake \
    curl \
    git \
    make \
    patchelf \
    protoc \
    pkgconf \
    unzip \
    bash \
  && mkdir /usr/local/src

# Install libunwind
RUN apk add automake autoconf libtool
RUN wget https://github.com/libunwind/libunwind/releases/download/v1.8.1/libunwind-1.8.1.tar.gz \
  && tar -xvf libunwind-1.8.1.tar.gz \
  && cd libunwind-1.8.1 \
  && autoreconf -i ./ \
  && ./configure CFLAGS="-g -O3" --disable-tests \
  && make -j 8 \
  && make install

Then you can copy the example I mentioned above and build it (nothing special here).

cargo build --target-dir target-alpine

And then I had to run it preloading libgcc_s. Somehow the build script was not linking libgcc_s.

LD_PRELOAD=/usr/lib/libgcc_s.so ./target-alpine/debug/unwind_example

If I understand correctly, both use libunwind. What version / distribution of libunwind is packaged into backtrace-rs?

@r1viollet
Author

This was also visible on older alpine versions, using 0.3.74. The example uses 0.3.75, as can be seen in the example cargo files:

backtrace = { git = "https://github.com/rust-lang/backtrace-rs", tag = "0.3.75" }

@r1viollet
Author

r1viollet commented Jan 27, 2025

I wonder if the difference could be caused by the fact that libunwind is pulled from Ubuntu, whereas I recompile libunwind on Alpine.

@bjorn3
Member

bjorn3 commented Jan 27, 2025

What does ldd target-alpine/debug/unwind_example show? Does it link against libunwind or libgcc_s? The former should be used when producing a statically linked executable while the latter should be used when producing a dynamically linked executable, but I don't know how exactly Alpine patches rustc. Just that they patch it to do dynamic linking by default for the musl targets.

@r1viollet
Author

So I had to split the examples as I had mixed C unwinding with libunwind along with backtrace-rs unwinding. New examples are here.

With the new setup, we have separate binaries that have backtrace unwinding on one side:

cargo build --bin unwind_backtrace --target-dir ./target-alpine-full --verbose
ldd ./target-alpine-full/debug/unwind_backtrace
	/lib/ld-musl-x86_64.so.1 (0x7d778c4fd000)
	libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x7d778c361000)
	libc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1 (0x7d778c4fd000)

libunwind is statically linked, but I still get a dynamic link to libgcc.

And I compile the C unwinding by dynamically linking to libunwind (I had to adjust the build script in the example above, un-commenting the lines that do the linking to libunwind):

cargo build --bin unwind_c --target-dir ./target-alpine-full --verbose
ldd ./target-alpine-full/debug/unwind_c
	/lib/ld-musl-x86_64.so.1 (0x71b106053000)
	libunwind.so.8 => /usr/local/lib/libunwind.so.8 (0x71b105fdc000)
	libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x71b105fb0000)
	libc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1 (0x71b106053000)

As mentioned previously, using libunwind manually, I am able to get from the crash up to main (even if the symbols are not great):

./target-alpine-full/debug/unwind_c
Running unwind_c example...
Crash detected! Unwinding stack...
Function: _ZN8unwind_c13crash_handler13crash_handler17hdf05f74edd9802c0E+0x5e
Function: <unknown>
Function: _ZN3std2rt10lang_start17h6d13f624c8f892faE+0x3a
Function: main+0x1e
Function: <unknown>
Function: <unknown>
Function: <unknown>
Function: <unknown>

@bjorn3
Member

bjorn3 commented Jan 28, 2025

Why are you trying to link against libunwind manually when rustc already links your program against libgcc_s? libunwind and libgcc_s are both exporting the same _Unwind_* symbols. There is no guarantee that the dynamic linker will not mix symbols provided by both shared libraries, which would cause UB.

@r1viollet
Author

So I think I confused you with my setup.

  • unwind_c: I do things manually to compare the unwinding, I link manually (and I do not need the backtrace-rs dependency).
  • unwind_backtrace: I do not do anything manually. I just pull the backtrace-rs dependency. This is the example that shows we do not unwind up to main. And I think the setup makes sense.

@bjorn3
Member

bjorn3 commented Jan 28, 2025

At DataDog/libdatadog@474b67b#diff-ffb20d4b2b21cbbfd14872f5b864628343f71b89b55817525c08fad31ea3ae13R28-R29 you seem to try linking against both libunwind and libgcc_s. However, I just saw that at DataDog/libdatadog@474b67b#diff-ffb20d4b2b21cbbfd14872f5b864628343f71b89b55817525c08fad31ea3ae13R21 you are reading CARGO_BIN_NAME, which is not set for build scripts afaik. As such, I don't get how you are linking to libunwind. The backtrace crate doesn't instruct rustc to link against libunwind either. The decision to link libgcc_s or libunwind is made by the standard library, which should only pick one or the other.

@r1viollet
Author

r1viollet commented Jan 28, 2025

So unwind_c was essentially a way to investigate the issue. You can remove the build.rs (and everything else that relates to the C reproducer) to focus on the example with backtrace-rs. I was counting on cargo to do the build magic (setting CARGO_BIN_NAME); however, that did not happen, so I forced it manually. Here is my build command:

CARGO_BIN_NAME="unwind_c" cargo build --bin unwind_c --target-dir ./target-alpine-full --verbose

And for the backtrace example, we do not need the linker logic to apply:

cargo build --bin unwind_backtrace --target-dir ./target-alpine-full --verbose

libunwind requires libgcc_s from what I noticed building the C example.
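
The CARGO_BIN_NAME behaviour discussed above can be sketched as a build script. This is a hypothetical build.rs, not the one from the example repository: the helper name link_directives and the choice of linking `unwind` only for the unwind_c binary are assumptions made for illustration. Cargo sets CARGO_BIN_NAME when compiling a binary crate, not when running the build script, so the lookup returns nothing unless the variable is exported by hand as in the command above.

```rust
use std::env;

/// Hypothetical helper: returns the link directives the build script
/// should print for a given binary name.
fn link_directives(bin_name: Option<&str>) -> Vec<String> {
    match bin_name {
        // Only the C comparison binary links libunwind dynamically.
        Some("unwind_c") => vec!["cargo:rustc-link-lib=dylib=unwind".to_string()],
        // The backtrace-rs binary, or an unset variable: link nothing extra.
        _ => Vec::new(),
    }
}

fn main() {
    // Build scripts run once per package, so Cargo does not provide
    // CARGO_BIN_NAME here; it is only present if set manually.
    let bin_name = env::var("CARGO_BIN_NAME").ok();
    if bin_name.is_none() {
        println!("cargo:warning=CARGO_BIN_NAME is not set; linking no extra libraries");
    }
    for directive in link_directives(bin_name.as_deref()) {
        println!("{}", directive);
    }
}
```

With this shape, a plain `cargo build --bin unwind_c` would silently skip the libunwind link step, which matches the behaviour described in this comment.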

@r1viollet
Author

I think this issue explains the unwinding failure and has a good example using std functions.
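
For readers without access to the linked example, here is a minimal std-only sketch of the same kind of check. It is not the code from the linked issue. The names handler and capture_in_handler are made up, the libc functions are declared by hand to avoid an external crate, and the value of SIGUSR1 is assumed to be the Linux x86_64/aarch64 one. Capturing a backtrace inside a handler is not async-signal-safe; it is tolerable here only because the signal is raised synchronously from ordinary code.

```rust
use std::backtrace::Backtrace;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Mutex;

// Hand-written declarations; std already links against libc on Linux.
unsafe extern "C" {
    fn signal(signum: i32, handler: usize) -> usize;
    fn raise(signum: i32) -> i32;
}

// Assumption: value of SIGUSR1 on x86_64 and aarch64 Linux.
const SIGUSR1: i32 = 10;

static HANDLER_RAN: AtomicBool = AtomicBool::new(false);
static CAPTURED: Mutex<String> = Mutex::new(String::new());

extern "C" fn handler(_sig: i32) {
    // Not async-signal-safe (allocates and locks); demo only.
    let trace = Backtrace::force_capture().to_string();
    *CAPTURED.lock().unwrap() = trace;
    HANDLER_RAN.store(true, Ordering::SeqCst);
}

/// Installs the handler, raises the signal on the current thread, and
/// returns the backtrace text captured inside the handler.
fn capture_in_handler() -> String {
    unsafe {
        signal(SIGUSR1, handler as extern "C" fn(i32) as usize);
        raise(SIGUSR1);
    }
    CAPTURED.lock().unwrap().clone()
}

fn main() {
    let trace = capture_in_handler();
    println!("{}", trace);
    // If unwinding crosses the signal frame, the function that called
    // raise() shows up in the trace; if not, the trace stops near the
    // libc signal trampoline.
    println!(
        "crossed signal frame: {}",
        trace.contains("capture_in_handler")
    );
}
```

Comparing the output on a glibc system and on Alpine should show whether the trace stops at the signal trampoline.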

@r1viollet
Author

So the summary of the issue is that:

  • backtrace-rs does not explicitly use libunwind APIs (hence we do not have access to the step API).
    It would be hard to adjust the unwinding behaviour from backtrace-rs.
  • No CFI (call frame information) is provided to get through the signal handler on musl, which causes the failure. The issue mentioned above is relevant to this.
  • Using libunwind to manually unwind, we can get through the signal handler. There is a cool article here about this. It mentions the Frame Pointer fallback.
    sidenote: colleagues mentioned that with libunwind, with the step function we can also use the context to force the unwinding from the signal's context.

Does all of this sound reasonable?

@r1viollet
Author

Using the context from the signal handler is indeed the correct solution. I get correct unwinding (obviously without the signal frames).

USE_CONTEXT=1 ./target-alpine-full/debug/unwind_c

I'll leave the example if this is of interest.
I do not know if you would be interested in exposing backtrace capabilities from within signals.

3 participants