LTO build of rpcs3 fails #483

Open
oltolm opened this issue Feb 26, 2025 · 4 comments
@oltolm

oltolm commented Feb 26, 2025

Hello, I use clang 19.1.7 from MSYS2 clang64. https://github.com/RPCS3/rpcs3 recently enabled LTO, and the release build now fails with:

[build] ld.lld: error: undefined symbol: thread-local initialization routine for perf_stat<19226358023673171ull>::g_tls_perf_stat
[build] >>> referenced by C:/src/rpcs3/rpcs3/Emu/CPU/CPUThread.cpp
[build] >>>               librpcs3_emu.a(CPUThread.cpp.obj)
[build] >>> referenced by C:/src/rpcs3/rpcs3/Emu/Memory/vm.cpp
[build] >>>               librpcs3_emu.a(vm.cpp.obj)
[build] 
@mstorsjo
Owner

Hello I use clang 19.1.7 from MSYS2 clang64

Technically, this repo is for the llvm-mingw packaging, and for upstream Clang issues the relevant place is https://github.com/llvm/llvm-project/issues - but I guess the issue can be reproduced with plain llvm-mingw too.

In any case - in order to be able to dig further into it, can the issue be reduced down to a minimal self-contained testcase?

@oltolm
Author

oltolm commented Mar 3, 2025

I reduced it to the following three files.

CMakeLists.txt:

cmake_minimum_required(VERSION 3.21)
project(lto VERSION 0.0.1 LANGUAGES C CXX)

add_executable(myexe src/myexe.cpp)
target_compile_features(myexe PRIVATE cxx_std_20)
target_link_options(myexe PRIVATE -municode)

set_property(TARGET myexe PROPERTY INTERPROCEDURAL_OPTIMIZATION TRUE)

myexe.cpp:

#include "perf_meter.hpp"

int wmain(int argc, wchar_t** argv)
{
    perf_meter<"DMA"_u32> perf_;
    return 0;
}

perf_meter.hpp:

template <auto ShortName> class perf_stat final {
    static inline thread_local struct perf_stat_local {
        // Local non-atomic values for increments
        unsigned m_log[66]{};

        perf_stat_local() noexcept {}

        ~perf_stat_local() {}

    } g_tls_perf_stat;

  public:
    static void push(unsigned start_time) noexcept { (void)g_tls_perf_stat; }
};

template <auto ShortName, auto... SubEvents> class perf_meter {
  public:
    ~perf_meter() { perf_stat<ShortName>::push(0); }
};

constexpr unsigned operator""_u32(const char* s, unsigned long long /*length*/)
{
    return 0;
}

@mstorsjo
Owner

mstorsjo commented Mar 9, 2025

Thanks; I've reproduced the issue.

I've further reduced the issue down to this:

struct tlsdata {
    int val;
    ~tlsdata() {}
};
inline thread_local struct tlsdata tlsvar;

int main(int argc, char** argv)
{
    (void)tlsvar;
    return 0;
}

It can be compiled with one single command without needing cmake, just x86_64-w64-mingw32-clang++ myexe.cpp -flto.

The inline on the tlsvar definition is essential - without it, this works fine.
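
For comparison, here is a minimal sketch of the non-inline variant that the observation above says links fine. The accessor `read_tlsvar` is a hypothetical helper added only so the variable is actually used; it is not part of the original testcase. Whether the same change is workable for the templated static member in the earlier rpcs3 reduction is not established here.

```cpp
struct tlsdata {
    int val;
    ~tlsdata() {}  // non-trivial destructor, as in the reduced testcase
};

// No `inline` on the definition: per the observation above, this form
// did not trigger the undefined-symbol error under -flto.
thread_local tlsdata tlsvar;

// Hypothetical accessor (not in the original testcase) that touches the
// thread-local variable so the compiler has to emit an access to it.
int read_tlsvar()
{
    return tlsvar.val;  // thread-local storage is zero-initialized first
}
```

Calling `read_tlsvar()` from `main` stands in for the `(void)tlsvar;` statement in the original testcase.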

@mstorsjo
Owner

mstorsjo commented Mar 9, 2025

I understand what's happening now, but I don't yet know what to do about it.

CC @cjacek who has some experience with the LLD internals.

When doing LTO linking, we load the LTO objects, which essentially are LLVM IR. Based on the symbols in the IR, we populate the LLD symbol table (even though we don't have actual COFF object files for the symbols yet) and do the linking as usual; afterwards, we replace the placeholder symbols with the actual ones output by the LTO compilation. (This step is very tricky in cases where the compilation output doesn't contain exactly the same symbols as the LLVM IR did, e.g. if the compiler adds references to symbols that weren't visible on the LLVM IR level.)

In this case, we end up here in the linker: https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0/lld/COFF/InputFiles.cpp#L1338-L1343, where we try to set up a weak alias towards the target symbol.

However, the IR from the compiled object (x86_64-w64-mingw32-clang++ myexe.cpp -flto -S -o -) looks like this:

@_ZTH6tlsvar = linkonce_odr dso_local alias void (), ptr @__tls_init
define internal void @__tls_init() #4 {
...
}

We have _ZTH6tlsvar as a weak alias, pointing at __tls_init. The unusual thing, however, is that __tls_init isn't an external symbol, so when we try to look up __tls_init in the symbol table, we find nothing. And since _ZTH6tlsvar is a weak alias whose target function is unresolvable, _ZTH6tlsvar itself ends up essentially unresolved.

When such things are compiled to a regular object file before linking, the LLVM MC layer generates a regular extern symbol for the target of the weak alias, named e.g. .weak._ZTH6tlsvar.default.main in this case.
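
The lookup failure described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToySymbolTable`, `IrSymbol`, `Linkage` and `weakAliasResolves` are hypothetical names invented for illustration and do not correspond to LLD's actual classes or to the code at the linked InputFiles.cpp lines. It only captures the idea that a name-based table sees external symbols, so an alias whose target has internal linkage cannot be resolved by name.

```cpp
#include <string>
#include <unordered_set>

// Toy model only: hypothetical types, not LLD's real data structures.
enum class Linkage { External, Internal };

struct IrSymbol {
    std::string name;
    Linkage linkage;
};

class ToySymbolTable {
  public:
    // Assumption of this model: only external symbols are entered by name;
    // internal ones (like __tls_init in the IR above) are not visible.
    void addFromIr(const IrSymbol& sym)
    {
        if (sym.linkage == Linkage::External)
            defined_.insert(sym.name);
    }

    bool isDefined(const std::string& name) const
    {
        return defined_.count(name) != 0;
    }

  private:
    std::unordered_set<std::string> defined_;
};

// In this model a weak alias is usable only if its target is found by name.
bool weakAliasResolves(const ToySymbolTable& table, const std::string& target)
{
    return table.isDefined(target);
}
```

In this model, registering `__tls_init` with `Linkage::Internal` leaves the alias unresolved, while registering an external target such as `.weak._ZTH6tlsvar.default.main` (the name the MC layer generates in the non-LTO case) lets the lookup succeed.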

I'm not entirely sure what the right fix is here though. Perhaps the code in https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0/lld/COFF/InputFiles.cpp#L1338-L1343 would need to look at the target symbol, and see that if it isn't extern, we'd need to treat the alias differently, somehow.
