Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] __cxa_get_globals() does not use thread_local and __cxa_thread_atexit() is not called #1200

Closed
pharscoet opened this issue Mar 3, 2020 · 5 comments
Assignees
Labels
Milestone

Comments

@pharscoet
Copy link

pharscoet commented Mar 3, 2020

I am having this issue with a dynamically loaded library that uses the static STL (libc++_static.a).

When code from another library calls code in the loaded library that in turn calls the STL __cxa_get_globals() function (like when throwing an exception), and the loaded library and its STL is unloaded before the thread exits, the process crashes.

Here is code to reproduce this issue:
testlib.cpp

extern "C" void func() {
    try {
        throw 0;
    } catch (...) {
    }
}

test.cpp

#include <dlfcn.h>
#include <thread>

void myThread() {
    void* lib = dlopen("./libtestlib.so", RTLD_LAZY);
    auto func = reinterpret_cast<void (*)()>(dlsym(lib, "func"));
    func();
    dlclose(lib);
}

int main(int, char**) {
    std::thread t(myThread);
    t.join();
}

and the crash trace:

03-02 17:40:06.280 32676 32676 F DEBUG   : backtrace:
03-02 17:40:06.280 32676 32676 F DEBUG   :       #00 pc 0000007a4fd19f04  <unknown>
03-02 17:40:06.280 32676 32676 F DEBUG   :       #01 pc 00000000000e75e4  /apex/com.android.runtime/lib64/bionic/libc.so (pthread_key_clean_all()+124) (BuildId: 55ce0a7d78144b0290f9746ed1615719)
03-02 17:40:06.281 32676 32676 F DEBUG   :       #02 pc 00000000000e7038  /apex/com.android.runtime/lib64/bionic/libc.so (pthread_exit+72) (BuildId: 55ce0a7d78144b0290f9746ed1615719)
03-02 17:40:06.281 32676 32676 F DEBUG   :       #03 pc 00000000000e6f14  /apex/com.android.runtime/lib64/bionic/libc.so (__pthread_start(void*)+40) (BuildId: 55ce0a7d78144b0290f9746ed1615719)
03-02 17:40:06.281 32676 32676 F DEBUG   :       #04 pc 00000000000850c8  /apex/com.android.runtime/lib64/bionic/libc.so (__start_thread+64) (BuildId: 55ce0a7d78144b0290f9746ed1615719)

I tracked down the issue to the implementation of __cxa_get_globals() in cxa_exception_storage.cpp. The STL is compiled with HAS_THREAD_LOCAL undefined (shouldn't it be defined?) and the fallback implementation for __cxa_get_globals() that is used in that case installs a pthread key with a destructor directly and without calling __cxa_thread_atexit(). Since it doesn't, the library gets unloaded before the thread exits with the code above and when the thread does exit, the now gone pthread key destructor is then called and the crash happens.

Recompiling the STL with HAS_THREAD_LOCAL defined uses the thread_local implementation of __cxa_get_globals() and thus __cxa_thread_atexit() is called behind the scenes and the crash does not happen and the library is unloaded after the thread exits.

Environment Details

  • NDK Version: 21.0.6113669
  • Build system: ndk-build
  • Host OS: Linux
  • ABI: arm64-v8a (also reproduced on x86)
  • NDK API level: 28
  • Device API level: 28
@pharscoet pharscoet added the bug label Mar 3, 2020
@DanAlbert
Copy link
Member

The STL is compiled with HAS_THREAD_LOCAL undefined (shouldn't it be defined?)

My first guess was that this was because bionic doesn't have __cxa_thread_atexit_impl until newer releases and that's was the reason for this, but it appears to literally mean "do we support the thread_local keyword? That should probably be a __has_feature check...

I'm building/testing with that flag flipped right now. It looks like you're correct and it should Just Work.

This might also be somewhat related to #360 (comment), in which case there will be at least a corner of this bug that's unfixable for old releases.

Thanks for the detailed report and investigation, btw! Assuming this fix does work, I'll get your test case added to our regression suite.

@DanAlbert DanAlbert added this to the r22 milestone Apr 9, 2020
@DanAlbert DanAlbert self-assigned this Apr 9, 2020
@pharscoet
Copy link
Author

Thanks @DanAlbert for looking into it and adding to the r22 milestone.

Issue #360 does look related.

Since newer bionic releases delay the library unloading while there are pending thread_local destructors, the issue does not happen if the STL itself uses thread_local for __cxa_get_globals(). Older releases would still have the problem yes.

@DanAlbert
Copy link
Member

I did get as far as running all our existing tests with that flag set and encountered no issues. Still need to verify your test case though.

@DanAlbert
Copy link
Member

Confirmed that your test case fails with r21 (not that I didn't believe you, just wanted to make sure I'd integrated it into our test suite right!) and that it passes with the change you suggested.

https://android-review.googlesource.com/c/platform/ndk/+/1285595
https://android-review.googlesource.com/c/toolchain/llvm-project/+/1285596

Thanks again for making this easy on us :)

@pharscoet
Copy link
Author

Glad I could contribute something useful :-)

Benau added a commit to Benau/stk-code that referenced this issue Oct 4, 2021
Benau added a commit to Benau/stk-code that referenced this issue Oct 4, 2021
smeenai added a commit to llvm/llvm-project that referenced this issue Nov 22, 2022
This was previously guarded by HAS_THREAD_LOCAL, which was never set by
CMake and had to be specified manually. Android has been setting this to
solve android/ndk#1200 [1], but every compiler
and platform libc++abi supports should have thread_local by now, so we
can just get rid of the fallback implementation and simplify things
significantly (including removing the now unused fallback calloc).

[1] https://android-review.googlesource.com/c/toolchain/llvm-project/+/1285596

Reviewed By: #libc_abi, MaskRay, ldionne

Differential Revision: https://reviews.llvm.org/D138461
kongy pushed a commit to kongy/llvm_android that referenced this issue Jun 2, 2023
Bug: android/ndk#1200
Test: None
Change-Id: Ib37c85f2e99a1869206a0dcb6542dc0f386bba73
ldionne added a commit to ldionne/llvm-project that referenced this issue Jan 29, 2024
This was previously guarded by HAS_THREAD_LOCAL, which was never set by
CMake and had to be specified manually. Android has been setting this to
solve android/ndk#1200 [1], but every compiler
and platform libc++abi supports should have thread_local by now, so we
can just get rid of the fallback implementation and simplify things
significantly (including removing the now unused fallback calloc).

This is a re-application of https://reviews.llvm.org/D138461.
Fixes llvm#78207.

[1]: https://android-review.googlesource.com/c/toolchain/llvm-project/+/1285596

Co-Authored-By: Shoaib Meenai <smeenai@fb.com>
ldionne added a commit to ldionne/llvm-project that referenced this issue Jan 30, 2024
This was previously guarded by HAS_THREAD_LOCAL, which was never set by
CMake and had to be specified manually. Android has been setting this to
solve android/ndk#1200 [1], but every compiler
and platform libc++abi supports should have thread_local by now, so we
can just get rid of the fallback implementation and simplify things
significantly (including removing the now unused fallback calloc).

This is a re-application of https://reviews.llvm.org/D138461.
Fixes llvm#78207.

[1]: https://android-review.googlesource.com/c/toolchain/llvm-project/+/1285596

Co-Authored-By: Shoaib Meenai <smeenai@fb.com>
ldionne added a commit to ldionne/llvm-project that referenced this issue Feb 5, 2024
This was previously guarded by HAS_THREAD_LOCAL, which was never set by
CMake and had to be specified manually. Android has been setting this to
solve android/ndk#1200 [1], but every compiler
and platform libc++abi supports should have thread_local by now, so we
can just get rid of the fallback implementation and simplify things
significantly (including removing the now unused fallback calloc).

This is a re-application of https://reviews.llvm.org/D138461.
Fixes llvm#78207.

[1]: https://android-review.googlesource.com/c/toolchain/llvm-project/+/1285596

Co-Authored-By: Shoaib Meenai <smeenai@fb.com>
ldionne added a commit to ldionne/llvm-project that referenced this issue Jun 21, 2024
This was previously guarded by HAS_THREAD_LOCAL, which was never set by
CMake and had to be specified manually. Android has been setting this to
solve android/ndk#1200 [1], but every compiler
and platform libc++abi supports should have thread_local by now, so we
can just get rid of the fallback implementation and simplify things
significantly (including removing the now unused fallback calloc).

This is a re-application of https://reviews.llvm.org/D138461.
Fixes llvm#78207.

[1]: https://android-review.googlesource.com/c/toolchain/llvm-project/+/1285596

Co-Authored-By: Shoaib Meenai <smeenai@fb.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2 participants