Release build aborts with "active exception in flight" on GCC + boost.context ≥ 1.88 (manage_exception_state vs userver's __cxa_get_globals interposition)
Summary
On a toolchain with boost.context ≥ 1.88.0 and libstdc++ (GCC), userver-core-unittest
(RelWithDebInfo) aborts non-deterministically with:
Unable to start coroutine engine with an active exception in flight
Root cause: boost.context 1.88 introduced detail::manage_exception_state, which
saves/restores *__cxa_get_globals() around every fiber resume()/resume_with()
(and, since ~1.91, also in ~fiber()). userver interposes __cxa_get_globals to keep
per-coroutine C++ exception state. The two mechanisms now both manage the same
__cxa_eh_globals, and when the current task context changes inside boost's
save/restore window, the thread's uncaughtExceptions counter underflows to -1, which
later trips the std::uncaught_exceptions() != 0 guard in engine::RunStandalone.
A secondary problem makes the existing escape hatch (USERVER_FEATURE_UBOOST_CORO=ON)
not actually work: engine/coro/marked_allocator.hpp includes <boost/coroutine2/...>
directly instead of going through the <coroutines/coroutine.hpp> abstraction, so even
with USERVER_FEATURE_UBOOST_CORO=ON the system boost headers (with the real
manage_exception_state) are compiled in.
Environment
- OS: Manjaro Linux (x86_64), kernel 6.18
- Compiler: GCC 16.1.1, libstdc++ 6.0.35
- boost: system 1.91.0 (reproduces on any boost.context ≥ 1.88.0)
- userver: develop (release 3.0)
- Build type: RelWithDebInfo (release-only; Debug is unaffected)
Symptom
The guard in core/src/engine/run_standalone.cpp:
if (std::uncaught_exceptions() != 0) {
// We are probably inside a destructor, UINVARIANT would `std::terminate`.
utils::AbortWithStacktrace("Unable to start coroutine engine with an active exception in flight");
}
fires on a "random" test — typically the first RunStandalone after a heavy suite. Example:
[ RUN ] NWayLRU.Ctr
Unable to start coroutine engine with an active exception in flight. Stacktrace:
0# userver::utils::AbortWithStacktrace(std::basic_string_view<...>)
1# userver::engine::RunStandalone(unsigned long, ...)
2# userver::utest::impl::DoRunTest(...)
3# userver::NWayLRU_Ctr_Test::TestBody()
A single test per process passes; the failure only appears with accumulation, which is why
it looks flaky.
Root cause (evidence)
1. It is an underflow, not a leak
At the abort, on the main (non-coroutine) thread:
GetCurrentTaskContextUnchecked() == nullptr # genuine bare-thread tls_globals path
*(int*)(__cxa_get_globals() + offsetof(uncaughtExceptions)) == -1 # 0xffffffff
caughtExceptions == 0
So a __cxa_begin_catch decremented uncaughtExceptions without a matching __cxa_throw
increment on the same __cxa_eh_globals instance.
2. boost.context 1.88 manage_exception_state
/usr/include/boost/context/fiber_fcontext.hpp (libstdc++ branch):
class manage_exception_state {
public:
manage_exception_state() { exception_state_ = *__cxa_get_globals(); } // SAVE
~manage_exception_state() { *__cxa_get_globals() = exception_state_; } // RESTORE
private:
__cxa_eh_globals exception_state_;
};
used on every switch:
fiber resume() && {
detail::manage_exception_state exstate; // SAVE
return { detail::jump_fcontext( ... ) }; // switch fiber
} // RESTORE
This assumes that __cxa_get_globals() returns thread-stable storage, i.e. the same __cxa_eh_globals instance at SAVE and at RESTORE.
3. userver's interposition makes it non-stable
core/src/engine/task/cxxabi_eh_globals.cpp (USERVER_EHGLOBALS_INTERPOSE):
abi::__cxa_eh_globals* GetGlobals() throw() {
constinit thread_local EhGlobals tls_globals;
auto* globals = &tls_globals;
auto* context = current_task::GetCurrentTaskContextUnchecked();
if (context) globals = context->GetEhGlobals(); // <-- per-coroutine, changes with context
return reinterpret_cast<abi::__cxa_eh_globals*>(globals);
}
When the current task context changes between boost's SAVE and RESTORE (e.g. a re-entrant
destructor during coroutine teardown — CoroFunc even notes "dtors may want to schedule"),
the __cxa_throw (+1) and the __cxa_begin_catch (-1) of boost's forced_unwind land in
different __cxa_eh_globals instances, leaving the thread counter at -1.
gdb watchpoint on the main thread's uncaughtExceptions shows exactly this — a
__cxa_begin_catch from boost::context::detail::fiber_entry (fiber_fcontext.hpp:147)
taking it 0 -> -1, after a forced_unwind whose throw landed elsewhere.
4. Version matrix (verified by rebuilding userver-core-unittest against each)
manage_exception_state was introduced in boost.context 1.88.0 (absent in
1.74/1.83/1.86/1.87, present in 1.88/1.89/1.91).
| boost.context | manage_exception_state | Full userver-core-unittest (GCC, RelWithDebInfo) |
| --- | --- | --- |
| ≤ 1.87 | none | ✅ exit 0, 1889 passed, 0 abort/segv |
| 1.88.0 | resume()/resume_with() only | ❌ deterministic abort at MutexDeathTest.SelfDeadlock (3/3) |
| 1.91 | + also ~fiber() | ❌ non-deterministic abort (~NWayLRU.Ctr) |
(≤1.87 was tested by feeding the build a fiber_fcontext.hpp with manage_exception_state
forced to the empty dummy struct — the only difference between 1.87 and 1.88.)
Minimal standalone reproduction
~70 lines, no userver, just boost.context ≥ 1.88 + a userver-style __cxa_get_globals
interposition. Prints a corrupted uncaughtExceptions when the "current context" changes
inside boost's manage_exception_state window:
#include <cxxabi.h>
#include <cstdio>
#include <cstring>
#include <utility>
#include <boost/context/fiber.hpp>
namespace ctx = boost::context;
struct EhGlobals { void* data[4] = {}; };
thread_local EhGlobals tls_globals;
thread_local EhGlobals* current_ctx = nullptr;
static EhGlobals* CurrentEh() { return current_ctx ? current_ctx : &tls_globals; }
extern "C" {
abi::__cxa_eh_globals* __cxa_get_globals() throw() { return reinterpret_cast<abi::__cxa_eh_globals*>(CurrentEh()); }
abi::__cxa_eh_globals* __cxa_get_globals_fast() throw() { return reinterpret_cast<abi::__cxa_eh_globals*>(CurrentEh()); }
}
static int Uncaught(const EhGlobals& g) { int v; std::memcpy(&v, (const char*)&g + 8, 4); return v; }
struct UnwindContextShift { EhGlobals* to; ~UnwindContextShift() { current_ctx = to; } };
int main() {
static EhGlobals coro_eh;
{
ctx::fiber f{[&](ctx::fiber&& m) {
UnwindContextShift guard{&coro_eh}; // flips current ctx during forced_unwind
current_ctx = nullptr;
m = std::move(m).resume();
return std::move(m);
}};
f = std::move(f).resume(); // run to first suspend
current_ctx = nullptr;
// fiber destroyed here -> manage_exception_state SAVE / forced_unwind / RESTORE
}
std::printf("tls.uncaught=%d coro.uncaught=%d => %s\n",
Uncaught(tls_globals), Uncaught(coro_eh),
(Uncaught(tls_globals)==0 && Uncaught(coro_eh)==0) ? "OK" : "CORRUPTED");
}
$ g++ -O2 -std=c++17 repro.cpp -o repro -lboost_context && ./repro
tls.uncaught=0 coro.uncaught=-1 => CORRUPTED # boost >= 1.88
# (OK on boost <= 1.87)
Why USERVER_FEATURE_UBOOST_CORO=ON does not fix it as-is
The vendored third_party/uboost_coro already neutralizes manage_exception_state
(uboost_coro/context/fiber_fcontext.hpp:67 is committed as #if 1 || ..., i.e. always the
dummy struct — commit 823a03770 "update boost ... to 1.88"). Good.
But core/src/engine/coro/marked_allocator.hpp bypasses the
<coroutines/coroutine.hpp> abstraction:
// core/src/engine/coro/marked_allocator.hpp
#include <boost/coroutine2/protected_fixedsize_stack.hpp> // <-- direct, not the abstraction
Under USERVER_FEATURE_UBOOST_CORO=ON, core/uboost_coro/include only provides
coroutines/coroutine.hpp (→ uboost_coro/coroutine2/...); there is no boost/-named shim.
So this direct <boost/coroutine2/...> falls through to system /usr/include/boost
(boost 1.91, real manage_exception_state). Since pool.hpp pulls the coroutine type via
marked_allocator.hpp, the entire coroutine2 template code is compiled against system boost,
and the abort persists. Confirmed by preprocessing:
$ g++ <uboost build flags> -E core/src/engine/coro/marked_allocator.hpp | grep fiber_fcontext
# 1 "/usr/include/boost/context/fiber_fcontext.hpp" # <-- system, not vendored
Proposed fix
Route marked_allocator.hpp through the coroutine abstraction so
USERVER_FEATURE_UBOOST_CORO=ON truly uses the vendored copy (both sys_coro and
uboost_coro variants of <coroutines/coroutine.hpp> already provide
boost::coroutines2::protected_fixedsize_stack):
// core/src/engine/coro/marked_allocator.hpp
-#include <boost/coroutine2/protected_fixedsize_stack.hpp>
+#include <coroutines/coroutine.hpp>
With this change + USERVER_FEATURE_UBOOST_CORO=ON, the full suite is green on GCC 16 /
boost 1.91:
[ PASSED ] 1889 tests (0 aborts, 0 segv; reproduced across multiple runs)
Notes / open questions
- This makes USERVER_FEATURE_UBOOST_CORO=ON a reliable workaround for boost.context ≥ 1.88
on libstdc++. System-boost builds (USERVER_FEATURE_UBOOST_CORO=OFF) with boost ≥ 1.88
are still affected — for those, the real fix is to make userver's interposition
cooperate with (or stand down for) boost's manage_exception_state, or to require
USERVER_FEATURE_UBOOST_CORO=ON / boost ≤ 1.87.
- Removing the interposition entirely and relying solely on boost's manage_exception_state
is not sufficient: it preserves the switching thread's globals but not each suspended
coroutine's own in-flight exception state, and segfaults on a normal coroutine resume
(pull_control_block_cc.ipp c = std::move(c).resume() from TaskContext::CoroFunc).
- clang + libc++ builds are unaffected (they use USERVER_EHGLOBALS_SWAP, a different
mechanism).