Skip to content

BNBeginUndoActions can cause a deadlock #6289

Open
@WeiN76LQh

Description

@WeiN76LQh

Version and Platform (required):

  • Binary Ninja Version: 4.3.6617-dev (28632797)
  • OS: macOS
  • OS Version: 15.1.1
  • CPU Architecture: M1

Bug Description:
BNBeginUndoActions can cause a deadlock by what appears to be, from the stack trace below, due to waiting on a condition variable:

__psynch_cvwait (@__psynch_cvwait:5)
_pthread_cond_wait (@_pthread_cond_wait:304)
std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&) (@std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&):10)
___lldb_unnamed_symbol22733 (@___lldb_unnamed_symbol22733:19)
___lldb_unnamed_symbol22736 (@___lldb_unnamed_symbol22736:102)
___lldb_unnamed_symbol14121 (@___lldb_unnamed_symbol14121:37)
BNBeginUndoActions (@BNBeginUndoActions:11)
BinaryNinja::FileMetadata::BeginUndoActions(bool)
BinaryNinja::BinaryView::BeginUndoActions(bool)
SharedCacheCore::SharedCache::FindSymbolAtAddrAndApplyToAddr(unsigned long long, unsigned long long, bool)
::BNDSCFindSymbolAtAddressAndApplyToAddress(BNSharedCache *, uint64_t, uint64_t, bool)
SharedCacheAPI::SharedCache::FindSymbolAtAddrAndApplyToAddr(unsigned long long, unsigned long long, bool) const
SharedCacheWorkflow::ProcessOffImageCall(BinaryNinja::Ref<BinaryNinja::AnalysisContext>, BinaryNinja::Ref<BinaryNinja::Function>, BinaryNinja::Ref<BinaryNinja::MediumLevelILFunction>, BinaryNinja::MediumLevelILInstruction, unsigned long, bool)::$_0::operator()() const
decltype(std::declval<SharedCacheWorkflow::ProcessOffImageCall(BinaryNinja::Ref<BinaryNinja::AnalysisContext>, BinaryNinja::Ref<BinaryNinja::Function>, BinaryNinja::Ref<BinaryNinja::MediumLevelILFunction>, BinaryNinja::MediumLevelILInstruction, unsigned long, bool)::$_0&>()()) std::__1::__invoke[abi:ne180100]<SharedCacheWorkflow::ProcessOffImageCall(BinaryNinja::Ref<BinaryNinja::AnalysisContext>, BinaryNinja::Ref<BinaryNinja::Function>, BinaryNinja::Ref<BinaryNinja::MediumLevelILFunction>, BinaryNinja::MediumLevelILInstruction, unsigned long, bool)::$_0&>(SharedCacheWorkflow::ProcessOffImageCall(BinaryNinja::Ref<BinaryNinja::AnalysisContext>, BinaryNinja::Ref<BinaryNinja::Function>, BinaryNinja::Ref<BinaryNinja::MediumLevelILFunction>, BinaryNinja::MediumLevelILInstruction, unsigned long, bool)::$_0&)
void std::__1::__invoke_void_return_wrapper<void, true>::__call[abi:ne180100]<SharedCacheWorkflow::ProcessOffImageCall(BinaryNinja::Ref<BinaryNinja::AnalysisContext>, BinaryNinja::Ref<BinaryNinja::Function>, BinaryNinja::Ref<BinaryNinja::MediumLevelILFunction>, BinaryNinja::MediumLevelILInstruction, unsigned long, bool)::$_0&>(SharedCacheWorkflow::ProcessOffImageCall(BinaryNinja::Ref<BinaryNinja::AnalysisContext>, BinaryNinja::Ref<BinaryNinja::Function>, BinaryNinja::Ref<BinaryNinja::MediumLevelILFunction>, BinaryNinja::MediumLevelILInstruction, unsigned long, bool)::$_0&)
std::__1::__function::__alloc_func<SharedCacheWorkflow::ProcessOffImageCall(BinaryNinja::Ref<BinaryNinja::AnalysisContext>, BinaryNinja::Ref<BinaryNinja::Function>, BinaryNinja::Ref<BinaryNinja::MediumLevelILFunction>, BinaryNinja::MediumLevelILInstruction, unsigned long, bool)::$_0, std::__1::allocator<SharedCacheWorkflow::ProcessOffImageCall(BinaryNinja::Ref<BinaryNinja::AnalysisContext>, BinaryNinja::Ref<BinaryNinja::Function>, BinaryNinja::Ref<BinaryNinja::MediumLevelILFunction>, BinaryNinja::MediumLevelILInstruction, unsigned long, bool)::$_0>, void ()>::operator()[abi:ne180100]()
std::__1::__function::__func<SharedCacheWorkflow::ProcessOffImageCall(BinaryNinja::Ref<BinaryNinja::AnalysisContext>, BinaryNinja::Ref<BinaryNinja::Function>, BinaryNinja::Ref<BinaryNinja::MediumLevelILFunction>, BinaryNinja::MediumLevelILInstruction, unsigned long, bool)::$_0, std::__1::allocator<SharedCacheWorkflow::ProcessOffImageCall(BinaryNinja::Ref<BinaryNinja::AnalysisContext>, BinaryNinja::Ref<BinaryNinja::Function>, BinaryNinja::Ref<BinaryNinja::MediumLevelILFunction>, BinaryNinja::MediumLevelILInstruction, unsigned long, bool)::$_0>, void ()>::operator()()
std::__1::__function::__value_func<void ()>::operator()[abi:ne180100]() const
std::__1::function<void ()>::operator()() const
WorkerActionCallback(void*)

There are other threads stuck in BNForgetUndoActions:

__psynch_cvwait (@__psynch_cvwait:5)
_pthread_cond_wait (@_pthread_cond_wait:304)
std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&) (@std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&):10)
___lldb_unnamed_symbol22733 (@___lldb_unnamed_symbol22733:19)
___lldb_unnamed_symbol22736 (@___lldb_unnamed_symbol22736:102)
___lldb_unnamed_symbol14125 (@___lldb_unnamed_symbol14125:73)
BNForgetUndoActions (@BNForgetUndoActions:38)
BinaryNinja::FileMetadata::ForgetUndoActions(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&)
BinaryNinja::BinaryView::ForgetUndoActions(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&)
SharedCacheCore::SharedCache::FindSymbolAtAddrAndApplyToAddr(unsigned long long, unsigned long long, bool)
::BNDSCFindSymbolAtAddressAndApplyToAddress(BNSharedCache *, uint64_t, uint64_t, bool)
SharedCacheAPI::SharedCache::FindSymbolAtAddrAndApplyToAddr(unsigned long long, unsigned long long, bool) const
SharedCacheWorkflow::ProcessOffImageCall(BinaryNinja::Ref<BinaryNinja::AnalysisContext>, BinaryNinja::Ref<BinaryNinja::Function>, BinaryNinja::Ref<BinaryNinja::MediumLevelILFunction>, BinaryNinja::MediumLevelILInstruction, unsigned long, bool)::$_0::operator()() const
decltype(std::declval<SharedCacheWorkflow::ProcessOffImageCall(BinaryNinja::Ref<BinaryNinja::AnalysisContext>, BinaryNinja::Ref<BinaryNinja::Function>, BinaryNinja::Ref<BinaryNinja::MediumLevelILFunction>, BinaryNinja::MediumLevelILInstruction, unsigned long, bool)::$_0&>()()) std::__1::__invoke[abi:ne180100]<SharedCacheWorkflow::ProcessOffImageCall(BinaryNinja::Ref<BinaryNinja::AnalysisContext>, BinaryNinja::Ref<BinaryNinja::Function>, BinaryNinja::Ref<BinaryNinja::MediumLevelILFunction>, BinaryNinja::MediumLevelILInstruction, unsigned long, bool)::$_0&>(SharedCacheWorkflow::ProcessOffImageCall(BinaryNinja::Ref<BinaryNinja::AnalysisContext>, BinaryNinja::Ref<BinaryNinja::Function>, BinaryNinja::Ref<BinaryNinja::MediumLevelILFunction>, BinaryNinja::MediumLevelILInstruction, unsigned long, bool)::$_0&)
void std::__1::__invoke_void_return_wrapper<void, true>::__call[abi:ne180100]<SharedCacheWorkflow::ProcessOffImageCall(BinaryNinja::Ref<BinaryNinja::AnalysisContext>, BinaryNinja::Ref<BinaryNinja::Function>, BinaryNinja::Ref<BinaryNinja::MediumLevelILFunction>, BinaryNinja::MediumLevelILInstruction, unsigned long, bool)::$_0&>(SharedCacheWorkflow::ProcessOffImageCall(BinaryNinja::Ref<BinaryNinja::AnalysisContext>, BinaryNinja::Ref<BinaryNinja::Function>, BinaryNinja::Ref<BinaryNinja::MediumLevelILFunction>, BinaryNinja::MediumLevelILInstruction, unsigned long, bool)::$_0&)
std::__1::__function::__alloc_func<SharedCacheWorkflow::ProcessOffImageCall(BinaryNinja::Ref<BinaryNinja::AnalysisContext>, BinaryNinja::Ref<BinaryNinja::Function>, BinaryNinja::Ref<BinaryNinja::MediumLevelILFunction>, BinaryNinja::MediumLevelILInstruction, unsigned long, bool)::$_0, std::__1::allocator<SharedCacheWorkflow::ProcessOffImageCall(BinaryNinja::Ref<BinaryNinja::AnalysisContext>, BinaryNinja::Ref<BinaryNinja::Function>, BinaryNinja::Ref<BinaryNinja::MediumLevelILFunction>, BinaryNinja::MediumLevelILInstruction, unsigned long, bool)::$_0>, void ()>::operator()[abi:ne180100]()
std::__1::__function::__func<SharedCacheWorkflow::ProcessOffImageCall(BinaryNinja::Ref<BinaryNinja::AnalysisContext>, BinaryNinja::Ref<BinaryNinja::Function>, BinaryNinja::Ref<BinaryNinja::MediumLevelILFunction>, BinaryNinja::MediumLevelILInstruction, unsigned long, bool)::$_0, std::__1::allocator<SharedCacheWorkflow::ProcessOffImageCall(BinaryNinja::Ref<BinaryNinja::AnalysisContext>, BinaryNinja::Ref<BinaryNinja::Function>, BinaryNinja::Ref<BinaryNinja::MediumLevelILFunction>, BinaryNinja::MediumLevelILInstruction, unsigned long, bool)::$_0>, void ()>::operator()()
std::__1::__function::__value_func<void ()>::operator()[abi:ne180100]() const
std::__1::function<void ()>::operator()() const
WorkerActionCallback(void*)

Another thread (with the name "Worker T module:core.module.loadDebugInfo module:core.module...") is stuck waiting here:

__psynch_cvwait (@__psynch_cvwait:5)
_pthread_cond_wait (@_pthread_cond_wait:304)
std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&) (@std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&):10)
___lldb_unnamed_symbol22733 (@___lldb_unnamed_symbol22733:19)
___lldb_unnamed_symbol22736 (@___lldb_unnamed_symbol22736:102)
___lldb_unnamed_symbol14143 (@___lldb_unnamed_symbol14143:26)
BNGetFileViewOfType (@BNGetFileViewOfType:39)
___lldb_unnamed_symbol913 (@___lldb_unnamed_symbol913:19)
___lldb_unnamed_symbol920 (@___lldb_unnamed_symbol920:11)
___lldb_unnamed_symbol924 (@___lldb_unnamed_symbol924:13)
___lldb_unnamed_symbol420 (@___lldb_unnamed_symbol420:10)
___lldb_unnamed_symbol723 (@___lldb_unnamed_symbol723:10)
___lldb_unnamed_symbol13411 (@___lldb_unnamed_symbol13411:29)
___lldb_unnamed_symbol13401 (@___lldb_unnamed_symbol13401:34)
___lldb_unnamed_symbol7047 (@___lldb_unnamed_symbol7047:27)
___lldb_unnamed_symbol7014 (@___lldb_unnamed_symbol7014:81)
___lldb_unnamed_symbol7628 (@___lldb_unnamed_symbol7628:15)
___lldb_unnamed_symbol6639 (@___lldb_unnamed_symbol6639:20)
___lldb_unnamed_symbol30229 (@___lldb_unnamed_symbol30229:723)
___lldb_unnamed_symbol30252 (@___lldb_unnamed_symbol30252:148)
___lldb_unnamed_symbol30251 (@___lldb_unnamed_symbol30251:10)
___lldb_unnamed_symbol30472 (@___lldb_unnamed_symbol30472:18)
___lldb_unnamed_symbol27455 (@___lldb_unnamed_symbol27455:19)
___lldb_unnamed_symbol29856 (@___lldb_unnamed_symbol29856:734)
___lldb_unnamed_symbol29851 (@___lldb_unnamed_symbol29851:6)
_pthread_start (@_pthread_start:37)

2 threads called "Thread (pooled)" are stuck here:

__psynch_cvwait (@__psynch_cvwait:5)
_pthread_cond_wait (@_pthread_cond_wait:304)
std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&) (@std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&):10)
___lldb_unnamed_symbol22733 (@___lldb_unnamed_symbol22733:19)
___lldb_unnamed_symbol22736 (@___lldb_unnamed_symbol22736:102)
BNExecuteOnMainThreadAndWait (@BNExecuteOnMainThreadAndWait:21)
BinaryNinja::ExecuteOnMainThreadAndWait(std::__1::function<void ()> const&)
BackgroundThread::runThread()
BackgroundThread::start(QVariant)::'lambda'()::operator()() const
decltype(std::declval<BackgroundThread::start(QVariant)::'lambda'()&>()()) std::__1::__invoke[abi:ne180100]<BackgroundThread::start(QVariant)::'lambda'()&>(BackgroundThread::start(QVariant)::'lambda'()&)
std::__1::invoke_result<BackgroundThread::start(QVariant)::'lambda'()&>::type std::__1::invoke[abi:ne180100]<BackgroundThread::start(QVariant)::'lambda'()&>(BackgroundThread::start(QVariant)::'lambda'()&)
QtConcurrent::StoredFunctionCall<BackgroundThread::start(QVariant)::'lambda'()>::runFunctor()::'lambda'(BackgroundThread::start(QVariant)::'lambda'())::operator()(BackgroundThread::start(QVariant)::'lambda'()) const
decltype(std::declval<QtConcurrent::StoredFunctionCall<BackgroundThread::start(QVariant)::'lambda'()>::runFunctor()::'lambda'(BackgroundThread::start(QVariant)::'lambda'()) const&>()(std::declval<BackgroundThread::start(QVariant)::'lambda'()>())) std::__1::__invoke[abi:ne180100]<QtConcurrent::StoredFunctionCall<BackgroundThread::start(QVariant)::'lambda'()>::runFunctor()::'lambda'(BackgroundThread::start(QVariant)::'lambda'()) const&, BackgroundThread::start(QVariant)::'lambda'()>(QtConcurrent::StoredFunctionCall<BackgroundThread::start(QVariant)::'lambda'()>::runFunctor()::'lambda'(BackgroundThread::start(QVariant)::'lambda'()) const&, BackgroundThread::start(QVariant)::'lambda'()&&)
decltype(auto) std::__1::__apply_tuple_impl[abi:ne180100]<QtConcurrent::StoredFunctionCall<BackgroundThread::start(QVariant)::'lambda'()>::runFunctor()::'lambda'(BackgroundThread::start(QVariant)::'lambda'()) const&, std::__1::tuple<BackgroundThread::start(QVariant)::'lambda'()>, 0ul>(QtConcurrent::StoredFunctionCall<BackgroundThread::start(QVariant)::'lambda'()>::runFunctor()::'lambda'(BackgroundThread::start(QVariant)::'lambda'()) const&, std::__1::tuple<BackgroundThread::start(QVariant)::'lambda'()>&&, std::__1::__tuple_indices<0ul>)
decltype(auto) std::__1::apply[abi:ne180100]<QtConcurrent::StoredFunctionCall<BackgroundThread::start(QVariant)::'lambda'()>::runFunctor()::'lambda'(BackgroundThread::start(QVariant)::'lambda'()) const&, std::__1::tuple<BackgroundThread::start(QVariant)::'lambda'()>>(QtConcurrent::StoredFunctionCall<BackgroundThread::start(QVariant)::'lambda'()>::runFunctor()::'lambda'(BackgroundThread::start(QVariant)::'lambda'()) const&, std::__1::tuple<BackgroundThread::start(QVariant)::'lambda'()>&&)
QtConcurrent::StoredFunctionCall<BackgroundThread::start(QVariant)::'lambda'()>::runFunctor()
QtConcurrent::RunFunctionTaskBase<void>::run()
QThreadPoolThread::run()
QThreadPrivate::start(void*)
_pthread_start

The deadlocking is permanent, Binary Ninja becomes not responding and its CPU usage drops to basically 0.

There is a chance this is user error but the deadlocking is occurring inside Binary Ninja core so I'm not sure how I could be the cause of it.

Steps To Reproduce:
I'm not sure how to reproduce with a vanilla copy of the specified version of Binary Ninja. I'm using a customised version of the DSC plugin that does allow more parallelization so it could be user error, but ultimately the hanging is due to something going on in Binary Ninja core.

The way I cause this deadlock is by the following steps:

  1. Open a copy of DYLD Shared Cache
  2. Whilst analysis is ongoing for that instance of DYLD Shared Cache, click File -> New Window.
  3. Open a different copy of DYLD Shared Cache.
  4. Observe deadlock.

The important parts here seem to be that analysis is going on for both at the same time and that both files that are opened are for DYLD Shared Cache.

I know thats not particularly helpful, I'm hoping from the stack traces it might be clear where multiple threads are either contending for the same lock or where a condition variable is not being notified.

Expected Behavior:
To be honest I don't really know. BeginUndoActions is a bit of an odd one because my understanding is it doesn't really work in multi-threaded cases due to the way undo actions have been architected. Obviously it shouldn't deadlock but does the use of BeginUndoActions in a multi-threaded situation make any sense? Given Binary Ninja is designed to be multi-threaded as much as possible does BeginUndoActions make sense in most situations because it won't necessarily batch the expected undo actions? The DSC plugin uses it in a number of places to either forget or batch undo actions, but given its multi-threaded nature would this even work as expected?

Metadata

Metadata

Assignees

No one assigned

    Labels

    File Format: SharedCacheIssue with the dyld_shared_cache pluginImpact: CriticalIssue blocks CRITICAL functionalityState: Awaiting TriageIssue is waiting for more in-depth triage from a developer

    Type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions