bug: using the "dup remove" command to stop a duplication may sometimes cause the cluster to coredump #2211

@ninsmiracle

Bug Report

  1. What did you do?
    Used the "dup remove" command to stop a duplication, after which all the nodes of the cluster core dumped.

One node produced the coredump info below; the coredump information from the other nodes is very confusing.

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/home/work/app/pegasus/c3srv-xxx/replica/package/bin/pegasus_server con'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  SLL_Next (t=0x0) at src/linked_list.h:45
45      src/linked_list.h: No such file or directory.
(gdb) bt
#0  SLL_Next (t=0x0) at src/linked_list.h:45
#1  SLL_PopRange (end=<synthetic pointer>, start=<synthetic pointer>, N=32, head=0x2085a00) at src/linked_list.h:88
#2  PopRange (end=<synthetic pointer>, start=<synthetic pointer>, N=32, this=0x2085a00) at src/thread_cache.h:238
#3  tcmalloc::ThreadCache::ReleaseToCentralCache (this=this@entry=0x20859c0, src=src@entry=0x2085a00, cl=cl@entry=2, N=89) at src/thread_cache.cc:201
#4  0x00007fd342586677 in tcmalloc::ThreadCache::Scavenge (this=0x20859c0) at src/thread_cache.cc:224
#5  0x00007fd3410c363e in std::string::reserve(unsigned long) () from /home/work/app/pegasus/c3srv-feedprofile/replica/package/bin/libstdc++.so.6
#6  0x00007fd3410c3abf in std::string::append(std::string const&) () from /home/work/app/pegasus/c3srv-feedprofile/replica/package/bin/libstdc++.so.6
#7  0x000000000066ef2f in std::basic_string<char, std::char_traits<char>, std::allocator<char> > std::operator+<char, std::char_traits<char>, std::allocator<char> >(std::basic_string<char, std::char_traits<char>, std::allocator<char> >&&, std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) ()
#8  0x000000000080f129 in rocksdb::ParseFileName(std::string const&, unsigned long*, rocksdb::Slice const&, rocksdb::FileType*, rocksdb::WalFileType*) ()
#9  0x000000000080ebf9 in rocksdb::ParseFileName(std::string const&, unsigned long*, rocksdb::FileType*, rocksdb::WalFileType*) ()
#10 0x00000000007e25fd in rocksdb::WalManager::GetSortedWalsOfType(std::string const&, std::vector<std::unique_ptr<rocksdb::LogFile, std::default_delete<rocksdb::LogFile> >, std::allocator<std::unique_ptr<rocksdb::LogFile, std::default_delete<rocksdb::LogFile> > > >&, rocksdb::WalFileType) ()
#11 0x00000000007e0c10 in rocksdb::WalManager::GetSortedWalFiles(std::vector<std::unique_ptr<rocksdb::LogFile, std::default_delete<rocksdb::LogFile> >, std::allocator<std::unique_ptr<rocksdb::LogFile, std::default_delete<rocksdb::LogFile> > > >&) ()
#12 0x00000000009a0237 in rocksdb::DBImpl::GetSortedWalFiles(std::vector<std::unique_ptr<rocksdb::LogFile, std::default_delete<rocksdb::LogFile> >, std::allocator<std::unique_ptr<rocksdb::LogFile, std::default_delete<rocksdb::LogFile> > > >&) ()
#13 0x000000000092af60 in rocksdb::CheckpointImpl::CreateCustomCheckpoint(rocksdb::DBOptions const&, std::function<rocksdb::Status (std::string const&, std::string const&, rocksdb::FileType)>, std::function<rocksdb::Status (std::string const&, std::string const&, unsigned long, rocksdb::FileType)>, std::function<rocksdb::Status (std::string const&, std::string const&, rocksdb::FileType)>, unsigned long*, unsigned long) ()
#14 0x000000000092a5f9 in rocksdb::CheckpointImpl::CreateCheckpoint(std::string const&, unsigned long) ()
#15 0x00000000005a8b0d in pegasus::server::pegasus_server_impl::copy_checkpoint_to_dir_unsafe (this=this@entry=0x49a6f0000, checkpoint_dir=<optimized out>,
    checkpoint_decree=checkpoint_decree@entry=0x7fd2fc400a90, flush_memtable=flush_memtable@entry=false) at /home/work/temp/pegasus/src/server/pegasus_server_impl.cpp:2081
#16 0x00000000005a9da8 in pegasus::server::pegasus_server_impl::async_checkpoint (this=0x49a6f0000, flush_memtable=<optimized out>) at /home/work/temp/pegasus/src/server/pegasus_server_impl.cpp:1999
#17 0x00007fd3456a6e2e in dsn::replication::replica::background_async_checkpoint (this=0x43c0b0000, is_emergency=<optimized out>) at /home/work/temp/pegasus/src/rdsn/src/replica/replica_chkpt.cpp:242
#18 0x00007fd3458ae311 in dsn::task::exec_internal (this=this@entry=0xbeb204e10) at /home/work/temp/pegasus/src/rdsn/src/runtime/task/task.cpp:176
#19 0x00007fd3458c39c2 in dsn::task_worker::loop (this=0x2701760) at /home/work/temp/pegasus/src/rdsn/src/runtime/task/task_worker.cpp:224
#20 0x00007fd3458c3b40 in dsn::task_worker::run_internal (this=0x2701760) at /home/work/temp/pegasus/src/rdsn/src/runtime/task/task_worker.cpp:204
#21 0x00007fd34453bf5f in execute_native_thread_routine () from /home/work/app/pegasus/c3srv-feedprofile/replica/package/bin/libdsn_utils.so
#22 0x00007fd342345dc5 in start_thread () from /lib64/libpthread.so.0
#23 0x00007fd34084473d in clone () from /lib64/libc.so.6
(gdb) ^CQuit
  2. What version of Pegasus are you using?
    v2.4

Labels: type/bug (This issue reports a bug.)