The "abnormal" threads of zk


RomanBrickie


Categories: multithreading, zookeeper, linux

Article link: https://blog.csdn.net/RomanBrickie/article/details/8549984


Because guard is itself a multithreaded program, I check after every change that the number of threads is what I expect.

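After adding ZooKeeper (zk) registration, the following unexpected threads showed up in the running guard process.

Note: to inspect the threads, start gdb, attach to the process, then run: thread apply all bt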

(gdb) bt
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1 0x000000000066048b in do_completion ()
#2 0x00007f86771969ca in start_thread (arg=<value optimized out>) at pthread_create.c:300
#3 0x00007f8676ef316d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#4 0x0000000000000000 in ?? ()

(gdb) bt
#0 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:211
#1 0x00000000005418fc in base::CondVar::WaitWithTimeout (this=0x1b564a0, mu=0x1b56470, millis=10000) at ./base/mutex.h:205
#2 0x0000000000540f4a in util::YRFSManager::RecoverConnection (this=0x1b56460) at util/yrfs/yrfs_manager.cc:387
#3 0x000000000054334c in base::_MemberResultCallback_0_0<true, void, util::YRFSManager>::Run (this=0x1b4a9a0) at ./base/callback_spec.h:119
#4 0x000000000068f71f in base::ThreadPool::Worker (p=0x1b7e200) at base/thread_pool.cc:38
#5 0x000000000068fb25 in base::WorkerThread::ThreadBody (my_thread=0x1b4aa40) at ./base/thread_pool.h:208
#6 0x00007f86771969ca in start_thread (arg=<value optimized out>) at pthread_create.c:300
#7 0x00007f8676ef316d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#8 0x0000000000000000 in ?? ()

(gdb) bt
#0 0x00007f8676ee69d3 in *__GI___poll (fds=<value optimized out>, nfds=<value optimized out>, timeout=3332) at ../sysdeps/unix/sysv/linux/poll.c:87
#1 0x0000000000660695 in do_io ()
#2 0x00007f86771969ca in start_thread (arg=<value optimized out>) at pthread_create.c:300
#3 0x00007f8676ef316d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#4 0x0000000000000000 in ?? ()

Searching online turned up this thread:

http://osdir.com/ml/java-hadoop-zookeeper-devel/2010-03/msg00233.html

The title of that thread was a big help in solving this problem:

The C Client cannot exit properly in some situation

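My first suspicion was that the zk client had not exited properly.

Looking at the code again, I found the cause. The details are as follows: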
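To make the suspicion concrete: the do_io and do_completion frames in the backtraces above look like the two background threads that, as far as I know, the multithreaded ZooKeeper C client starts for each handle and only stops when the handle is closed. The sketch below is not the ZooKeeper API. FakeClient and all of its members are hypothetical names, used only to mimic the pattern of a handle that owns an I/O thread and a completion thread which keep running until Close() is called.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a multithreaded client handle (NOT ZooKeeper).
class FakeClient {
 public:
  FakeClient()
      : io_thread_([this] { IoLoop(); }),
        completion_thread_([this] { CompletionLoop(); }) {}

  ~FakeClient() { Close(); }

  // Asks both threads to stop and joins them. Safe to call more than once.
  void Close() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      closing_ = true;
    }
    cv_.notify_all();
    if (io_thread_.joinable()) io_thread_.join();
    if (completion_thread_.joinable()) completion_thread_.join();
  }

  // True until the background threads have been joined.
  bool Running() const {
    return io_thread_.joinable() || completion_thread_.joinable();
  }

 private:
  // Mimics an I/O thread that wakes up periodically, like a poll timeout.
  void IoLoop() {
    std::unique_lock<std::mutex> lock(mu_);
    while (!closing_) cv_.wait_for(lock, std::chrono::milliseconds(50));
  }

  // Mimics a completion thread that blocks until shutdown is requested.
  void CompletionLoop() {
    std::unique_lock<std::mutex> lock(mu_);
    cv_.wait(lock, [this] { return closing_; });
  }

  // Declared before the threads so they are initialized first.
  std::mutex mu_;
  std::condition_variable cv_;
  bool closing_ = false;
  std::thread io_thread_;
  std::thread completion_thread_;
};
```

If such a handle is never closed or destroyed, a debugger attached to the process would keep showing one thread parked in a condition-variable wait and one in a timed wait, which is roughly the picture in the backtraces.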

main() {

  scoped_ptr<util::YRNSManager> yrns;  // <== this declaration comes before the if clause

  if (!FLAGS_guard_yrns_path.empty()) {
    yrns.reset(new util::YRNSManager());
    std::string path = StringPrintf("%s/%d",
                                    FLAGS_guard_yrns_path.c_str(),
                                    FLAGS_guard_shard_id);
    if (monitor_server->RegisterYRNS(
            yrns.get(), path,
            FLAGS_guard_replica_id)) {
      LOG(INFO) << "Monitor:" << path
                << " id:" << FLAGS_guard_replica_id
                << " at port " << monitor_server->ServerPort();
    } else {
      LOG(ERROR) << "Failed to register monitor at port:"
                 << monitor_server->ServerPort() << " to YRNS";
      exit(1);
    }
    bool ret = yrns->Register(path,
                              FLAGS_guard_replica_id,
                              util::YRNSManager::SERVICE_RPC,
                              FLAGS_local_thrift_server_port);
    CHECK(ret) << "Failed to register rpc monitor with:" << path;
  } else {
    LOG(ERROR) << "ZK path empty";
  }