20101122
1. change pwd to NgtfPush2 (Global987)
2. sqlapi core when exit
3. memdb test
4. push app doc
g++ -I/home/ngtf/ngtf_src/product/v8t/trade_mdb/output -I/home/ngtf/ngtf_src/kingstar/kssmdb demo_memdb.cpp /home/ngtf/ngtf_src/product/v8t/trade_mdb/output/tradememorydb.cpp -L/home/ngtf/ngtf_src/kingstar/kssmdb/linux64 -lksmdb -lkcrypto -L. -lboost_thread -lpthread -I/home/ngtf/ngtf_src/others/boost/boost_1_29_0
20101123
1956296319 01:21:25 3526 Level3000 -- INFO--(File:../../../ngtf/server/app_agent/src/monitor.cpp Line:390)Removing PID[pid=11488, exe_name=tsbu]
1956296319 01:21:25 3526 Level3000 -- INFO--kill[pid=11488, exe_name=tsbu]
1956296319 01:21:25 3526 Level3000 -- INFO--(File:../../../ngtf/server/app_agent/src/pubfunc.cpp Line:65)Waiting for process to exit[pid=11488].....
1956296319 01:21:25 3526 Level6000 -- FATAL--(File:../../../ngtf/server/app_agent/src/process_mgr.cpp Line:100)waitpid[cmd=query_bu, pid=11451] error: 10-No child processes
1956296319 01:21:25 3526 Level3000 -- INFO--(File:../../../ngtf/server/app_agent/src/process_mgr.cpp Line:147)Detected process closed by Monitor[dir_path=/v8t/trade_a/trade_a/query_bu/bin, exe_name=query_bu, pid=11451]
1956296319 01:21:25 3526 Level6000 -- FATAL--(File:../../../ngtf/server/app_agent/src/process_mgr.cpp Line:100)waitpid[cmd=query_bu, pid=11449] error: 10-No child processes
1956296319 01:21:25 3526 Level3000 -- INFO--(File:../../../ngtf/server/app_agent/src/process_mgr.cpp Line:147)Detected process closed by Monitor[dir_path=/v8t/trade_a/trade_a/query_bu/bin, exe_name=query_bu, pid=11449]
1956300825 01:21:29 3526 Level3000 -- INFO--(File:../../../ngtf/server/app_agent/src/process_mgr.cpp Line:107)waitpid[cmd=tsbu, pid=11580], exited, status=0
1956300825 01:21:29 3526 Level6000 -- FATAL--(File:../../../ngtf/server/app_agent/src/process_mgr.cpp Line:140)Detected process disappeared on its own[dir_path=/v8t/trade_a/trade_a/tsbu/bin, exe_name=tsbu, pid=11580]
1956300825 01:21:29 3526 Level3000 -- INFO--(File:../../../ngtf/server/app_agent/src/process_mgr.cpp Line:162)Removing PID[pid=11580, exe_name=tsbu]
1956302869 01:21:31 3526 Level3000 -- INFO--(File:../../../ngtf/server/app_agent/src/pubfunc.cpp Line:86)Process has exited[pid=11488]
1956302870 01:21:31 3526 Level6000 -- FATAL--(File:../../../ngtf/server/app_agent/src/process_mgr.cpp Line:100)waitpid[cmd=tsbu, pid=11488] error: 10-No child processes
1956302870 01:21:31 3526 Level3000 -- INFO--(File:../../../ngtf/server/app_agent/src/process_mgr.cpp Line:147)Detected process closed by Monitor[dir_path=/v8t/trade_a/trade_a/tsbu/bin, exe_name=tsbu, pid=11488]
1956309556 01:21:38 3526 Level2000 -- DEBUG--(File:../../../ngtf/server/app_agent/src/access_trade_mdb.cpp Line:118)GetAppStatusRecored CommitTransaction()
1956310558 01:21:39 3526 Level3000 -- INFO--(File:../../../ngtf/server/app_agent/src/monitor.cpp Line:348)m_id=0
m_kernel_flag=1
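The `error: 10-No child processes` lines in this dump are errno 10 (ECHILD) from waitpid(). One plausible reading, not confirmed by the log, is that the pid had already been reaped by a different waiter (for example the wait in pubfunc.cpp) before process_mgr.cpp called waitpid() on it. A minimal sketch of treating that case as "already reaped" instead of fatal; `ReapResult`, `try_reap` and `second_wait_sees_echild` are hypothetical names, not taken from app_agent:

```cpp
#include <cassert>
#include <cerrno>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical result type for illustration only.
enum class ReapResult { StillRunning, Reaped, AlreadyReaped, Error };

// Non-blocking check on one child. ECHILD means this process has no
// unwaited-for child with that pid, e.g. someone else already reaped it.
ReapResult try_reap(pid_t pid, int* status) {
    pid_t r;
    do {
        r = waitpid(pid, status, WNOHANG);
    } while (r == -1 && errno == EINTR);  // retry if interrupted by a signal
    if (r == 0) return ReapResult::StillRunning;
    if (r == pid) return ReapResult::Reaped;
    if (errno == ECHILD) return ReapResult::AlreadyReaped;
    return ReapResult::Error;
}

// Demo: the first waiter reaps the child, so the second one sees ECHILD.
bool second_wait_sees_echild() {
    pid_t pid = fork();
    assert(pid >= 0);
    if (pid == 0) _exit(0);  // child exits immediately
    int status = 0;
    pid_t r;
    do {
        r = waitpid(pid, &status, 0);  // first waiter, blocking
    } while (r == -1 && errno == EINTR);
    if (r != pid) return false;
    return try_reap(pid, &status) == ReapResult::AlreadyReaped;
}
```

With a result like this, the caller could log the ECHILD case at INFO level and only drop the pid from its bookkeeping, keeping FATAL for real errors.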
20101124
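Debugging app_agent problems.
1. Reproduced the bug where extra processes get started by pausing for 3 seconds when the Prog_Mgr thread starts
2. Debug the scope of the lock
3. When WIFEXITED(status) is true and WEXITSTATUS(status) is 127, it means the executable path was not found.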
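Note 3 is about exit code 127. A minimal sketch of how that value is read from a waitpid() status: WIFEXITED() only says whether the child exited normally, and WEXITSTATUS() yields the code. The helper names `describe_status` and `run_and_wait` are hypothetical, and the child calling `_exit(127)` after a failed exec is an assumption that mirrors the shell convention, not necessarily what app_agent does:

```cpp
#include <cassert>
#include <cerrno>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Turn a waitpid() status into a short description.
std::string describe_status(int status) {
    if (WIFEXITED(status)) {
        int code = WEXITSTATUS(status);
        if (code == 127) return "exit 127 (command not found, by convention)";
        return "exit " + std::to_string(code);
    }
    if (WIFSIGNALED(status)) {
        return "killed by signal " + std::to_string(WTERMSIG(status));
    }
    return "other";
}

// Fork, try to exec `path`, and return the raw wait status.
// The child reports an exec failure with _exit(127), as a shell would.
int run_and_wait(const char* path) {
    pid_t pid = fork();
    assert(pid >= 0);
    if (pid == 0) {
        execl(path, path, static_cast<char*>(nullptr));
        _exit(127);  // only reached when execl() fails
    }
    int status = 0;
    pid_t r;
    do {
        r = waitpid(pid, &status, 0);
    } while (r == -1 && errno == EINTR);  // retry if interrupted by a signal
    assert(r == pid);
    (void)r;  // silence unused warning when NDEBUG is defined
    return status;
}
```

Calling `describe_status(run_and_wait("/nonexistent/path/to/binary"))` exercises the 127 branch.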
20101125
1. Test the App_agent change that adds m_current_count
2160588699 10:06:17 19914 Level2000 -- DEBUG--(File:../../../ngtf/server/app_agent/src/appserver.cpp Line:47)Waiting for all processes stopped
2160589077 10:06:17 19914 Level3000 -- INFO--(File:../../../ngtf/server/app_agent/src/monitor.cpp Line:351)m_id=0
m_kernel_flag=1
m_dir_path=/home/ngtf/trade_a/kssmdb
m_exe_name=ksmdbmanage
m_start_script=ksmdbmanage
m_host_mode_active_count=1
m_back_mode_active_count=1
m_need_count=1
m_pid_set.size()=2
m_current_count=1
2160589279 10:06:18 19914 Level3000 -- INFO--Memory DB not started; this monitoring pass will not adjust the process counts of other modules
2160589700 10:06:18 19914 Level2000 -- DEBUG--(File:../../../ngtf/server/app_agent/src/appserver.cpp Line:47)Waiting for all processes stopped
2160590281 10:06:19 19914 Level3000 -- INFO--(File:../../../ngtf/server/app_agent/src/monitor.cpp Line:351)m_id=0
m_kernel_flag=1
m_dir_path=/home/ngtf/trade_a/kssmdb
m_exe_name=ksmdbmanage
m_start_script=ksmdbmanage
m_host_mode_active_count=1
m_back_mode_active_count=1
m_need_count=1
m_pid_set.size()=2
m_current_count=1
2160590482 10:06:19 19914 Level3000 -- INFO--Memory DB not started; this monitoring pass will not adjust the process counts of other modules
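The dump above shows m_need_count next to the new m_current_count, and the monitor skipping other modules while the memory DB is down. A rough sketch of what such a per-pass decision could look like. This is an assumption built from the log fields only; `ModuleState`, `is_memdb_module` and `planned_adjustment` are hypothetical and do not come from monitor.cpp:

```cpp
#include <cassert>

// Hypothetical snapshot of the fields printed in the log.
struct ModuleState {
    bool is_memdb_module;  // assumed flag: true for the memory DB module itself
    int need_count;        // processes that should be running
    int current_count;     // processes counted as running right now
};

// Positive: start that many processes; negative: stop that many; 0: leave alone.
// While the memory DB is down, only the memory DB module itself is adjusted.
int planned_adjustment(const ModuleState& m, bool memdb_started) {
    if (!memdb_started && !m.is_memdb_module) return 0;
    return m.need_count - m.current_count;
}
```

For the tsbu dump later in this entry (m_need_count=0, m_current_count=1), this would give -1, i.e. stop one process, provided the memory DB counts as started.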
Comment documentation in ngtf_push_bu.h
push_thread exit handling
After ksmdbmanage exits abnormally, the trade_server process disappears:
INFO_MSG("%s", "commit transaction");
DB_tradememorydb::CommitTransaction();
INFO_MSG("%s", "begin transaction");
DB_tradememorydb::BeginTransaction();
2176200248 14:26:29 25248 Level6000 -- FATAL--(File:../../../ngtf/server/app_agent/src/process_mgr.cpp Line:176)Core process [trade_server] disappeared
2176200248 14:26:29 25248 Level3000 -- INFO--(File:../../../ngtf/server/app_agent/src/execute_cmd.cpp Line:258)Starting to execute the core-process failure command
2176200248 14:26:29 25248 Level3000 -- INFO--(File:../../../ngtf/server/app_agent/src/execute_cmd.cpp Line:264)DoKernelError set monitor mutex
2176200248 14:26:29 25248 Level3000 -- INFO--(File:../../../ngtf/server/app_agent/src/execute_cmd.cpp Line:268)Have set the expect status as STOP
2176200248 14:26:29 25248 Level3000 -- INFO--(File:../../../ngtf/server/app_agent/src/execute_cmd.cpp Line:272)Removing PID of the disappeared process[pid=25258]
2176200248 14:26:29 25248 Level3000 -- INFO--(File:../../../ngtf/server/app_agent/src/monitor.cpp Line:351)m_id=4
m_kernel_flag=0
m_dir_path=/home/ngtf/trade_a/tsbu/bin
m_exe_name=tsbu
m_start_script=tsbu
m_host_mode_active_count=1
m_back_mode_active_count=0
m_need_count=0
m_pid_set.size()=1
m_current_count=1
2176200248 14:26:29 25248 Level3000 -- INFO--(File:../../../ngtf/server/app_agent/src/monitor.cpp Line:391)Removing PID[pid=25277, exe_name=tsbu]
2176200248 14:26:29 25248 Level3000 -- INFO--kill[pid=25277, exe_name=tsbu]
2176200248 14:26:29 25248 Level3000 -- INFO--(File:../../../ngtf/server/app_agent/src/pubfunc.cpp Line:65)Waiting for process to exit[pid=25277].....
2176200248 14:26:29 25248 Level3000 -- INFO--(File:../../../ngtf/server/app_agent/src/execute_cmd.cpp Line:279)Waiting for the monitor thread to set the actual status to all-stopped
2176201131 14:26:29 25248 Level6000 -- FATAL--(File:../../../ngtf/server/app_agent/src/process_mgr.cpp Line:101)waitpid[cmd=ksmdbmanage, pid=25255] error: 10-No child processes
2176201131 14:26:29 25248 Level6000 -- FATAL--(File:../../../ngtf/server/app_agent/src/process_mgr.cpp Line:141)Detected process disappeared on its own[dir_path=/home/ngtf/trade_a/kssmdb, exe_name=ksmdbmanage, pid=25255]
2176201131 14:26:29 25248 Level3000 -- INFO--(File:../../../ngtf/server/app_agent/src/process_mgr.cpp Line:156)Try to close memory db, open_flag is 1
2176201131 14:26:29 25248 Level3000 -- INFO--(File:../../../ngtf/server/app_agent/src/access_trade_mdb.cpp Line:174)Finish backup App Status table
2176201132 14:26:29 25248 Level2000 -- DEBUG--Closing memory DB
2176201132 14:26:29 25248 Level6000 -- FATAL--(File:../../../ngtf/server/app_agent/src/process_mgr.cpp Line:176)Core process [ksmdbmanage] disappeared
2176205555 14:26:34 25248 Level6000 -- FATAL--(File:../../../ngtf/server/app_agent/src/pubfunc.cpp Line:82)waitpid[pid=25277] error: 10-No child processes
20101126
Shared-memory problem with multiple ksmbcc instances
Commit the app_agent changes