AS cannot connect to the database. The middleware startup screen stops at the following point:
[30706]Plugin[com.hundsun.fbase.security] init ...done!
[30706]Plugin[com.hundsun.fbase.hsdb] init ...
Attaching gdb to the process and examining the thread stack shows the following:
(gdb) t 1
[Switching to thread 1 (Thread 4160174688 (LWP 13126))]#0 0xffffe410 in __kernel_vsyscall ()
(gdb) where
#0 0xffffe410 in __kernel_vsyscall ()
#1 0x0070b3a2 in times () from /lib/tls/libc.so.6
#2 0xf762a651 in sltrgatime64 () from /u01/app/oracle/product/10.2.0/db_1/lib32/libclntsh.so.10.1
#3 0xf712408b in kghinp () from /u01/app/oracle/product/10.2.0/db_1/lib32/libclntsh.so.10.1
#4 0xf6d31467 in kpuinit0 () from /u01/app/oracle/product/10.2.0/db_1/lib32/libclntsh.so.10.1
#5 0xf6d308ae in kpuinit () from /u01/app/oracle/product/10.2.0/db_1/lib32/libclntsh.so.10.1
#6 0xf6dfc57e in OCIEnvInit () from /u01/app/oracle/product/10.2.0/db_1/lib32/libclntsh.so.10.1
#7 0xf6cbe476 in sqlcxa () from /u01/app/oracle/product/10.2.0/db_1/lib32/libclntsh.so.10.1
#8 0xf6cbf268 in sqlcfx () from /u01/app/oracle/product/10.2.0/db_1/lib32/libclntsh.so.10.1
#9 0xf6ca025d in sqlcmex () from /u01/app/oracle/product/10.2.0/db_1/lib32/libclntsh.so.10.1
#10 0xf6ca0b06 in sqlcxt () from /u01/app/oracle/product/10.2.0/db_1/lib32/libclntsh.so.10.1
#11 0xf6899709 in HSQLInitHandle () from /home/hundsun/linux.i386/lib/libfhsdb_oracle10.so
#12 0xf69162f5 in CConnectionImpl::connectionDB () from /home/hundsun/linux.i386/lib/libfsc_f2hsdb.so
#13 0xf69148c4 in CDataSourceImpl::m_CreateConnection () from /home/hundsun/linux.i386/lib/libfsc_f2hsdb.so
#14 0xf69146a6 in CDataSourceImpl::CreatConnectionByCount () from /home/hundsun/linux.i386/lib/libfsc_f2hsdb.so
#15 0xf6913809 in CDataSourceImpl::CDataSourceImpl () from /home/hundsun/linux.i386/lib/libfsc_f2hsdb.so
#16 0xf690e828 in CHSDBImpl::mf_CreateDataSource () from /home/hundsun/linux.i386/lib/libfsc_f2hsdb.so
#17 0xf690e601 in CHSDBImpl::mf_CreateAllDataSource () from /home/hundsun/linux.i386/lib/libfsc_f2hsdb.so
#18 0xf690e289 in CHSDBImpl::OnInit () from /home/hundsun/linux.i386/lib/libfsc_f2hsdb.so
#19 0xf6910628 in HSDBSvrInit () from /home/hundsun/linux.i386/lib/libfsc_f2hsdb.so
#20 0xf7f9d4ea in CF2CoreImpl::mf_InitPlugin () from /home/hundsun/linux.i386/lib/libfsc_f2core.so
#21 0xf7f9c08c in CF2CoreImpl::Load () from /home/hundsun/linux.i386/lib/libfsc_f2core.so
#22 0x0804ae34 in CShell::init ()
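#23 0x0804dbd8 in main ()
Cause: searching online for the function sltrgatime64 () showed that this is a bug in the Oracle client (version 10.2.0.1.0).
Bug description: on a Linux x86 host the bug can be triggered whenever the uptime in days reaches a multiple of 24.8. Because the time function (times () in frame #1 of the stack) returns an unusable value, the client spins in an infinite loop and exhausts the CPU.
References (see these for details):
Doc ID: 338461.1 SQL*Plus 10.2.0.1 Hangs, When System Uptime Is Long Period of Time
Doc ID: 4612267.8 Bug 4612267 - OCI client spins when machine uptime >= 249 days
However, sqlplus can still connect to the database, even though by the bug description it should fail as well. Checking the dependencies of sqlplus with ldd shows that it does not use the same underlying library:
[oracle@txas1 ~]$ ldd /u01/app/oracle/product/10.2.0/db_1/bin/sqlplus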
libsqlplus.so => /u01/app/oracle/product/10.2.0/db_1/lib/libsqlplus.so (0x0000002a95557000)
libclntsh.so.10.1 => /u01/app/oracle/product/10.2.0/db_1/lib/libclntsh.so.10.1 (0x0000002a95748000)
libnnz10.so => /u01/app/oracle/product/10.2.0/db_1/lib/libnnz10.so (0x0000002a96a96000)
libdl.so.2 => /lib64/libdl.so.2 (0x0000003fa9b00000)
libm.so.6 => /lib64/tls/libm.so.6 (0x0000003fa9900000)
libpthread.so.0 => /lib64/tls/libpthread.so.0 (0x0000003fa9d00000)
libnsl.so.1 => /lib64/libnsl.so.1 (0x0000003faf300000)
libc.so.6 => /lib64/tls/libc.so.6 (0x0000003fa9600000)
/lib64/ld-linux-x86-64.so.2 (0x0000003fa9200000)
The stack shown earlier indicates that the library our program loads is: /u01/app/oracle/product/10.2.0/db_1/lib32/libclntsh.so.10.1
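The difference between the two copies of libclntsh.so.10.1 (lib versus lib32) can also be checked directly on the files. The sketch below is only an illustration, not part of the original diagnosis: it reads the EI_CLASS byte of the ELF header to report whether a file is 32-bit or 64-bit. The helper names elf_class and elf_class_of_file are hypothetical; the two paths are the ones that appear in the ldd output and in the gdb stack.

```python
ELF_MAGIC = b"\x7fELF"


def elf_class(header: bytes) -> str:
    """Classify an ELF file from its first five bytes."""
    if len(header) < 5 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    ei_class = header[4]  # EI_CLASS: 1 = ELFCLASS32, 2 = ELFCLASS64
    if ei_class == 1:
        return "32-bit"
    if ei_class == 2:
        return "64-bit"
    raise ValueError(f"unknown EI_CLASS value {ei_class}")


def elf_class_of_file(path: str) -> str:
    """Read just enough of the file to classify it."""
    with open(path, "rb") as f:
        return elf_class(f.read(5))


if __name__ == "__main__":
    # Paths taken from the ldd output and the gdb stack in this post.
    for path in (
        "/u01/app/oracle/product/10.2.0/db_1/lib/libclntsh.so.10.1",
        "/u01/app/oracle/product/10.2.0/db_1/lib32/libclntsh.so.10.1",
    ):
        try:
            print(path, elf_class_of_file(path))
        except (OSError, ValueError) as exc:
            print(path, "skipped:", exc)
```

On a host with the same layout, the lib copy would be expected to report 64-bit and the lib32 copy 32-bit; on any other machine the script simply reports the files as skipped.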
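sqlplus is a 64-bit program while ours is 32-bit, so only our program runs into the problem.
Solution:
Reboot the host regularly so that the uptime never reaches the threshold (this treats the symptom, not the cause).
Apply the Oracle patch: install the one-off patch 4612267 for this bug.
Upgrade the Oracle client to 10.2.0.4 (any version above 10.2.0.1 will do).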
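The uptime figures quoted in the bug description (24.8 days, 249 days) can be related to a tick counter overflowing. The sketch below is arithmetic under stated assumptions, not Oracle's actual logic: it assumes the value returned by times () is held in a signed 32-bit integer (31 usable bits), and it computes how many days such a counter lasts at 100 ticks per second (the usual clock tick rate reported on Linux) and at 1000 ticks per second. It also parses /proc/uptime so the current uptime can be compared with those figures before planning a reboot. The function names wrap_days and uptime_days are hypothetical.

```python
import os

SECONDS_PER_DAY = 86_400


def wrap_days(ticks_per_second: int, counter_bits: int = 31) -> float:
    """Days until a tick counter with `counter_bits` usable bits overflows."""
    return (2 ** counter_bits) / ticks_per_second / SECONDS_PER_DAY


def uptime_days(proc_uptime_text: str) -> float:
    """Convert the contents of /proc/uptime to days.

    The first field of /proc/uptime is the number of seconds since boot.
    """
    return float(proc_uptime_text.split()[0]) / SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"100 ticks/s:  {wrap_days(100):.2f} days")
    print(f"1000 ticks/s: {wrap_days(1000):.2f} days")
    if os.path.exists("/proc/uptime"):
        with open("/proc/uptime") as f:
            print(f"current uptime: {uptime_days(f.read()):.1f} days")
```

Under these assumptions the counter lasts about 248.55 days at 100 ticks per second, which is in line with the "uptime >= 249 days" wording of Bug 4612267, and about 24.86 days at 1000 ticks per second, which is close to the 24.8-day figure in the bug description.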