仍然是很久以前的记录
ORA-03113&ORA-07445;
今天weblogic jdbc突然中断几次,导致系统无法登陆,登陆时错误提示为:
Error 500--Internal Server Error
org.springframework.transaction.CannotCreateTransactionException: Could not open Hibernate Session for transaction; nested exception is org.hibernate.exception.GenericJDBCException: Cannot open connection
at org.springframework.orm.hibernate3.HibernateTransactionManager.doBegin(HibernateTransactionManager.java:545)
at org.springframework.transaction.support.AbstractPlatformTransactionManager.getTransaction(AbstractPlatformTransactionManager.java:315)
at org.springframework.transaction.interceptor.TransactionAspectSupport.createTransactionIfNecessary(TransactionAspectSupport.java:234)
at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:93)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:170)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:176)
at $Proxy11.findByPrimaryKey(Unknown Source)
at com.chinainsurance.application.common.CustomAuthenticationProcessingFilter.attemptAuthentication(CustomAuthenticationProcessingFilter.java:85)
at org.acegisecurity.ui.AbstractProcessingFilter.doFilter(AbstractProcessingFilter.java:198)
at org.acegisecurity.util.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:274)
at org.acegisecurity.ui.logout.LogoutFilter.doFilter(LogoutFilter.java:108)
at org.acegisecurity.util.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:274)
at org.acegisecurity.context.HttpSessionContextIntegrationFilter.doFilter(HttpSessionContextIntegrationFilter.java:195)
at org.acegisecurity.util.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:274)
at org.acegisecurity.util.FilterChainProxy.doFilter(FilterChainProxy.java:148)
at org.acegisecurity.util.FilterToBeanProxy.doFilter(FilterToBeanProxy.java:90)
at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:27)
at weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.run(WebAppServletContext.java:7053)
at weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:321)
at weblogic.security.service.SecurityManager.runAs(SecurityManager.java:121)
at weblogic.servlet.internal.WebAppServletContext.invokeServlet(WebAppServletContext.java:3902)
at weblogic.servlet.internal.ServletRequestImpl.execute(ServletRequestImpl.java:2773)
at weblogic.kernel.ExecuteThread.execute(ExecuteThread.java:224)
at weblogic.kernel.ExecuteThread.run(ExecuteThread.java:183)
Caused by: org.hibernate.exception.GenericJDBCException: Cannot open connection
at org.hibernate.exception.SQLStateConverter.handledNonSpecificException(SQLStateConverter.java:103)
at org.hibernate.exception.SQLStateConverter.convert(SQLStateConverter.java:91)
at org.hibernate.exception.JDBCExceptionHelper.convert(JDBCExceptionHelper.java:43)
at org.hibernate.exception.JDBCExceptionHelper.convert(JDBCExceptionHelper.java:29)
at org.hibernate.jdbc.ConnectionManager.openConnection(ConnectionManager.java:420)
at org.hibernate.jdbc.ConnectionManager.getConnection(ConnectionManager.java:144)
at org.hibernate.jdbc.JDBCContext.connection(JDBCContext.java:129)
at org.hibernate.transaction.JDBCTransaction.begin(JDBCTransaction.java:57)
at org.hibernate.impl.SessionImpl.beginTransaction(SessionImpl.java:1290)
at org.springframework.orm.hibernate3.HibernateTransactionManager.doBegin(HibernateTransactionManager.java:517)
... 23 more
Caused by: java.sql.SQLException: Cannot obtain connection: driverURL = jdbc:weblogic:pool:ConnectionPool9999, props = {enableTwoPhaseCommit=false, connectionPoolID=ConnectionPool9999, jdbcTxDataSource=true, dataSourceName=chinainsuranceDataSource}.
Nested Exception: java.lang.RuntimeException: Failed to setAutoCommit to true for pool connection.
at weblogic.jdbc.wrapper.PoolConnection.init(PoolConnection.java:56)
at weblogic.jdbc.pool.Driver.allocateConnection(Driver.java:254)
at weblogic.jdbc.pool.Driver.connect(Driver.java:164)
at weblogic.jdbc.jts.Driver.getNonTxConnection(Driver.java:546)
at weblogic.jdbc.jts.Driver.connect(Driver.java:140)
at weblogic.jdbc.common.internal.RmiDataSource.getConnection(RmiDataSource.java:344)
at org.springframework.orm.hibernate3.LocalDataSourceConnectionProvider.getConnection(LocalDataSourceConnectionProvider.java:81)
at org.hibernate.jdbc.ConnectionManager.openConnection(ConnectionManager.java:417)
at org.hibernate.jdbc.ConnectionManager.getConnection(ConnectionManager.java:144)
at org.hibernate.jdbc.JDBCContext.connection(JDBCContext.java:129)
at org.hibernate.transaction.JDBCTransaction.begin(JDBCTransaction.java:57)
at org.hibernate.impl.SessionImpl.beginTransaction(SessionImpl.java:1290)
at org.springframework.orm.hibernate3.HibernateTransactionManager.doBegin(HibernateTransactionManager.java:517)
at org.springframework.transaction.support.AbstractPlatformTransactionManager.getTransaction(AbstractPlatformTransactionManager.java:315)
at org.springframework.transaction.interceptor.TransactionAspectSupport.createTransactionIfNecessary(TransactionAspectSupport.java:234)
at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:93)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:170)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:176)
at $Proxy11.findByPrimaryKey(Unknown Source)
at com.chinainsurance.application.common.CustomAuthenticationProcessingFilter.attemptAuthentication(CustomAuthenticationProcessingFilter.java:85)
at org.acegisecurity.ui.AbstractProcessingFilter.doFilter(AbstractProcessingFilter.java:198)
at org.acegisecurity.util.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:274)
at org.acegisecurity.ui.logout.LogoutFilter.doFilter(LogoutFilter.java:108)
at org.acegisecurity.util.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:274)
at org.acegisecurity.context.HttpSessionContextIntegrationFilter.doFilter(HttpSessionContextIntegrationFilter.java:195)
at org.acegisecurity.util.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:274)
at org.acegisecurity.util.FilterChainProxy.doFilter(FilterChainProxy.java:148)
at org.acegisecurity.util.FilterToBeanProxy.doFilter(FilterToBeanProxy.java:90)
at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:27)
at weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.run(WebAppServletContext.java:7053)
at weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:321)
at weblogic.security.service.SecurityManager.runAs(SecurityManager.java:121)
at weblogic.servlet.internal.WebAppServletContext.invokeServlet(WebAppServletContext.java:3902)
at weblogic.servlet.internal.ServletRequestImpl.execute(ServletRequestImpl.java:2773)
at weblogic.kernel.ExecuteThread.execute(ExecuteThread.java:224)
at weblogic.kernel.ExecuteThread.run(ExecuteThread.java:183)
at weblogic.jdbc.jts.Driver.wrapAndThrowSQLException(Driver.java:497)
at weblogic.jdbc.jts.Driver.getNonTxConnection(Driver.java:550)
at weblogic.jdbc.jts.Driver.connect(Driver.java:140)
at weblogic.jdbc.common.internal.RmiDataSource.getConnection(RmiDataSource.java:344)
at org.springframework.orm.hibernate3.LocalDataSourceConnectionProvider.getConnection(LocalDataSourceConnectionProvider.java:81)
at org.hibernate.jdbc.ConnectionManager.openConnection(ConnectionManager.java:417)
... 28 more
重置JDBC后,可以正常使用,但是过20多分钟问题依旧。
我同时连接的会话却一直正常,没有出现过网络中断。
心里这时想的是肯定是应用开发有问题,本来这个测试环境就是上的新版本应用嘛。
出于职业习惯,检查了一下alert文件(原来是没有做日志监控的,现在包括测试库,自动10分钟日志监控,应该可以避免问题没有及时发现),发现有如下提示:
Fri Dec 7 12:03:23 2007
Errors in file /opt/oracle/admin/matest/udump/matest_ora_7208.trc:
ORA-07445: exception encountered: core dump [kkdcacr()+3280] [SIGSEGV] [Address not mapped to object] [0x8] [] []
Fri Dec 7 12:21:34 2007
Errors in file /opt/oracle/admin/matest/udump/matest_ora_7388.trc:
ORA-07445: exception encountered: core dump [kkdcacr()+3280] [SIGSEGV] [Address not mapped to object] [0x8] [] []
Fri Dec 7 13:17:40 2007
Errors in file /opt/oracle/admin/matest/udump/matest_ora_8022.trc:
ORA-07445: exception encountered: core dump [kkdcacr()+3280] [SIGSEGV] [Address not mapped to object] [0x8] [] []
Fri Dec 7 14:02:10 2007
Errors in file /opt/oracle/admin/matest/udump/matest_ora_10751.trc:
ORA-07445: exception encountered: core dump [kkdcacr()+3280] [SIGSEGV] [Address not mapped to object] [0x8] [] []
对照时间,和发生问题的时间一样,
靠,居然还是我数据库的问题。
打开其中一个跟踪文件:
内容截取如下:
/opt/oracle/admin/matest/udump/matest_ora_4292.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options
ORACLE_HOME = /ORACLE/product/10.2/db_1
System name: Linux
Node name: linux-testdb
Release: 2.6.16.21-0.8-bigsmp
Version: #1 SMP Mon Jul 3 18:25:39 UTC 2006
Machine: i686
Instance name: matest
Redo thread mounted by this instance: 1
Oracle process number: 32
Unix process pid: 4292, image:
*** 2007-12-07 10:26:36.956
*** SERVICE NAME:(SYS$USERS) 2007-12-07 10:26:36.955
*** SESSION ID:(126.126) 2007-12-07 10:26:36.955
Exception signal: 11 (SIGSEGV), code: 1 (Address not mapped to object), addr: 0x8, PC: [0x901111c, kkdcacr()+3280]
Registers:
%eax: 0x8ca14614 %ebx: 0xb70e601c %ecx: 0x00000000
%edx: 0x00000000 %edi: 0xb70e4608 %esi: 0x8ca1448c
%esp: 0xbff38648 %ebp: 0xbff38754 %eip: 0x0901111c
%efl: 0x00010246
kkdcacr()+3270 (0x9011112) mov %eax,%edx
kkdcacr()+3272 (0x9011114) add $12,%esp
kkdcacr()+3275 (0x9011117) jmp 0x901111c
kkdcacr()+3277 (0x9011119) mov 0x44(%esi),%edx
> kkdcacr()+3280 (0x901111c) mov 0x8(%edx),%ecx
kkdcacr()+3283 (0x901111f) cmp 0x8(%edi),%ecx
kkdcacr()+3286 (0x9011122) jne 0x9011161
kkdcacr()+3288 (0x9011124) mov 0xc(%edx),%ecx
kkdcacr()+3291 (0x9011127) cmp 0xc(%edi),%ecx
*** 2007-12-07 10:26:36.959
ksedmp: internal or fatal error
ORA-07445: exception encountered: core dump [kkdcacr()+3280] [SIGSEGV] [Address not mapped to object] [0x8] [] []
Current SQL statement for this session:
select count(*) from GuPolicyCopyMain em,GuPolicyCopyRisk er,GuPolicyCopyEndorHead eh where em.policyNo=er.policyNo and em.endorSeqNo=er.endorSeqNo and er.policyNo=eh.policyNo and er.endorSeqNo=eh.endorSeqNo and em.endorSeqNo!='000' AND (em.policyNo = 'PFNU03004403') AND (er.uwYear = '2007') AND (eh.validInd = '1') AND (eh.inputDate>=to_date('2007-12-06', 'YYYY-MM-DD HH24:MI:SS')) AND (eh.underWriteInd = '6')
----- Call Stack Trace -----
calling call entry argument values in hex
location type point (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedst()+27 call ksedst1() 1 ? 1 ?
ksedmp()+557 call ksedst() 1 ? B72D4496 ? 0 ? 0 ?
B73E34AC ? B72B0003 ?
ssexhd()+863 call ksedmp() 3 ? 901111C ? 63646B6B ?
28726361 ? 32332B29 ? 3038 ?
kkdcacr()+3280 signal 00000000 B ? B72C0C8C ? B72C0D0C ?
kkqvIsPrimaryForeig call kkdcacr() B70EA6E0 ? 8CC50C9C ?
nKey()+79 915C9BB8 ? 31E ?
kkqvipr()+579 call kkqvIsPrimaryForeig B70EA374 ? B70EA2D8 ?
nKey()
kkqvacu()+178 call kkqvipr() B70EA224 ? B70EA2D8 ?
kkqvcrf()+233 call kkqvacu() B70EA580 ? B70EA2D8 ?
BFF388F8 ?
kkqvcerq()+667 call kkqvcrf() B70EA580 ? BFF388F8 ?
vopqbc()+2379 call kkqvcerq() B70EAAE4 ? 0 ? 0 ? 0 ? 0 ?
0 ?
vopdrv()+458 call vopqbc() B70EAAE4 ? 0 ? 0 ? 1 ? 1 ?
0 ? 0 ? BFF38AA4 ? 0 ? 1 ?
0 ?
kkqdrv()+1145 call vopdrv() B70EAAE4 ? 3 ? 1 ?
opiSem()+6114 call kkqdrv() B70EAAE4 ? 1 ?
opiDeferredSem()+61 call opiSem() BFF39400 ? 9382B96C ? 1A2 ?
1 1 ?
opitca()+12475 call opiDeferredSem() 0 ? 0 ? 26 ? 0 ? BFF3E9BC ?
C24A63D ?
kksFullTypeCheck()+ call opitca() B71179A0 ? 8CC50C9C ?
20
rpiswu2()+1061 call 00000000 BFF3F30C ? BFF3F130 ? 1 ?
13 ? 915C9D44 ? 915C9D08 ?
kksLoadChild()+1313 call rpiswu2() A237DAF0 ? 3D ? 915C9D44 ?
8 13 ? 915C9F0C ? 3D ?
915C9D70 ? 0 ? 9573680 ? 0 ?
BFF3F30C ? 0 ?
kkslod()+213 call kksLoadChild() CBC2A40 ? 9A5F75E0 ?
BFF4029C ?
kglobld()+1556 call 00000000 CBC2A40 ? 9A5F75E0 ?
BFF4029C ?
kglobpn()+2419 call kglobld() CBC2A40 ? 915C9BB8 ?
BFF4029C ? 9FE774F0 ? 0 ?
kglpim()+263 call kglobpn() CBC2A40 ? 9FE774F0 ?
BFF4029C ? 1 ?
kglpin()+1110 call kglpim() CBC2A40 ? BFF4029C ?
9FE774F0 ?
kxsGetRuntimeLock() call kglpin() CBC2A40 ? BFF4029C ?
+410 9FE35724 ? 9A5F75E0 ? 3 ? 3 ?
B71179C8 ?
kksfbc()+9915 call kxsGetRuntimeLock() CBC2A40 ? B71179A0 ?
BFF4029C ? 3 ? 1 ?
kkspsc0()+2191 call kksfbc() B71179A0 ? 3 ? 108 ?
BFF418FC ? 1A3 ? 0 ? 0 ? 0 ?
kksParseCursor()+45 call kkspsc0() B7100CE4 ? BFF418FC ? 1A3 ?
3 3 ? 6 ? A4 ?
opiosq0()+2054 call kksParseCursor() BFF40AD8 ? CB740C8 ?
CBC2A40 ? 0 ? B7120274 ? 0 ?
kpooprx()+250 call opiosq0() 3 ? E ? BFF40D10 ? A4 ?
kpoal8()+867 call kpooprx() BFF42EE8 ? BFF418FC ? 1A2 ?
1 ? 0 ? A4 ?
opiodr()+2347 call 00000000 5E ? 17 ? BFF42EE4 ?
ttcpip()+4227 call 00000000 5E ? 17 ? BFF42EE4 ? 0 ?
CC37296 ? E ?
opitsk()+1991 call ttcpip() CBCA240 ? 5E ? BFF42EE4 ? 0 ?
BFF423C4 ? BFF43008 ?
opiino()+1387 call opitsk() 0 ? 0 ?
opiodr()+2347 call 00000000 3C ? 4 ? BFF43AD0 ?
opidrv()+915 call opiodr() 3C ? 4 ? BFF43AD0 ? 0 ?
sou2o()+113 call opidrv() 3C ? 4 ? BFF43AD0 ?
opimai_real()+212 call sou2o() BFF43AB4 ? 3C ? 4 ?
BFF43AD0 ?
main()+111 call opimai_real() 2 ? BFF43B00 ?
__libc_start_main() call 00000000 2 ? BFF43BC4 ? BFF43BD0 ?
+220 B7FB024D ? 0 ? B73E34E8 ?
--------------------- Binary Stack Dump ---------------------
========== FRAME. [1] (ksedst()+27 -> ksedst1()) ==========
Dump of memory from 0xB72BEE98 to 0xB72BEEA8
B72BEE90 B72BEEA8 082E3A21 [..+.!:..]
B72BEEA0 00000001 00000001 [........]
========== FRAME. [2] (ksedmp()+557 -> ksedst()) ==========
上网去查找ora-07445的内容,其中有一个建议是如下的:
oracle的BUG,在初始化文件里面设置参数
alter system set event = "10262 trace name context forever, level 1024000" scope=spfile;
重启再看看还会不会产生这种错误。
按照这个方法作修改,重启数据库,10多分钟后,问题又来了。
苦于没有metlink帐号(付了钱的,但是过了半年还没有拿到帐号),还是仔细看看跟踪文件的提示吧。
看了一会儿还是看不出什么问题,不过每个文件里面都有以下的语句
Current SQL statement for this session:
select count(*) from GuPolicyCopyMain em,GuPolicyCopyRisk er,GuPolicyCopyEndorHead eh where em.policyNo=er.policyNo and em.endorSeqNo=er.endorSeqNo and er.policyNo=eh.policyNo and er.endorSeqNo=eh.endorSeqNo and em.endorSeqNo!='000' AND (em.policyNo = 'PFNU03004403') AND (er.uwYear = '2007') AND (eh.validInd = '1') AND (eh.inputDate>=to_date('2007-12-06', 'YYYY-MM-DD HH24:MI:SS')) AND (eh.underWriteInd = '6')
引起了注意,好歹看看这句语句吧。执行这句语句,报ORA-03113的错误。
问题根源找到!!!!!!!!!!!
单独查询任何一个表都是没有问题的。
怀疑是约束或索引有问题,先禁用掉这些表的约束试试吧。
禁用掉这些表相关的约束后,再运行这条语句,成功执行。
为了找出哪个表有问题,只得一个个表启用约束,在看是否可以执行,终于在执行启用GuPolicyCopyRisk的
pk_rf91时,约束一启用,执行那条语句就有ORA-03113的错误。
解决办法:
删掉主键pk_rf91,
用drop constraint pk_rf91时,提示有外键引用到这个主键,不让删。
先禁用掉所有的外键吧,结果还是一样的问题。
于是来一招狠的吧,drop constraint pk_rf91 cascade(删之前查一下有哪些外键应用这个约束,并且获得生成他们的语句,这个主键可是有11个外键引用呢!)
再重新生成主键,外键。
按照此方法执行,到目前为止是没有异常了。