oracle+监听经常不正常,Oracle监听不定期出现异常故障处理

环境:

操作系统为:SunOS 5.10

数据库版本:Oracle RAC 11.2.0.3.0

该主机上有2个库,一套RAC的节点1在上面,还有另外一个库,之所以使用的是DB下的监听而没用GRID下的监听,是为了避免停掉CRS时影响另外一个库的使用。

故障现象:

数据库监听不定期出现异常 ,从应用tnsping数据库,时间花费很长甚至连不上

故障分析处理过程:

从应用主机tnsping如下:

racdb1_scenemon$tnsping racdb_new

TNS Ping Utility for Solaris: Version 11.2.0.3.0 - Production on 19-2月 -2014 15:46:23

Copyright (c) 1997, 2011, Oracle. All rights reserved.

已使用的参数文件:

已使用 TNSNAMES 适配器来解析别名

尝试连接 (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = 11.111.11.1)(PORT = 1521)) (ADDRESS = (PROTOCOL = TCP)(HOST = 11.111.11.2)(PORT = 1521)) (LOAD_BALANCE = yes) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = racdb) (FAILOVER_MODE = (TYPE = SELECT) (METHOD = BASIC) (RETRIES = 180) (DELAY = 5))))

OK (60000 毫秒)

从数据库服务器上查看监听状态:

$ lsnrctl status

LSNRCTL for Solaris: Version 11.2.0.3.0 - Production on 19-FEB-2014 15:45:58

Copyright (c) 1991, 2011, Oracle. All rights reserved.

Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))

TNS-12535: TNS:operation timed out

TNS-12560: TNS:protocol adapter error

TNS-00505: Operation timed out

将监听进程予以kill并重启监听:

$ ps -ef |grep tns

oracle 18504 18468 0 15:45:15 pts/10 0:00 grep tns

oracle 24037 1 0 Nov 28 ? 6:08 /oracle/app/asm/11.2.0/grid/bin/tnslsnr LISTENER -inherit

oracle 9488 1 3 Jan 24 ? 4447:11 /oracle/app/db/product/11.2.0/db/bin/tnslsnr LISTENER -inherit

$

$ kill -9 9488

$ lsnrctl start

LSNRCTL for Solaris: Version 11.2.0.3.0 - Production on 19-FEB-2014 15:48:22

Copyright (c) 1991, 2011, Oracle. All rights reserved.

Starting /oracle/app/db/product/11.2.0/db/bin/tnslsnr: please wait...

TNSLSNR for Solaris: Version 11.2.0.3.0 - Production

Log messages written to /oracle/app/db/product/11.2.0/db/log/diag/tnslsnr/racdb/listener/alert/log.xml

Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=racdb)(PORT=1521)))

Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))

STATUS of the LISTENER

------------------------

Alias LISTENER

Version TNSLSNR for Solaris: Version 11.2.0.3.0 - Production

Start Date 19-FEB-2014 15:48:22

Uptime 0 days 0 hr. 0 min. 0 sec

Trace Level off

Security ON: Local OS Authentication

SNMP OFF

Listener Log File /oracle/app/db/product/11.2.0/db/log/diag/tnslsnr/racdb/listener/alert/log.xml

Listening Endpoints Summary...

(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=racdb)(PORT=1521)))

The listener supports no services

The command completed successfully

检查listener.log如下:

17-FEB-2014 16:02:26 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=racdb)(FAILOVER_MODE=(TYPE=SELECT)(METHOD=BASIC)(RETRIES=180)(DELAY=5))(CID=(PROGRAM=sqlplus@racdb1)(HOST=racdb1)(USER=tmn))) * (ADDRESS=(PROTOCOL=tcp)(HOST=11.111.11.89)(PORT=41603)) * establish * racdb * 0

17-FEB-2014 16:02:26 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=racdb)(FAILOVER_MODE=(TYPE=SELECT)(METHOD=BASIC)(RETRIES=180)(DELAY=5))(CID=(PROGRAM=sqlplus@racdb1)(HOST=racdb1)(USER=tmn))) * (ADDRESS=(PROTOCOL=tcp)(HOST=11.111.11.89)(PORT=41659)) * establish * racdb * 12518

TNS-12518: TNS:listener could not hand off client connection

TNS-12560: TNS:protocol adapter error

TNS-00530: Protocol adapter error

Solaris Error: 24: Too many open files <<<<<<<<<

17-FEB-2014 16:02:26 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=racdb)(FAILOVER_MODE=(TYPE=SELECT)(METHOD=BASIC)(RETRIES=180)(DELAY=5))(CID=(PROGRAM=sqlplus@racdb1)(HOST=racdb1)(USER=tmn))) * (ADDRESS=(PROTOCOL=tcp)(HOST=11.111.11.89)(PORT=41652)) * establish * racdb * 0

17-FEB-2014 16:02:26 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=racdb)(FAILOVER_MODE=(TYPE=SELECT)(METHOD=BASIC)(RETRIES=180)(DELAY=5))(CID=(PROGRAM=sqlplus@racdb1)(HOST=racdb1)(USER=tmn))) * (ADDRESS=(PROTOCOL=tcp)(HOST=11.111.11.89)(PORT=41669)) * establish * racdb * 0

17-FEB-2014 16:02:26 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=racdb)(FAILOVER_MODE=(TYPE=SELECT)(METHOD=BASIC)(RETRIES=180)(DELAY=5))(CID=(PROGRAM=sqlplus@racdb1)(HOST=racdb1)(USER=tmn))) * (ADDRESS=(PROTOCOL=tcp)(HOST=11.111.11.89)(PORT=41682)) * establish * racdb * 12518

TNS-12518: TNS:listener could not hand off client connection

TNS-12560: TNS:protocol adapter error

TNS-00530: Protocol adapter error

Solaris Error: 24: Too many open files <<<<<<<<<<

Too many open files意味着Maximum Number Of Open Files Per Process 达到了上限。因此listener hang住的原因可能是该limit设置过小,进一步查看硬件限制如下:

$ ulimit -Ha

time(seconds) unlimited

file(blocks) unlimited

data(kbytes) unlimited

stack(kbytes) unlimited

coredump(blocks) unlimited

nofiles(descriptors) 65536

memory(kbytes) unlimited

查看软件限制如下:

$ ulimit -Sa

time(seconds) unlimited

file(blocks) unlimited

data(kbytes) unlimited

stack(kbytes) 8192

coredump(blocks) unlimited

nofiles(descriptors) 256

memory(kbytes) unlimited

查看/etc/system如下:

*ident  "@(#)system    1.18    97/06/27 SMI" /* SVR4 1.5 */

*

* SYSTEM SPECIFICATION FILE

*

* moddir:

*

*      Set the search path for modules.  This has a format similar to the

*      csh path variable. If the module isn't found in the first directory

*      it tries the second and so on. The default is /kernel /usr/kernel

*

*      Example:

*              moddir: /kernel /usr/kernel /other/modules

* root device and root filesystem configuration:

*

*      The following may be used to override the defaults provided by

*      the boot program:

*

*      rootfs:        Set the filesystem type of the root.

*

*      rootdev:        Set the root device.  This should be a fully

*                      expanded physical pathname.  The default is the

*                      physical pathname of the device where the boot

*                      program resides.  The physical pathname is

*                      highly platform and configuration dependent.

*

*      Example:

*              rootfs:ufs

*              rootdev:/sbus@1,f8000000/esp@0,800000/sd@3,0:a

*

*      (Swap device configuration should be specified in /etc/vfstab.)

* exclude:

*

*      Modules appearing in the moddir path which are NOT to be loaded,

*      even if referenced. Note that `exclude' accepts either a module name,

*      or a filename which includes the directory.

*

*      Examples:

*              exclude: win

*              exclude: sys/shmsys

* forceload:

*

*      Cause these modules to be loaded at boot time, (just before mounting

*      the root filesystem) rather than at first reference. Note that

*      forceload expects a filename which includes the directory. Also

*      note that loading a module does not necessarily imply that it will

*      be installed.

*

*      Example:

*              forceload: drv/foo

* set:

*

*      Set an integer variable in the kernel or a module to a new value.

*      This facility should be used with caution.  See system(4).

*

*      Examples:

*

*      To set variables in 'unix':

*

*              set nautopush=32

*              set maxusers=40

*

*      To set a variable named 'debug' in the module named 'test_module'

*

*              set test_module:debug = 0x13

* Begin FJSVssf (do not edit)

set ftrace_atboot = 1

set kmem_flags = 0x100

set kmem_lite_maxalign = 8192

* End FJSVssf (do not edit)

forceload: drv/fjpfca

* Begin MDD root info (do not edit)

rootdev:/pseudo/md@0:0,0,blk

* End MDD root info (do not edit)

*** Begin EMCpower added lines *** DO NOT EDIT BELOW THIS LINE ***

forceload: drv/emcpsf

forceload: drv/sd

forceload: drv/ssd

forceload: drv/emcp

forceload: misc/emcpgpx

forceload: misc/emcpmpx

forceload: misc/emcpvlumd

forceload: misc/emcpxcrypt

forceload: misc/emcpdm

forceload: misc/emcpioc

set emcp:bPxEnableInit=1

*** End EMCpower added lines *** DO NOT EDIT ABOVE THIS LINE ***

exec_user_stack = 1

set noexec_user_stack_log = 1

查看/etc/project如下:

system:0::::

user.root:1::::

noproject:2::::

default:3::::

group.staff:10::::

user.oracle:100::oracle::

综上所述,从listener.log的输出看,nofiles的soft limit过小,只有256,这会导致Solaris Error: 24: Too many open files 错误。从/etc/system和/etc/project,我们没有看到设置了soft limit。

故障处理小结及后续建议

将oracle用户的soft limit提升为至少1024,然后重新oracle用户登录,检验ulimit合格后,重新启动数据库和监听。

具体解决办法如下:

1、在/etc/system增加以下行

set rlim_fd_max=65536

set rlim_fd_cur=4096

2、重新登录ORACLE并检验oracle用户的限制

su – oracle

ulimit -Ha

ulimit –Sa

3、重新启动数据库和监听

0b1331709591d260c1c78e86d0c51c18.png

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值