Diagnosing TNS listener Hangs [ID 1401376.1]

Applies to:

Oracle Net Services - Version: 9.2.0.1 and later [Release: 9.2 and later]
Information in this document applies to any platform.

Goal

Help diagnose TNS listener hangs and suggest possible alternative setups or workarounds for issues with the TNS listener process.

Solution

TNS listener Hangs

The following key questions help establish the setup, background and history of the problem. The answers can point to alternative setups that may give relief or ease the pressure the problem causes:
  • How many times has the hang been seen?
  • How often does the hang happen: once a day, once a week, etc.?
  • Is there any pattern to the issue, such as date, time or peak load?
  • When was the issue first seen or recorded, and were there any changes to the environment around that time (Oracle, operating system, hardware or network)?
  • What is the 5-digit version of the Oracle binaries? This is important to exclude any known problems and to advise on an alternative setup.
  • Is the issue seen with more than one version of Oracle?
  • Is the issue seen with other TNS listeners, whether on the same system or on other systems and platforms?
Checks that can be done before the key diagnostics are available:
  • Listener log. Check the log file size, and look for information such as the incoming connection rate, TNS errors and TNSping commands leading up to the time of the hang. As a side note, stop any script or job that runs continuous TNSping commands against the listener. These can slow down connections to the listener.
  • Check the operating system error logs for issues seen at the same time as, and leading up to, a recorded hang.
  • Check for operating system jobs or programs that run around the time of the hang, for example jobs that clear up old files or remove long-running processes.
  • Check the listener.ora / sqlnet.ora files for the parameters used, including those set to non-default values. Test a listener on the same system with default settings: is the issue still seen?
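The listener log check above can be partly automated. The following is a minimal sketch, assuming the text (non-XML) listener log, where each entry starts with a "DD-MON-YYYY HH:MI:SS" timestamp and fields are separated by " * ". The function names conn_rate_per_minute and ping_count are hypothetical helpers, not Oracle utilities.

```shell
# Print the number of "establish" (incoming connection) entries per minute,
# in the order the minutes first appear in the log.
# $1: path to the text listener log (assumed format, see lead-in)
conn_rate_per_minute() {
    awk '/ \* establish \* / {
             key = $1 " " substr($2, 1, 5)      # date + HH:MI
             if (!(key in n)) order[++k] = key
             n[key]++
         }
         END { for (i = 1; i <= k; i++) print order[i], n[order[i]] }' "$1"
}

# Count the ping entries (as written by TNSping requests) in the log.
# $1: path to the text listener log
ping_count() {
    grep -c ' \* ping \* ' "$1"
}
```

Running conn_rate_per_minute against the log for the period before a hang gives a quick view of whether the hang follows a connection spike, and a high ping_count suggests a monitoring job that could be stopped.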
Diagnostics Required:

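1. Enable level 16 listener tracing and capture a couple of hangs. Add the following to the LISTENER.ORA file, replacing <listener_name> with the name of the listener:
DIAG_ADR_ENABLED_<listener_name> = OFF # Disable ADR if version 11g
TRACE_LEVEL_<listener_name> = 16 # Enable level 16 trace
TRACE_TIMESTAMP_<listener_name> = ON # Set timestamp in the trace files
TRACE_DIRECTORY_<listener_name> = <directory> # Control trace file location
TRACE_FILELEN_<listener_name> = <size> # Control size of trace files
TRACE_FILENO_<listener_name> = <number> # Number of trace files per process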

The TRACE_FILELEN parameter sets the size of each trace file, in kilobytes.
The TRACE_FILENO parameter sets the number of trace files per process.
If cyclic tracing is set up, ensure the size selected is large enough for the traces to still hold the data leading up to a hang. A reload of the TNS listener is required for tracing to start.

Important Note: Ensure traces are copied off before the listener is restarted, otherwise trace data will be lost.

Tracing needs to be enabled before any issue happens. Gathering multiple traces, if possible, will help confirm that the hang is always in or around the same functions / area. Using cyclic tracing will help limit the size of the trace output. A further option, if you are able to predict when the next hang will happen, is to follow Note:751432.1 Further TNS listener Tracing. This will again help limit the amount of time tracing is enabled.
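As an illustration of the tracing setup described in step 1, the sketch below prints the parameter block for a given listener and estimates the disk space cyclic tracing can consume. The function names are hypothetical, and the sample values used in the usage note (directory, 20480 KB, 10 files) are assumptions for illustration only, not recommendations.

```shell
# Print level 16 trace settings for one listener, ready to add to listener.ora.
# $1: listener name, $2: trace directory, $3: file size in KB, $4: number of files
listener_trace_params() {
    cat <<EOF
DIAG_ADR_ENABLED_$1 = OFF
TRACE_LEVEL_$1 = 16
TRACE_TIMESTAMP_$1 = ON
TRACE_DIRECTORY_$1 = $2
TRACE_FILELEN_$1 = $3
TRACE_FILENO_$1 = $4
EOF
}

# Upper bound on disk used by one process's cyclic trace set, in KB:
# size of each file multiplied by the number of files.
# $1: file size in KB, $2: number of files
trace_disk_kb() {
    echo $(( $1 * $2 ))
}
```

For example, listener_trace_params LISTENER /u01/trace 20480 10 prints the six parameters for a listener named LISTENER, and trace_disk_kb 20480 10 shows that this set can grow to 204800 KB. After the block is added to listener.ora, the listener still has to be reloaded (for example with lsnrctl reload LISTENER) for tracing to start.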

Then at the time of a hang...

2. Run truss (with time delta) and pstack against the hanging TNSLSNR process.
If possible, run the commands a couple of times, a few minutes apart. This helps to confirm whether the problem is in fact a hung process or just a very slow one, i.e. whether the function output changes between the first and second run of the commands.
Remember that if the event takes a long time to reproduce, leaving truss attached to the listener process will produce a large amount of information. Only follow this step for the primary listener if you are able to predict when a hang may occur.
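The "run it twice and compare" check in step 2 can be sketched as follows. This is a minimal sketch, assuming a pstack command is available on the platform; the function names, the PSTACK override variable and the output file names are hypothetical. Identical snapshots are only a hint that the process is hung, not proof.

```shell
# Command used to dump the call stack; overridable for platforms or testing.
PSTACK=${PSTACK:-pstack}

# Exit 0 if two stack snapshot files differ, non-zero if they are identical.
# $1, $2: files holding the two snapshots
stacks_differ() {
    ! cmp -s "$1" "$2"
}

# Take two snapshots of a process some time apart and report whether the
# call stack changed in between.
# $1: PID of the tnslsnr process, $2: seconds to wait, $3: output directory
snapshot_twice() {
    "$PSTACK" "$1" > "$3/pstack.1" 2>&1
    sleep "$2"
    "$PSTACK" "$1" > "$3/pstack.2" 2>&1
    if stacks_differ "$3/pstack.1" "$3/pstack.2"; then
        echo "stack changed: process is probably slow, not hung"
    else
        echo "stack unchanged: process may be hung"
    fi
}
```

A call such as snapshot_twice 12345 180 /tmp/lsnr_diag (PID and directory are placeholders) leaves both snapshots on disk for upload. The truss output would be gathered separately; the exact flag for time deltas differs between platforms, so check the local man page.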

3. Check that listener control (LSNRCTL) can connect and report the status of the TNSLSNR process. Gather the screen output of lsnrctl status and lsnrctl services.
These commands confirm whether the TNSLSNR process is still responding to non-remote (local) connections, and help show the status of the registered database services. Depending on the listener.ora file setup, this may exclude TCP/IP as the cause.

A secondary check here is to ensure that the first address in the description section of the listener.ora file uses the IPC protocol. Example:
LISTENER =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = IPC)(KEY = Key1))
      (ADDRESS = (PROTOCOL = TCP)(HOST = sample.oracle.com)(PORT = 1521))
    )
  )
This ensures IPC is used between LSNRCTL and TNSLSNR. This is important not just to check whether a listener issue is limited to the TCP/IP protocol: on RAC systems it is a requirement to have IPC listed first. See Document:810394.1 RAC and Oracle Clusterware Best Practices and Starter Kit
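The IPC-first check can be scripted. The sketch below is a minimal illustration, assuming a listener.ora that defines a single listener; the function name first_listener_protocol is hypothetical. It reports the protocol of the first ADDRESS entry found and skips comment lines.

```shell
# Print the protocol (upper-cased) of the first PROTOCOL = ... setting
# found in the file, ignoring lines that start with '#'.
# $1: path to listener.ora
first_listener_protocol() {
    awk '/^[ \t]*#/ { next }
         {
             line = toupper($0)
             # match() finds the leftmost PROTOCOL setting on the line
             if (match(line, /PROTOCOL *= *[A-Z]+/)) {
                 s = substr(line, RSTART, RLENGTH)
                 sub(/.*= */, "", s)
                 print s
                 exit
             }
         }' "$1"
}
```

If first_listener_protocol "$ORACLE_HOME/network/admin/listener.ora" prints anything other than IPC, the address order is worth reviewing. The path shown is the usual default location and may differ when TNS_ADMIN is set.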

4. Pmap of the TNSLSNR process. This checks whether the process has a memory leak. Run pmap before the hang and leading up to the time of the hang, to give an overall picture of the size of any leak. Remember that TNSLSNR will increase and decrease in size as connections come in and are released.

5. Operating system error logs / network information
Check the logs for issues leading up to and during the time of the hang. If the issue is possibly an OS one, OS Watcher can assist; see Document:301137.1. Check that third-party tools such as PuTTY are able to connect to the server, to confirm connectivity to the machine is good.
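For the pmap step above, repeated samples are easier to compare when only the total is recorded. The following is a minimal sketch, assuming a pmap whose plain output ends with a line of the form "total <n>K" (output differs between platforms and options); both function names are hypothetical.

```shell
# Read pmap output on stdin and print the total mapped size in KB.
# Assumes a final line such as " total  123456K".
pmap_total_kb() {
    awk '$1 == "total" { v = $2; sub(/K$/, "", v); print v }'
}

# Print a timestamped total for a process at a fixed interval.
# $1: PID of the tnslsnr process, $2: seconds between samples, $3: sample count
sample_listener_memory() (
    i=0
    while [ "$i" -lt "$3" ]; do
        printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$(pmap "$1" | pmap_total_kb)"
        i=$((i + 1))
        sleep "$2"
    done
)
```

Redirecting sample_listener_memory 12345 300 288 (placeholder PID, one sample every five minutes for a day) to a file gives a trend that can be read against the connection load, since some growth and shrinkage is normal.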

Proactive Advice
  • Listener Log
Ensure the listener log file is not large. Older operating systems had issues with 2 GB file sizes, but on newer systems files can grow very large. Oracle Net auditing requires connection information to be written to the end of the text listener log file. Even with ADR from 11g onwards, both the text and XML versions of the listener log are written by default. In version 11g (11.1) the XML log cycles log.xml every 10 MB, but the text version of the listener log file does not have this feature. Turning logging off stops both the XML and text log files from being written. Set up a job to cycle the listener log file periodically. See Document:135063.1
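A log-cycling job of the kind suggested here could look like the sketch below. It assumes the default listener (so lsnrctl needs no listener name) and the text listener log; the function names and the LSNRCTL override variable are hypothetical. Logging is switched off briefly so the file can be renamed, then switched back on.

```shell
# Listener control command; overridable for testing.
LSNRCTL=${LSNRCTL:-lsnrctl}

# Exit 0 if the log file exists and is at least the given size.
# $1: log file, $2: size limit in bytes
log_needs_rotation() {
    [ -f "$1" ] && [ "$(wc -c < "$1" | tr -d ' ')" -ge "$2" ]
}

# Rename the text listener log with a timestamp suffix.
# $1: path to the text listener log of the default listener
rotate_listener_log() (
    stamp=$(date '+%Y%m%d%H%M%S')
    "$LSNRCTL" set log_status off || exit 1   # stop writes to the log
    mv "$1" "$1.$stamp"
    "$LSNRCTL" set log_status on              # resume logging even if mv failed
)
```

Run from cron, a line such as log_needs_rotation "$LOG" 1073741824 && rotate_listener_log "$LOG" would cycle the file once it passes 1 GB. The threshold is an arbitrary example, and connections made while logging is off are not recorded.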
  • Secondary TNS listener and Oracle version
Configure a second TNS listener to run on the system and set up clients to use connect-time failover (CTF) and Transparent Application Failover (TAF). This could ensure that, to a degree, an outage of the primary listener does not affect new and existing connections. It also helps to check whether the problem is related to the TCP/IP port number, for example where port-probing software is checking TCP/IP port numbers or a firewall is involved (Windows systems in particular).
If the Oracle version in use is not the latest version, a secondary listener can be run from a second Oracle home to exclude known problems and / or to test whether the latest version of Oracle is stable. For example, with a primary Oracle home of 11.2.0.1 and a secondary Oracle home of 11.2.0.3, run a TNS listener for the database in the primary Oracle home from the 11.2.0.3 home. Both homes can in fact support a listener, as long as the listener name, IPC key value and TCP port numbers are unique.
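On the client side, failover between the two listeners is configured in tnsnames.ora. The sketch below prints one such entry, assuming both listeners run on the same host on different TCP ports. The function name failover_alias is hypothetical, and the FAILOVER_MODE values shown (TYPE = SELECT, METHOD = BASIC) are one common choice rather than a recommendation from this note.

```shell
# Print a tnsnames.ora entry that tries the primary listener first and
# falls back to the secondary one at connect time, with TAF enabled.
# $1: alias, $2: host, $3: primary port, $4: secondary port, $5: service name
failover_alias() {
    cat <<EOF
$1 =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (FAILOVER = ON)
      (LOAD_BALANCE = OFF)
      (ADDRESS = (PROTOCOL = TCP)(HOST = $2)(PORT = $3))
      (ADDRESS = (PROTOCOL = TCP)(HOST = $2)(PORT = $4))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = $5)
      (FAILOVER_MODE = (TYPE = SELECT)(METHOD = BASIC))
    )
  )
EOF
}
```

For example, failover_alias ORCL_FO sample.oracle.com 1521 1522 orcl prints an entry in which port 1522 stands for the secondary listener. The alias, second port and service name are placeholders.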
  • Script to restart TNS listener
A script could be used to restart the TNS listener if there is no response from a listener control command or a TNSping of the listener. The logic would follow something like:
tnsping listener
if return code indicates success then
    do nothing and tnsping the listener later.
else
    kill process and restart listener
    resume cron job
end.
From 11.2 onwards Oracle Restart could be used. On Windows, the service has an option to restart if it fails.
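The restart logic above can be sketched as a shell function suitable for a cron job. This is a minimal sketch under several assumptions: tnsping exits non-zero when the listener does not answer, pkill is available, and the listener process command line contains "tnslsnr <listener_name>". The function names and the TNSPING, LSNRCTL and PKILL override variables are hypothetical.

```shell
# Commands used; overridable for other platforms or for testing.
TNSPING=${TNSPING:-tnsping}
LSNRCTL=${LSNRCTL:-lsnrctl}
PKILL=${PKILL:-pkill}

# Exit 0 if the listener answers a TNSping.
# $1: net service name or connect descriptor to ping
listener_responds() {
    "$TNSPING" "$1" > /dev/null 2>&1
}

# Ping the listener; if it does not respond, kill and restart it.
# $1: tnsping target, $2: listener name
check_and_restart() {
    if listener_responds "$1"; then
        echo "listener $2 responded; nothing to do"
        return 0
    fi
    echo "listener $2 did not respond; restarting"
    # -f matches against the full command line of the process
    "$PKILL" -9 -f "tnslsnr $2"
    "$LSNRCTL" start "$2"
}
```

Scheduled every few minutes, check_and_restart LISTENER LISTENER does nothing while the listener answers. A hung listener may also leave tnsping itself waiting, so a timeout around the ping is worth considering, and any trace files should be copied off before the restart.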

Source: "ITPUB Blog", link: http://blog.itpub.net/22308399/viewspace-750806/. If reposting, please credit the source; otherwise legal action may be pursued.