Applies to:
Oracle Net Services - Version: 9.2.0.1 and later [Release: 9.2 and later]
Information in this document applies to any platform.
Goal
To help diagnose issues with the TNS listener process and to offer possible alternative setups or workarounds.
Solution
TNS listener Hangs
The following key questions help establish the setup, background and history of the problem. The answers may point to alternative setups that relieve or reduce the problem:
- How many times has the hang been seen?
- How often is the hang happening, once a day, once a week, etc?
- Is there any pattern to the issue, such as date, time, peak load?
- When was the issue first seen or recorded, and were there any changes to the environment around this time (Oracle, operating system, hardware or network)?
- What is the 5 digit version of the Oracle binaries? This is important to exclude any known problems and to advise on an alternative setup.
- Is the issue seen on more than one version of Oracle?
- Is the issue seen on other TNS listeners, be that on the same system or on other systems and platforms?
- Listener log. Check the log file size and look for information such as incoming connection rate, TNS errors and TNS ping commands leading up to the time of the hang. As a side note, stop any script / job which is running continuous TNSping commands against the listener. These can slow down connections to the listener.
- Check operating system error logs for issues seen at the same time and leading up to a recorded hang.
- Check on operating system jobs or programs running, clearing up old files or removing long running processes.
- Check the listener.ora / sqlnet.ora files for the parameters used, including those which do not have default values. Test a listener on the same system with default settings: is the issue still seen?
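As a rough aid for the listener log check in the list above, the following is a minimal sketch that counts incoming connections per minute. It assumes the classic text listener.log format, where each connection line starts with a "DD-MON-YYYY HH:MI:SS" timestamp and contains "* establish *". The function name connections_per_minute is hypothetical and not part of any Oracle tool.

```shell
# Count "establish" (incoming connection) entries per minute in a text listener.log.
# Assumes each entry starts with a 20-character "DD-MON-YYYY HH:MI:SS" timestamp.
connections_per_minute() {
    logfile=$1
    awk '
        / [*] establish [*] / {
            # The first 17 characters are "DD-MON-YYYY HH:MI", i.e. the minute bucket
            minute = substr($0, 1, 17)
            count[minute]++
        }
        END { for (m in count) print m, count[m] }
    ' "$logfile" | sort
}
```

Running connections_per_minute against the log covering the period before a hang shows whether the connection rate spiked. The sort is lexical, so the output is only in time order within a single day.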
1. Enable level 16 listener trace and capture a couple of hangs. Add to the LISTENER.ORA file:
DIAG_ADR_ENABLED_<listener_name> = OFF # Disable ADR if version is 11g
TRACE_LEVEL_<listener_name> = 16 # Enable level 16 trace
TRACE_TIMESTAMP_<listener_name> = ON # Set timestamp in the trace files
TRACE_DIRECTORY_<listener_name> = <directory> # Control trace file location
TRACE_FILELEN_<listener_name> = <size> # Control size of trace files, in KB
TRACE_FILENO_<listener_name> = <number> # Number of trace files per process
The TRACE_FILELEN parameter is for the size of a trace file.
The TRACE_FILENO parameter is the number of traces per process.
If cyclic tracing is set up, ensure the size selected is large enough for the traces to capture data from before a hang.
Important Note: Ensure traces are copied off before the listener is restarted, otherwise trace data will be lost.
A reload of the TNS listener is required for tracing to start.
Tracing needs to be enabled before any issue happens. Gathering multiple traces, if possible, will help confirm whether the hang is always in or around the same functions / area. Using cyclic tracing will help limit the size of the trace output. A further option, if you are able to predict when the next hang will happen, is to follow Note:751432.1 Further TNS listener Tracing. This will again help limit the amount of time tracing is enabled.
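Because trace data is lost if the listener is restarted before the traces are copied off, the copy step can be scripted. The following is a minimal sketch only: the function name save_listener_traces and both directory arguments are hypothetical, and it assumes the trace files carry the usual .trc extension.

```shell
# Copy listener trace files to a timestamped directory before a restart.
# trace_dir and backup_root are paths chosen by the caller.
save_listener_traces() {
    trace_dir=$1
    backup_root=$2
    dest="$backup_root/lsnr_trace_$(date +%Y%m%d_%H%M%S)"
    mkdir -p "$dest" || return 1
    for f in "$trace_dir"/*.trc; do
        [ -e "$f" ] || continue          # no trace files present
        cp -p "$f" "$dest"/ || return 1  # -p keeps the original timestamps
    done
    echo "$dest"                         # report where the copies went
}
```

Call it with the TRACE_DIRECTORY location as the first argument, before the listener is stopped or restarted.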
Then at the time of a hang...
2. Truss with time delta and Pstack the hanging TNSLSNR process.
If possible, run the commands a couple of times, a few minutes apart. This helps to confirm whether the problem is in fact a hung process or just a very slow one, i.e. whether the function output changes between the 1st and 2nd run of the commands.
Remember that if the event takes a long time to reproduce, then having truss on the listener process is going to produce a large amount of information. Only follow this step for the primary listener if you are able to predict when a hang may occur.
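The repeated Pstack runs described above could be scripted along the following lines. This is a sketch under assumptions: the function name capture_stacks and the STACK_CMD override are hypothetical, and pstack is assumed to be installed and to accept a process id as its only argument. Truss is left out because its options differ between platforms.

```shell
# Take several stack snapshots of one process, a fixed interval apart,
# then report whether the first and last snapshots differ.
capture_stacks() {
    pid=$1; count=$2; interval=$3; outdir=$4
    mkdir -p "$outdir" || return 1
    i=1
    while [ "$i" -le "$count" ]; do
        # STACK_CMD defaults to pstack; override it where pstack is unavailable
        ${STACK_CMD:-pstack} "$pid" > "$outdir/stack_$i.out" 2>&1
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
        i=$((i + 1))
    done
    # Identical snapshots suggest a true hang rather than a slow process
    if cmp -s "$outdir/stack_1.out" "$outdir/stack_$count.out"; then
        echo "unchanged"
    else
        echo "changed"
    fi
}
```

For example, capture_stacks <pid> 3 120 /tmp/lsnr_stacks takes three snapshots two minutes apart. An "unchanged" result is only a hint and the snapshot files should still be reviewed.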
3. Check that listener control (LSNRCTL) can connect and check the status of the TNSLSNR process. Gather the screen output of lsnrctl status / services.
This confirms whether the TNSLSNR process is still responding to non-remote connections and helps show the status of the registered database services. It possibly excludes TCP/IP as an issue, depending on the listener.ora file setup.
A secondary check here is to ensure that the first address in the description section of the listener.ora file uses the IPC protocol. Example:
LISTENER =
(DESCRIPTION_LIST =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = IPC)(KEY = Key1))
(ADDRESS = (PROTOCOL = TCP)(HOST = sample.oracle.com)(PORT = 1521))
)
)
This will ensure IPC is used between LSNRCTL and TNSLSNR. This is important not just to check whether an issue with the listener is possibly limited to the TCP/IP protocol: for RAC systems it is a requirement to have IPC listed first. See
Document:810394.1 RAC and Oracle Clusterware Best Practices and Starter Kit
4. Pmap of the TNSLSNR process. This checks whether the process has a memory leak. Take Pmap output before the hang and leading up to the time of the hang, to give an overall picture of the size of any leak. Remember that TNSLSNR will increase and decrease in size as connections come in and are released.
5. Operating System error logs / Network information
Check the logs for issues leading up to and during the time of the hang. If the issue is possibly an OS one, OS Watcher can assist. See Document:301137.1
Check that third party tools, like PuTTY, are able to connect to the server, to confirm connectivity to the machine is good.
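The Pmap sampling described above for the TNSLSNR process could be collected from a scheduled job with something like the following sketch. It assumes that the last line of pmap output is the "total" line, which should be verified on the platform in use. The function name sample_pmap_total and the PMAP_CMD override are hypothetical.

```shell
# Append one timestamped memory-total sample for a process to a log file.
sample_pmap_total() {
    pid=$1
    logfile=$2
    # Assumption: the final line of pmap output holds the total size
    total=$(${PMAP_CMD:-pmap} "$pid" | tail -n 1)
    [ -n "$total" ] || return 1          # process gone or pmap failed
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$total" >> "$logfile"
}
```

Running this every few minutes gives a series of samples. A total that only ever grows points to a leak, while one that rises and falls is the normal pattern as connections come and go.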
Proactive Advice
- Listener Log
The XML log (log.xml) is cycled every 10 MB, but the text version of the listener log file does not have this feature. Turning logging off stops both the XML and text log files from being written. Set up a job to cycle the listener log file every so often. See Document:135063.1
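One possible shape for such a log cycling job is sketched below. It is not taken from the referenced document. It assumes the LSNRCTL commands "set current_listener" and "set log_status" are available, and the function names and the LSNRCTL override variable are hypothetical. Logging is switched off only for the moment the file is renamed.

```shell
# Send one command to LSNRCTL for the named listener.
lsnr_cmd() {
    ${LSNRCTL:-lsnrctl} <<EOF
set current_listener $1
$2
EOF
}

# Rename the text listener log with a timestamp suffix while logging is paused.
rotate_listener_log() {
    listener=$1
    logfile=$2
    stamp=$(date +%Y%m%d_%H%M%S)
    lsnr_cmd "$listener" "set log_status off" >/dev/null
    mv "$logfile" "$logfile.$stamp"
    rc=$?
    # Always switch logging back on, even if the rename failed
    lsnr_cmd "$listener" "set log_status on" >/dev/null
    [ "$rc" -eq 0 ] && echo "$logfile.$stamp"
}
```

The listener creates a new log file once logging is switched back on. Old rotated files can then be compressed or purged by the same job.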
- Secondary TNS listener and Oracle version
If the Oracle version in use is not the latest version, then to exclude known problems and / or test whether the latest version of Oracle is stable, it is possible to run a secondary listener from a second Oracle home, e.g. primary Oracle home 11.2.0.1, secondary Oracle home 11.2.0.3. Run a TNS listener for the database in the primary Oracle home from the 11.2.0.3 home. Both homes can in fact support a listener, as long as the listener name, IPC key value and TCP port numbers are unique.
- Script to restart TNS listener.
tnsping listener
if return code is 0 then
do nothing and tnsping the listener later.
else
kill process and restart listener
resume cron job
end.
From 11.2 onwards Oracle Restart could be used. The Windows service has an option to restart if it fails.
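The restart pseudocode above might be turned into a cron-driven shell function along these lines. This is a sketch under stated assumptions: tnsping is assumed to exit non-zero when the listener does not answer, as the pseudocode implies, and the process command line is assumed to look like ".../tnslsnr <listener_name> -inherit". The function name watch_listener and the TNSPING, PGREP and LSNRCTL override variables are hypothetical.

```shell
# Ping the listener; if it does not answer, kill it and start it again.
# net_alias is a net service name that resolves to the listener address.
watch_listener() {
    net_alias=$1
    listener_name=$2
    if ${TNSPING:-tnsping} "$net_alias" >/dev/null 2>&1; then
        echo "ok"                # listener answered; wait for the next cron run
        return 0
    fi
    # Find the hung process by its command line and kill it
    pids=$(${PGREP:-pgrep} -f "tnslsnr $listener_name " 2>/dev/null)
    for pid in $pids; do
        kill -9 "$pid"
    done
    ${LSNRCTL:-lsnrctl} start "$listener_name" >/dev/null 2>&1 || return 1
    echo "restarted"
}
```

Scheduling the check from cron every few minutes takes the place of the "tnsping the listener later" and "resume cron job" steps. Collect any diagnostics needed before enabling a job like this, since the kill removes the hung process that would otherwise be examined.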
Source: "ITPUB Blog", link: http://blog.itpub.net/22308399/viewspace-750806/. If reposting, please credit the source; otherwise legal liability will be pursued.