LTOM - The On-Board Monitor User Guide (Doc ID 352363.1)

LTOM

 

The On-Board Monitor User's Guide

Embedded Real-Time Data Collection and Diagnostics Platform

Carl Davis
Center of Expertise
December 9, 2010


Best Practices

Pro-Active Problem Avoidance and Diagnostic Collection

Although some problems may be unforeseen, in many cases problems can be avoided if the warning signs are detected early enough. Additionally, if an issue does occur, information about it must be collected while it is happening; collecting it after the event is of little use. LTOM is one of the tools that Oracle Support recommends for collecting such diagnostics. For information on suggested uses and on other proactive preparations and diagnostics, see:

Document 1482811.1 Best Practices: Proactively Avoiding Database and Query Performance Issues
Document 1477599.1 Best Practices Around Data Collection For Performance Issues

LTOM now provides a graphing utility for the data collected, which greatly reduces the need to manually inspect all the output files. See the "LTOMg System Profiler" section below. Click here to see an example of the new HTML system profile output.

Contents

 

Introduction

The Lite Onboard Monitor (LTOM) is a Java program designed as a real-time diagnostic platform for deployment to a customer site. LTOM differs from other support tools in that it is proactive rather than reactive: it provides real-time automatic problem detection and data collection. LTOM runs on the customer's UNIX server, is tightly integrated with the host operating system, and provides an integrated solution for detecting system performance issues and collecting the corresponding trace files. The ability to detect problems and collect data in real time should reduce both the time it takes to solve problems and customer downtime.

Back to Contents

Overview

Historically, one of the major obstacles to diagnosing database and system performance problems has been getting the necessary diagnostic data collected while the problem is actually occurring. The data is seldom captured because of the time it takes to recognize that there is a problem, determine what kind of data to collect, and work out how to collect it. Frequently, the problem has passed or the database has to be shut down to correct it, forcing the customer to wait for the next occurrence and hope the data can be collected quickly enough. LTOM performs automatic problem detection and collects the necessary diagnostic traces in real time, while the database or system performance problem is occurring. LTOM provides services for:

  • Automatic Hang Detection

  • System Profiler

  • Automatic Session Tracing

Back to Contents

New Features

Version 4.3.1 of LTOM contains the following new features:

  • System Profiler has been enhanced to collect additional metrics for parallel query slaves and blocking sessions.

  • LTOMg has new functionality to format System Profiler traces for easier readability.

Back to Contents

Support for RAC

LTOM can be configured for use in a RAC environment. See the instructions in the $TOM_HOME/init/hangDetect.properties file for details. To use automatic hang detection, LTOM needs to be installed on only one node of the RAC cluster.

To use all the other features of LTOM, such as the System Profiler or Session Recorder, LTOM must be installed on each node of the RAC cluster. For shared-disk environments, install LTOM to a unique location for each node of the cluster. It is also recommended that OSWatcher (see Document 301137.1) be installed on each node of a RAC cluster.

Back to Contents

Automatic Hang Detection

This feature should only be used at the direction of Oracle Support or by experienced DBAs. Automated collection of heavy tracing on a production system can have a significant performance impact on that system. The user needs to be aware of the consequences of generating this level of tracing and should proceed with caution.

Automatic Hang Detection uses a rule-based hang detection algorithm. LTOM has a default built-in set of rules that should be sufficient in most situations, but it also provides the ability to modify the rules or add new ones as needed. These rules are based on database wait events; LTOM considers only non-idle wait events in its hang detection algorithm. To provide more granularity, a set of rules can be configured to match specific kinds of hangs. For example, if hangs are occurring because of latch free waits that build up very quickly, hanging the system only for a short duration (several minutes), and the default trigger value for latch free is set too high, a rule can be defined for latch free that triggers at 15 seconds. Any session waiting on latch free for longer than 15 seconds would then trigger the collection of diagnostic hang traces.
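
As an illustration only, such a rule might resemble the rule format used by the session recorder (shown in Appendix C); the exact syntax and units for hang detection rules are documented in the comments of $TOM_HOME/init/hangDetect.properties and may differ from this hypothetical line:

EVENT=latch free, VALUE=15

Here the hypothetical VALUE of 15 stands for the latch free wait threshold described above; always confirm the format and units against the properties file itself before editing it.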

When this mode is enabled, automatic hang detection proceeds silently in the background, periodically checking for hangs. Once any session has been identified as hung, diagnostic traces are automatically generated. The type and number of diagnostic traces collected are determined by what has been defined in the rules file, $TOM_HOME/init/hangDetect.properties. The default collection is as follows:

  • HangAnalyze Level 3

  • Systemstate Level 266

  • Wait 60 seconds

  • HangAnalyze Level 3

  • Systemstate Level 266

To modify this collection, edit the $TOM_HOME/init/hangDetect.properties file.

The advantage of using automatic hang detection is that if the database hangs at 2:00 in the morning and no one is around, the necessary diagnostic traces will still be collected and a hang report generated. Email notification can be configured to alert the user to the hang; to set it up, edit the $TOM_HOME/src/ltommail.sh file, or simply allow the auto installer to do this for you at installation time. To prevent traces from constantly being generated once a hang is detected, only one set of diagnostic traces is collected, and no further hangs will be detected until the mode has been turned off and re-enabled. LTOM can also automatically adjust the level of tracing based on the impact that collecting additional diagnostic traces has on the system.
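
The shipped $TOM_HOME/src/ltommail.sh may use different variable names and a different mail command, but as a minimal sketch a notification script of this kind typically amounts to little more than the following (the recipient address and the report-file argument are placeholders):

#!/bin/sh
# Hypothetical sketch of a hang notification script.
# $1 is assumed to be the hang report file produced by LTOM.
RECIPIENT="dba-oncall@example.com"
mailx -s "LTOM: hang detected on `hostname`" "$RECIPIENT" < "$1"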

For more information see the Automatic Hang Detection FAQ below and also the rule definition file $TOM_HOME/init/hangDetect.properties.

Back to Contents

System Profiler

One of the problems with relying solely on statspack is the inability to look at performance from a holistic point of view: information about non-Oracle processes and about the health of the operating system, in terms of memory, CPU and I/O for example, is not collected. Further, all static data collectors are problematic in that single-sample snapshots, or multiple snapshots taken at 15 or 30 minute intervals, can miss problems that occur briefly within a snapshot interval and are averaged out over the duration of the snapshot. The System Profiler provides the ability to continually collect data from both the operating system and Oracle, and gives an integrated snapshot of the overall health of the operating system together with the database. This data collection contains the output from operating system utilities (top, vmstat and iostat) along with Oracle session data (v$session, v$process, v$session_wait, v$system_event and v$sysstat). The recording frequency and subsets of the available data can also be configured when running the tool.
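
For reference, the operating-system portion of this collection corresponds to what the standard utilities report when run by hand at a matching interval, for example (exact flags vary by platform):

vmstat 5
iostat -x 5
top

The System Profiler combines this output with the Oracle V$ queries into a single time-stamped snapshot per interval, as shown in Appendix D.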

Once the data is collected, it can be parsed and analyzed with LTOMg. This tool provides a graphical interface that lets you quickly drill down into any performance problem. Click here to see an example of the new HTML system profile output.

The following parameters can be configured to control the frequency and selectivity of data to be collected.

  • Update Freq - interval between snapshots

  • Display Top - select to record OS top processes

  • Display Vmstat - select to record vmstat information

  • Display Iostat - select to record iostat information

  • Display Sessions - select to record Oracle processes

  • Display CPU Stats - select to record CPU statistics from OS

  • Display Current SQL Executing - select to record current SQL executing

For more information see the System Profiler FAQ below.

Also Appendix D contains an example of the raw System Profiler data collection.

Back to Contents

Automatic Session Tracing

This feature should only be used at the direction of Oracle Support or by experienced DBAs. Automated collection of heavy tracing on a production system can have a significant performance impact on that system. The user needs to be aware of the consequences of generating this level of tracing and should proceed with caution.

One of the most important diagnostic traces is the Oracle extended SQL trace, commonly known as SQL trace. Obtaining a SQL trace file from Oracle database sessions can be problematic, especially if you do not know which session you need to trace. Likewise, turning on SQL trace for the entire database just to capture the trace of a few problematic sessions can be prohibitively expensive for some customers. Automatic Session Tracing uses a set of rules to determine when to turn on SQL trace for individual Oracle sessions, using event 10046 level 12 trace. Rules can be defined for database wait events, CPU and specific users. For rules based on wait events, the automatic session recorder monitors certain V$ views at specified intervals and computes the average wait time between intervals for each event. This computed average wait time is compared to the rule definition for that event, if any; if a rule has been defined for the event and the average wait time exceeds its threshold, LTOM turns on tracing for that session. For rules based on CPU, the automatic session recorder computes the amount of CPU used by the session between intervals and compares it to the rule. For rules based on specific users, the automatic session recorder traces any session owned by that user. Sessions can be traced to a circular memory buffer or to a file.
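
As an illustrative sketch only (the exact V$ views and columns LTOM samples are not documented here), a per-event average wait between two consecutive samples can be derived from cumulative statistics such as those in v$session_event:

average wait (cs) = (time_waited at sample N - time_waited at sample N-1) / (total_waits at sample N - total_waits at sample N-1)

A value of this kind is then compared against the threshold defined for that event in sessionRecorder.properties.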

The advantage of tracing to a memory buffer is that the process is not constantly writing to disk, as I/O is one of the most expensive operations a computer performs. Memory tracing is also advantageous because only the last few seconds of trace, those closest to the performance problem, are written out, avoiding the collection of gigabytes of trace data just to capture the final few seconds. The rule definition for in-memory tracing uses two thresholds: the minimum threshold turns on tracing for the session in memory, and the maximum threshold forces the memory buffer to be written to disk, to that session's respective trace file. This allows a session to be continuously traced and dumped to disk only when something significant occurs. The user can also manually force the memory buffer to be written to disk at any time. The user specifies the amount of memory to dedicate to each session when starting LTOM, along with an option to limit the number of sessions LTOM can trace.

When tracing directly to a file, automatic session tracing simply turns on tracing automatically for any session that violates the rule definitions. When exiting automatic session tracing, all sessions currently being traced will have their tracing turned off. As a fail-safe, LTOM creates a SQL script, $TOM_HOME/recordings/session/stopsessions.sql, which turns off any tracing turned on by LTOM. The user would run this script manually if required.
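
The contents of stopsessions.sql are generated by LTOM at run time, but a hand-written equivalent for a single session would look something like the following (the SID and SERIAL# values are placeholders for sessions LTOM was tracing):

-- Hypothetical sketch: disable event 10046 tracing for one session from SQL*Plus
EXECUTE dbms_system.set_ev(123, 4567, 10046, 0, '');

On 10g and later, dbms_monitor.session_trace_disable(session_id => 123, serial_num => 4567) achieves the same result.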

For more information see the Automatic Session Recorder FAQ below and also the rule definition file  $TOM_HOME/init/sessionRecorder.properties.

Also Appendix C contains an example of how to set up Automatic Session Tracing.

Back to Contents

Supported Platforms

  • Solaris

  • Linux

  • HP-UX

  • AIX

  • Tru64

Back to Contents

Download LTOM

Current LTOM Version: 4.3.1 (December 2010)

Click here to download the file.

If a file download dialog box does not appear when clicking on the above link, you may need to clear your web browser's cache and/or restart your web browser. If you are still unable to download the file, you may request that we email you a copy: Carl.Davis@oracle.com

Back to Contents

Installing LTOM

LTOM is available for download through MetaLink as a compressed tar file. Copy the tar file to the directory where LTOM is to be installed and issue the following commands.

uncompress ltom.tar.Z
tar xvfp ltom.tar
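
If the uncompress utility is not available on the server, gunzip can decompress the same .Z file:

gunzip ltom.tar.Z
tar xvfp ltom.tar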

A directory named tom_base is created which houses all the files associated with LTOM. A README file is located in this directory with full instructions on how to install the tool.

Back to Contents

Uninstalling LTOM

To uninstall LTOM, issue the following command against the tom_base directory:

rm -rf tom_base

Back to Contents

Running LTOM

The user of LTOM must be a member of the UNIX dba group, as some components of LTOM use OS authentication. In addition, LTOM will prompt for a database username/password; this is required for LTOM to make JDBC connections to the database. This user must be a database user with full DBA privileges.
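
If a dedicated account is preferred over an existing DBA account, one could be created along these lines (the user name and password here are placeholders, not anything LTOM itself requires):

CREATE USER ltom_user IDENTIFIED BY a_strong_password;
GRANT DBA TO ltom_user;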

Before running LTOM certain environment variables need to be defined. Please see the README for further details.
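
The exact list of required variables is given in the README; as a hypothetical sketch, a typical environment for a Java-based tool connecting to a local database would include settings such as (all paths below are examples only):

export ORACLE_HOME=/u01/app/oracle/product/10.2.0/db_1
export ORACLE_SID=ORCL
export JAVA_HOME=/usr/java
export TOM_HOME=/u02/home/TOM/tom_base/tom
export PATH=$JAVA_HOME/bin:$ORACLE_HOME/bin:$PATH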

To run LTOM standalone go to the directory tom_base/tom ($TOM_HOME) and issue the following command...

./startltom.sh

This will bring up the command-line version of LTOM. Users are first prompted to enter a database username and password; the username must be a database user with DBA privileges. Once logged in, users can manually enter commands to turn the automatic hang detection and data recording functions on and off.

kernaltom:/u02/home/TOM>./startltom.sh
Enter 1 to Start Auto Hang Detection
Enter 2 to Stop Auto Hang Detection
Enter 3 to Start System Profiling
Enter 4 to Stop System Profiling
Enter 7 to Start Session Tracing
Enter 71 to Display Sessions Traced
Enter 72 to Dump All Trace Buffers
Enter 73 to Dump Specific Trace Buffer
Enter 8 to Stop Session Tracing

Enter S to Update status
Enter Q to End Program
CURRENT STATUS: HangDetection=OFF ManRec=OFF SessionRec=OFF
Please Select an Option:

LTOM can also be started as a background task; see the instructions in the README. There can be no interaction with LTOM once it is started in this mode. To terminate LTOM, follow the instructions in the README.

kernaltom:/u02/home/TOM>nohup ./startltom.sh -s &

Back to Contents

LTOMg System Profiler

A new utility, LTOMg, has been added to LTOM. This utility provides the ability to graph the data collected by LTOM. See the LTOMg User Guide for more information. To see a sample of the LTOMg System Profiler output, click here.

Sample Graph

Back to Contents

Reporting Feedback

If you encounter problems running LTOM or would like to provide feedback, please send email to Carl.Davis@oracle.com.

Back to Contents

Appendix A: LTOM Directory Structure

 

 


 

The tom_base directory is the root directory created when downloading and untarring the ltom.tar file. The tom_base directory contains two subdirectories:

  • TOM_HOME - root directory for all LTOM subdirectories.

  • Install - directory containing the installer

The TOM_HOME directory contains the following 6 subdirectories:

  • The hanglog directory contains the logs created from running Automatic Hang Detection.

  • The init directory contains the following initialization files...

    • tom_deploy.properties - this file contains initialization parameters for LTOM and is mandatory for startup of the tool.

    • hangDetect.properties - this file contains the rule definitions for Automatic Hang Detection. This file should not be edited unless directed by a support analyst.

    • sessionRecorder.properties - this file contains the rule definitions for Automatic Session Tracing. This file should not be edited unless directed by a support analyst.

  • The ltomg directory contains 3 subdirectories

    • src - directory for ltomg source files.

    • gif - default directory for ltomg gif files.

    • profile - default directory for ltomg html profiles.

  • The recordings directory contains 4 subdirectories

    • event - directory containing the event rule violations for Automatic Data Recorder (desupported)

    • profile - directory containing the files from the System Profiler

    • smart - directory containing the files created from running the default event toolkits for the Automatic Data Recorder (desupported)

    • session - directory containing the log from Automatic Session Tracing

  • The src directory contains LTOM external source files. This directory also contains the LTOM executable.

  • The tmp directory contains temporary files used by LTOM.

Back to Contents

Appendix B: LTOM Rules of Engagement/FAQ

System Profiler

When to use?

The system profiler should be used whenever a comprehensive view of a performance problem is required. It should be considered for any performance problem that requires analysis down to the level of seconds, and whenever statspack snapshots do not provide the granularity necessary to resolve the issue. The system profiler is also useful for framing performance issues where the bottleneck may be outside Oracle.

Benefits?

  • Collect data up to just seconds prior to hang or crash

  • Collect OS data in addition to Oracle performance data

  • Collect statistical data down to 1 second increments

  • Displays SQL currently executing

  • RCA (root cause analysis) timeline

How to use?

  • Install LTOM

  • cd $TOM_HOME

  • ./startltom.sh

  • Select option 3, then follow the prompts

Where is the output?

The system profiler produces a single log file for each recording. An additional io file may be created if profiling with iostat. These files are located in the $TOM_HOME/recordings/profile directory.

Gotchas?

The system profiler produces a single file each time it is turned on. If left on for days this file could become quite large. It is recommended that the recorder be reset on a daily basis if extended recording is required.

Back to Contents

Automatic Hang Detection

When to use?

This feature should only be used at the direction of Oracle Support or by experienced DBAs. Automated collection of heavy tracing on a production system can have a significant performance impact on that system. The user needs to be aware of the consequences of generating this level of tracing and should proceed with caution.

Automatic hang detection should be considered for any Service Requests (TARs) involving hangs or slowdowns where the information collected at the initial outage is insufficient to diagnose the problem. If a hang occurs at 2:00 in the morning and no one is around, LTOM will still collect the required trace files.

Benefits?

  • Collect systemstates and hanganalyze files during the actual hang without operator intervention

  • Hang data collection 24x7

  • Hangs automatically detected

  • Email notification of hang

How to use?

  • Install LTOM

  • Edit the file $TOM_HOME/init/hangDetect.properties if you want to customize the rules

  • cd $TOM_HOME

  • ./startltom.sh

  • Select option 1, then follow the prompts

Where is the output?

Automatic hang detection produces several files for each hang.  These files are as follows:

  • $TOM_HOME/hanglog/hang*.log - file containing the systemstate analyzer output and hanganalyze summary output.

  • $TOM_HOME/hanglog/hang*.report - file giving details about what caused the hang and recording the actions LTOM took once the hang was detected.

  • Systemstate dumps and hanganalyze files are written to the udump directory.

  • Email notification if this was configured (see README).

Gotchas?

Once a hang is detected, automatic hang detection produces one set of files. The program needs to be reset before the next set of files is collected. This is to prevent the continuous, indefinite collection of systemstate/hanganalyze files.

Back to Contents

Automatic Session Tracing

When to use?

This feature should only be used at the direction of Oracle Support or by experienced DBAs. Automated collection of heavy tracing on a production system can have a significant performance impact on that system. The user needs to be aware of the consequences of generating this level of tracing and should proceed with caution.

Automatic session tracing should be considered for situations where specific sessions experience performance problems. Data collection can be tied to specific wait events or CPU. An example may be latch contention that happens occasionally but is not detected by statspack snapshots.

Benefits?

  • Collect 10046 trace only when a performance problem occurs

  • Collect SQL associated with a session's performance problem

  • Tie data collection to a specific Oracle wait event or CPU utilization

  • Easily identify the users and SQL associated with a particular performance problem

  • Session tracing for only problematic sessions

  • Trace sessions owned by a specific user

How to use?

  • Install LTOM

  • Edit the $TOM_HOME/init/sessionRecorder.properties file

  • Define a rule based on a database wait event (see the README for full details)

  • cd $TOM_HOME

  • ./startltom.sh

  • Select option 7, then follow the prompts

Where is the output?

Automatic Session Tracing produces Oracle trace files and a log file. These files are located in the following directories:

  • $TOM_HOME/recordings/session directory contains a file logging any significant performance events that occur during the recording.

  • Oracle session trace files are located in the bdump and udump directories

Gotchas?

It can take up to three times the polling frequency for session values to be displayed back to the user properly. This is because the polling frequency causes the program to sleep between samples, and because multiple samples are collected and computations performed on them.

The user must exit automatic session tracing to turn off tracing once it has been started through LTOM. The program then turns off tracing for all sessions for which it enabled tracing. Failure to exit automatic session tracing will result in these sessions continuing to be traced.

Back to Contents

Appendix C: Automatic Session Tracing Example

The problem:

A business has an SLA (Service Level Agreement) with its customers that requires all customer transactions to complete in under one second. Occasionally, some transactions exceed this requirement, forcing the business to incur a significant financial penalty. By deploying the system profiler and taking snapshots of the system every few seconds, it was discovered that a particular wait event was responsible for the excess time that caused transactions to exceed the one-second limit. Knowing the wait event alone did not provide enough information to determine why this was happening; what was needed was a 10046 trace of the session(s) involved so the underlying SQL could be examined. The business did not know which of the 1000 concurrent sessions to trace, nor could they afford to trace all 1000 sessions, as the significant overhead of 10046 tracing on every session would push most of the other transactions beyond the one-second SLA.

The solution:

By deploying the automatic session recorder it is possible to trace only those sessions that are affected by the particular wait event, with negligible performance impact on the database. The impact of taking these diagnostic traces can be reduced even further by tracing the sessions in memory rather than writing directly to a trace file. Sessions can be monitored in memory through LTOM and written out to a file only when a session's wait exceeds a threshold value.

Step 1. Configure the recorder

The session recorder can be configured to trace either to a file or to a memory buffer. To configure the recorder, edit the $TOM_HOME/init/sessionRecorder.properties file. A new rule needs to be added that will trace any session waiting on the particular wait event. A rule can be defined that will either trace directly to a file, or trace to a circular memory buffer once a minimum threshold value has been exceeded and dump the memory contents to a file once a second threshold value has been exceeded. In this example, the following line is added to the properties file to define a rule that traces sessions waiting on "global cache cr request" to memory...

EVENT=global cache cr request, VALUE=5, 100

What is important to note is that the event name defined in the rule must exactly match the NAME column of v$event_name. Also note that two values have been specified. The first value (5) specifies a minimum threshold, in centiseconds, for turning on the 10046 trace: any session that waits on "global cache cr request" for 0.05 seconds during a sampling interval will have its session traced in memory. The second value (100) specifies a maximum threshold, in centiseconds, for dumping the contents of the memory buffer to a file. This means that tracing is started in memory once a session waits on "global cache cr request" for 0.05 seconds, and the session continues to be traced indefinitely until tracing is turned off manually or the session terminates for whatever reason. Once the maximum threshold, in this case 1.0 seconds, is exceeded, the entire contents of the trace buffer are dumped from memory to that session's respective trace file in the udump/bdump.
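
For comparison, a similar hypothetical rule tied to single-block I/O waits, with a lower arm threshold and a higher dump threshold, might read:

EVENT=db file sequential read, VALUE=10, 200

Following the same convention, a session waiting 0.10 seconds on "db file sequential read" within a sampling interval would begin tracing to memory, and a wait of 2.0 seconds would cause its buffer to be dumped to disk.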

Step 2. Turn on session tracing through LTOM

Start LTOM and login.

Select option 7 to begin session recording. You will then be asked to respond to the following prompts...

Enter a polling frequency in seconds. This is the sampling interval LTOM uses to check whether the threshold values specified in the rules are violated. A recommended sampling frequency would be 5 seconds.

Trace sessions to memory or file. Although a rule has already been defined that will trace to memory, you can always override it here. Specify M to trace to memory.

Enter the amount of memory for each trace buffer in bytes. Whatever value you enter here will be multiplied by the number of sessions that are actually traced, so consider how much free memory you have on your system. A recommended amount would be 50000 bytes.

Enter max processes to trace. This serves as a safety valve. In this example there are 1000 concurrent sessions; unless the number of sessions being traced is limited, in theory all 1000 sessions could end up being traced, each consuming, in this example, 50,000 bytes. It is recommended that the number of sessions traced be limited to a reasonable value to prevent something unexpected from happening. A recommended amount would be 5-10 sessions.
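
To make the sizing concrete: with the recommended 50,000-byte buffer, capping tracing at 10 sessions bounds the memory overhead at 10 x 50,000 = 500,000 bytes (roughly 0.5 MB), whereas leaving it uncapped could in the worst case consume 1000 x 50,000 = 50,000,000 bytes (roughly 50 MB) on this system.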

Step 3. Monitor/Control session tracing through LTOM

Please Note: It can take up to three times the polling frequency for the following values to be displayed back to the user properly. This is because the polling frequency causes the program to sleep between samples, and because multiple samples are collected and computations performed on them.

Select option 71 to display current sessions being traced.

Select option 72 to manually force all sessions traced in memory to their respective trace files in the udump/bdump.

Select option 73 to manually force a particular session's trace in memory to its respective trace file in the udump/bdump.

Select option 74 to stop a specific session from being traced. It is important to note that if you disable tracing for a session, that session can no longer have tracing enabled; to re-enable tracing for it you would need to stop all session tracing with option 8 and then restart the session recorder with option 7.

Step 4. Stop the Session Recorder

Select option 8 to stop the session recorder. This option disables any tracing turned on by LTOM.

Step 5. Review the 10046 trace

Each session that was traced through LTOM has produced a trace file in the udump/bdump.

Step 6. Send 10046 trace files to support

Back to Contents

Appendix D: Sample System Profiler File

LTOM Version=4.1.2
HOSTNAME=coehq2
HOSTOS=SunOS
DB_VERSION=9.2.0.1.0
CPU_COUNT=2
PHYSICAL_MEMORY=1024000000

######################################################################
# Copyright (c) 2008 by Oracle Corporation
# LTOM REPORT V4.1.1

# This report is generated by running the System Profiler option of 
# LTOM. As this report is configurable at runtime some sections of this
# report may be missing if the option was not selected by the user.
# This report looks best if viewed in 132 column mode.
# The following sections repeat for each snapshot interval N...
#
######################################################################

---------------SNAPSHOT# N
system timestamp

---------------VMSTAT:---

current vmstat snapshot from unix vmstat utility

---------------OS TOP CPU PROCESSES:---

current top os processes from unix top utility

---------------ORACLE SESSIONS:---

current oracle session and process information

SID           V$session.sid 
PID           v$process.pid 
SPID          v$process.spid 
%CPU          %cpu from os 
TCPU          total cpu in seconds from os 
MCPU          v$sesstat.CPU used by this session 
             (in 10s of milliseconds. value is the delta value between snapshot) 
PROGRAM       v$process.program 
USERNAME      v$session.username 
EVENT         v$session_wait.event 
SEQ           v$session_wait.seq# 
SECS          v$session_wait.seconds_in_wait 
WAIT_TIME     v$session_wait.wait_time 
P1            v$session_wait.p1 
P2            v$session_wait.p2 
P3            v$session_wait.p3 
P1RAW         v$session_wait.p1raw 
P2RAW         v$session_wait.p2raw 
P3RAW         v$session_wait.p3raw 
HASH_VALUE    v$session.sql_hash_value 
SQL_ADDRESS   v$session.sql_address    
ET            v$session.last_call_et 
LOGICAL_READS v$sesstat.session logical reads 
USER_COMMITS  v$sesstat.user commits 
PGA           v$sesstat.session pga memory 
CALLS         v$sesstat.session user calls 
RSIZE         memory resident size from os 
VSIZE         memory virtual size from os 
PGA_ALLOC_MEM v$process.pga_alloc_mem 
DB_TIME_TOTAL v$sesstat.db_time (V10 only) 
CPU_TOTAL     v$sesstat.CPU used by this session 
DB_TIME       v$sesstat.db_time (V10 only) 
             (in 10s of milliseconds. value is the delta value between snapshot) 
MODULE        v$session.module

---------------CURRENT SQL EXECUTING:---

SID v$session.sid
HASH VALUE v$session.sql_hash_value
SQL_ADDRESS v$session.sql_address
LAST_CALL v$session.last_call_et
SQL_TEXT v$sqltext.sql_text

---------------SYSTEM STATISTICS:---

Values are delta values calculated between snapshots from
v$sysstat. Only non zero values are reported.

---------------AVERAGE SYSTEM WAITS IN HUNDREDTHS OF SECONDS:---

Values are delta values calculated between snapshots from
v$system_event. Only non zero values are reported.

---------------SYSTEM WAITS:---

Values are delta values calculated between snapshots from
v$system_event. Only non zero values are reported.

---------------SQL EXECUTED DURING THIS REPORT DURING SNAPSHOT:---

HASH VALUE v$session.sql_hash_value
SQL_ADDRESS v$session.sql_address
SQL_TEXT v$sqltext.sql_text

######################################################################
# REPORT BEGINS BELOW THIS LINE
######################################################################

---------------SNAPSHOT# 1
Tue Sept 25 16:00:00 EDT 2007

---------------VMSTAT:---

r b w   swap   free  re   mf pi po fr de sr dd dd f0 s0  in   sy   cs us sy id wa zy
1 0 0 665112 219424 201 1684  0  0  0  0  0  0  0  0  3 324 6229 1614 12 26 61 50 zz

---------------OS TOP CPU PROCESSES:---

load averages: 1.72, 2.04, 2.07 13:05:02
184 processes: 183 sleeping, 1 on cpu

Memory: 2048M real, 211M free, 1477M swap in use, 645M swap free

  
  PID USERNAME THR PRI NICE  SIZE   RES STATE  TIME   CPU COMMAND
26184 cedavis   16  18   10   51M   21M sleep  0:06 7.73% java
22050 cedavis   22  49    0  336M  264M sleep 92:11 3.62% java
26815 oracle     1  59    0    0K    0K sleep  0:00 1.73% oracle
  524 root       1  59    0   45M   75M sleep 19:56 0.94% Xsun
25549 oracle     1  48    0    0K    0K sleep  0:00 0.67% oracle
26816 cedavis    1   0   10 1632K 1072K   cpu  0:00 0.44% top
13212 cedavis   17  18   10   95M   50M sleep 17:56 0.28% java
  409 root       1  12    0 1128K  824K sleep 43:38 0.25% init.cssd

---------------ORACLE SESSIONS:---
SID PID SPID %CPU TCPU MCPU PROGRAM USERNAME EVENT SEQ SECS WAIT_TIME P1 P2 P3 HASH VALUE SQL_ADDRESS ET
124 37 13517 0.0 0:00 0 O000 null class slave wait 1 21846 0 0 00 0 00 0 00 0 00 21846
126 35 26815 1.8 0:00 0 UNKNOWN TOM SQL*Net message from client 32 0 2 1952673792 0000000074637000 1 0000000000000001 0 00 3796581998 000000039ACC0F20 0
129 29 11308 0.0 0:07 0 TNS SYS SQL*Net message from client 23 609345 0 1650815232 0000000062657100 1 0000000000000001 0 00 0 00 609345
130 33 18609 0.0 0:05 0 TNS SYS SQL*Net message from client 11837 3 0 1650815232 0000000062657100 1 0000000000000001 0 00 3364942409 00000003975D6AB8 3
131 32 18607 0.0 3:54 0 TNS SYS Streams AQ: waiting for messages in the queue 26216 7 0 9732 0000000000002604 15643062048 00000003A4662F20 10 000000000000000A 2346103937 000000039775BEB8 7
133 31 18652 0.0 0:00 0 TNS SYS SQL*Net message from client 16 851210 0 1650815232 0000000062657100 1 0000000000000001 0 00 0 00 851210
134 30 18567 0.0 1:09 0 TNS SYS SQL*Net message from client 32547 340 0 1650815232 0000000062657100 1 0000000000000001 0 00 0 00 340
135 28 22029 0.0 0:01 0 TNS CARL SQL*Net message from client 94 772200 0 1650815232 0000000062657100 1 0000000000000001 0 00 0 00 772200
137 26 21352 0.0 0:07 0 TNS SYS SQL*Net message from client 20 607909 0 1650815232 0000000062657100 1 0000000000000001 0 00 0 00 607909
138 34 11767 0.0 0:02 0 TNS SYS SQL*Net message from client 2033 1223 0 1650815232 0000000062657100 1 0000000000000001 0 00 0 00 1223
140 27 18519 0.0 0:00 0 q001 null Streams AQ: waiting for time management or cleanup tasks 1 851228 0 0 00 0 00 0 00 3393152264 00000003A4122A90 851228
144 25 18388 0.0 0:03 0 QMNC null Streams AQ: qmn coordinator idle wait 54536 16 0 0 00 0 00 0 00 0 00 851240
147 24 24006 0.0 0:12 0 TNS SYS enq: TM - contention 53 609402 0 1414332422 00000000544D0006 51578 000000000000C97A 0 00 3630001660 00000003975CD560 609402
148 23 25549 0.6 0:01 0 UNKNOWN TOM SQL*Net message from client 180 0 0 1952673792 0000000074637000 1 0000000000000001 0 00 0 00 0
149 22 18066 0.0 0:03 0 RBAL null rdbms ipc message 11 502140 0 300 000000000000012C 0 00 0 00 0 00 851264
150 21 18056 0.0 0:45 0 ASMB null ASM background timer 3 851261 0 0 00 0 00 0 00 0 00 851264
151 20 18513 0.0 0:01 0 q000 null Streams AQ: qmn slave idle wait 1 851228 0 0 00 0 00 0 00 0 00 851228
154 19 18043 0.0 2:20 0 LCK0 null rdbms ipc message 16517 3 0 300 000000000000012C 0 00 0 00 0 00 851267
155 16 18008 0.1 10:06 0 MMNL null rdbms ipc message 21 332994 0 100 0000000000000064 0 00 0 00 0 00 851275
156 15 18006 0.0 3:38 0 MMON null rdbms ipc message 46461 19 0 300 000000000000012C 0 00 0 00 3393152264 00000003A4122A90 851275
157 14 17998 0.0 5:19 0 CJQ0 null rdbms ipc message 25290 0 0 175 00000000000000AF 0 00 0 00 0 00 851275
158 13 17996 0.0 0:00 0 RECO null rdbms ipc message 15 66401 0 180000 000000000002BF20 0 00 0 00 0 00 851275
159 12 17993 0.0 5:00 0 SMON null smon timer 4934 27848 0 300 000000000000012C 0 00 0 00 0 00 851275
161 11 17987 0.1 7:24 0 CKPT null rdbms ipc message 43428 3 0 300 000000000000012C 0 00 0 00 0 00 851275
162 10 17985 0.0 0:33 0 LGWR null rdbms ipc message 62412 34 0 300 000000000000012C 0 00 0 00 0 00 851275
163 9 17980 0.0 2:01 0 DBW0 null rdbms ipc message 17491 34 0 300 000000000000012C 0 00 0 00 0 00 851275
164 8 17977 0.0 0:06 0 MMAN null rdbms ipc message 16 831704 0 300 000000000000012C 0 00 0 00 0 00 851275
165 7 17969 0.0 0:45 0 LMS0 null gcs remote message 6 851267 0 24 0000000000000018 0 00 0 00 0 00 851275
166 6 17964 0.0 0:43 0 LMD0 null ges remote message 4 851267 0 64 0000000000000040 0 00 0 00 0 00 851275
167 5 17961 0.0 1:49 0 LMON null rdbms ipc message 2771 0 0 10 000000000000000A 0 00 0 00 0 00 851275
168 4 17953 0.0 0:08 0 PSP0 null rdbms ipc message 1992 145 0 300 000000000000012C 0 00 0 00 0 00 851275
169 3 17951 0.0 0:23 0 DIAG null DIAG idle wait 1 851275 0 1 0000000000000001 1 0000000000000001 200 00000000000000C8 0 00 851275
170 2 17948 0.0 2:06 0 PMON null pmon timer 7 851246 0 300 000000000000012C 0 00 0 00 0 00 851275

Session Wait query's elapsed time was= 226 msec

---------------SYSTEM STATISTICS:---
CPU used by this session= + 19
CPU used when call started= + 18
DB time= + 1019
DBWR checkpoint buffers written= + 242
DBWR transaction table writes= + 8
DBWR undo block writes= + 50
SQL*Net roundtrips to/from client= + 51
application wait time= + 1364
background timeouts= + 64
buffer is not pinned count= + 28
buffer is pinned count= + 1
bytes received via SQL*Net from client= + 11045
bytes sent via SQL*Net to client= + 17881
calls to get snapshot scn: kcmgss= + 38
cluster key scan block gets= + 8
cluster key scans= + 8
consistent gets= + 71
consistent gets - examination= + 33
consistent gets from cache= + 71
enqueue conversions= + 11
enqueue releases= + 437
enqueue requests= + 437
execute count= + 29
gc CPU used by this session= + 1
global enqueue CPU used by this session= + 1
global enqueue gets sync= + 31
global enqueue releases= + 31
index fetch by key= + 12
index scans kdiixs1= + 13
messages received= + 177
messages sent= + 177
no work - consistent read gets= + 19
opened cursors cumulative= + 18
opened cursors current= + 1
parse count (hard)= + 1
parse count (total)= + 20
parse time cpu= + 5
parse time elapsed= + 5
physical read total IO requests= + 13
physical read total bytes= + 212992
physical write IO requests= + 176
physical write bytes= + 1982464
physical write total IO requests= + 185
physical write total bytes= + 2079232
physical write total multi block requests= + 28
physical writes= + 242
physical writes from cache= + 242
physical writes non checkpoint= + 33
recursive calls= + 272
recursive cpu usage= + 5
redo blocks written= + 29
redo entries= + 176
redo size= + 13200
redo synch writes= + 14
redo wastage= + 1104
redo write time= + 1
redo writes= + 4
rows fetched via callback= + 3
session cursor cache count= + 3
session cursor cache hits= + 8
session logical reads= + 71
session pga memory max= + 131072
session uga memory= + 65408
session uga memory max= + 65408
shared hash latch upgrades - no wait= + 13
sorts (memory)= + 22
sorts (rows)= + 376
table fetch by rowid= + 10
table scan blocks gotten= + 3
table scan rows gotten= + 3
table scans (short tables)= + 3
user calls= + 51
user rollbacks= + 2
workarea executions - optimal= + 14

System Statistics query's elapsed time was= 16 msec

---------------AVERAGE SYSTEM WAITS IN HUNDREDTHS OF SECONDS:---
ASM background timer Average Wait = 489.0
DIAG idle wait Average Wait = 20.0
KJC: Wait for msg sends to complete Average Wait = 20.0
PX Deq: Execute Reply Average Wait = 8.0
PX Deq: Execution Msg Average Wait = 170.0
PX Idle Wait Average Wait = 244.0
SQL*Net message from client Average Wait = 69.0
Streams AQ: qmn coordinator idle wait Average Wait = 1392.0
Streams AQ: qmn slave idle wait Average Wait = 2786.0
Streams AQ: waiting for messages in the queue Average Wait = 977.0
class slave wait Average Wait = 11718.0
control file parallel write Average Wait = 2.0
control file sequential read Average Wait = 1.0
direct path read temp Average Wait = 5.0
direct path write temp Average Wait = 5.0
dispatcher timer Average Wait = 5860.0
gcs remote message Average Wait = 3.0
ges remote message Average Wait = 7.0
lms flush message acks Average Wait = 9.0
pmon timer Average Wait = 259.0
rdbms ipc message Average Wait = 73.0
reliable message Average Wait = 5.0
smon timer Average Wait = 4379.0
virtual circuit status Average Wait = 2931.0

Average System Waits query's elapsed time was= 58 msec

---------------SYSTEM WAITS:---
ASM background timer= + 2
CGS wait for IPC msg= + 125
DIAG idle wait= + 64
SQL*Net break/reset to client= + 2
SQL*Net message from client= + 48
SQL*Net message to client= + 48
SQL*Net more data from client= + 3
Streams AQ: RAC qmn coordinator idle wait= + 2
Streams AQ: qmn coordinator idle wait= + 2
Streams AQ: qmn slave idle wait= + 1
Streams AQ: waiting for messages in the queue= + 1
control file parallel write= + 5
control file sequential read= + 13
db file parallel write= + 176
enq: TM - contention= + 27
gcs remote message= + 360
ges remote message= + 151
ksxr poll remote instances= + 12
log file parallel write= + 4
pmon timer= + 5
rdbms ipc message= + 191
rdbms ipc reply= + 1
reliable message= + 2
virtual circuit status= + 1

System Waits query's elapsed time was= 23 msec

---------------SQL EXECUTED DURING THIS REPORT DETECTED DURING SNAPSHOTS:---

HASH VALUE  SQL_ADDRESS              SQL_TEXT
3630001660  00000003975CD560  lock table carl.junk in exclusive mode
3364942409  00000003975D6AB8      DECLARE      reason_id    dbms_server_alert.REASON_ID_T := N
3364942409  00000003975D6AB8  ULL;      resource_id  NUMBER;      db_name      recent_resource
3364942409  00000003975D6AB8  _incarnations$.db_unique_name%TYPE :=                   :db_uniq
3364942409  00000003975D6AB8  ue_name;      inst_name    recent_resource_incarnations$.instanc
3364942409  00000003975D6AB8  e_name%TYPE :=                   :instance_name;      event_id  
3364942409  00000003975D6AB8     NUMBER := :event_id;      event_time   TIMESTAMP WITH TIME ZO
3364942409  00000003975D6AB8  NE      :=                   TO_TIMESTAMP_TZ(:event_time,       
3364942409  00000003975D6AB8                              'YYYY-MM-DD HH24:MI:SS.FF TZH:TZM', 
3364942409  00000003975D6AB8                                    'NLS_CALENDAR=''Gregorian''');
3364942409  00000003975D6AB8      BEGIN      CASE :reason_name        WHEN 'DATABASE_UP' THEN 
3364942409  00000003975D6AB8           reason_id := dbms_server_alert.RSN_FAN_DATABASE_UP;    
3364942409  00000003975D6AB8      WHEN 'DATABASE_DOWN' THEN          reason_id := dbms_server_
3364942409  00000003975D6AB8  alert.RSN_FAN_DATABASE_DOWN;        WHEN 'INSTANCE_UP'  THEN    
3364942409  00000003975D6AB8        reason_id := dbms_server_alert.RSN_FAN_INSTANCE_UP;       
3364942409  00000003975D6AB8    WHEN 'INSTANCE_DOWN' THEN          reason_id := dbms_server_al
3364942409  00000003975D6AB8  ert.RSN_FAN_INSTANCE_DOWN;        WHEN 'SERVICE_UP' THEN        
3364942409  00000003975D6AB8    reason_id := dbms_server_alert.RSN_FAN_SERVICE_UP;        WHEN
3364942409  00000003975D6AB8   'SERVICE_DOWN' THEN          reason_id := dbms_server_alert.RSN
3364942409  00000003975D6AB8  _FAN_SERVICE_DOWN;        WHEN 'SERVICE_MEMBER_UP' THEN         
3364942409  00000003975D6AB8   reason_id := dbms_server_alert.RSN_FAN_SERVICE_MEMBER_UP;      
3364942409  00000003975D6AB8    WHEN 'SERVICE_MEMBER_DOWN' THEN          reason_id := dbms_ser
3364942409  00000003975D6AB8  ver_alert.RSN_FAN_SERVICE_MEMBER_DOWN;        WHEN 'SVC_PRECONNE
3364942409  00000003975D6AB8  CT_UP' THEN          reason_id := dbms_server_alert.RSN_FAN_SVC_
3364942409  00000003975D6AB8  PRECONNECT_UP;        WHEN 'SVC_PRECONNECT_DOWN' THEN          r
3364942409  00000003975D6AB8  eason_id := dbms_server_alert.RSN_FAN_SVC_PRECONNECT_DOWN;      
3364942409  00000003975D6AB8    WHEN 'NODE_DOWN' THEN          reason_id := dbms_server_alert.
3364942409  00000003975D6AB8  RSN_FAN_NODE_DOWN;        WHEN 'ASM_INSTANCE_UP'  THEN          
3364942409  00000003975D6AB8  reason_id := dbms_server_alert.RSN_FAN_ASM_INSTANCE_UP;         
3364942409  00000003975D6AB8  WHEN 'ASM_INSTANCE_DOWN' THEN          reason_id := dbms_server_
3364942409  00000003975D6AB8  alert.RSN_FAN_ASM_INSTANCE_DOWN;      END CASE;      IF :use_res
3364942409  00000003975D6AB8  ource_id = 'Y' THEN        BEGIN          SELECT resource_id    
3364942409  00000003975D6AB8          INTO resource_id            FROM recent_resource_incarna
3364942409  00000003975D6AB8  tions$           WHERE resource_type = 'INSTANCE'             AN
3364942409  00000003975D6AB8  D db_unique_name = db_name             AND db_domain=NVL(SYS_CON
3364942409  00000003975D6AB8  TEXT('USERENV','DB_DOMAIN'),'==N/A==')             AND instance_
3364942409  00000003975D6AB8  name = inst_name             AND startup_time = (SELECT MAX(star
3364942409  00000003975D6AB8  tup_time)                                   FROM recent_resource
3364942409  00000003975D6AB8  _incarnations$                                  WHERE resource_t
3364942409  00000003975D6AB8  ype = 'INSTANCE'                                    AND db_uniqu
3364942409  00000003975D6AB8  e_name = db_name                                    AND db_domai
3364942409  00000003975D6AB8  n =                                        NVL(SYS_CONTEXT('USER
3364942409  00000003975D6AB8  ENV',                                                         'D
3364942409  00000003975D6AB8  B_DOMAIN'),                                            '==N/A=='
3364942409  00000003975D6AB8  )                                    AND instance_name = inst_na
3364942409  00000003975D6AB8  me                                    AND from_tz(startup_time, 
3364942409  00000003975D6AB8  '+00:00') <                                        event_time); 
3364942409  00000003975D6AB8         EXCEPTION          WHEN NO_DATA_FOUND THEN RETURN;       
3364942409  00000003975D6AB8     WHEN OTHERS THEN RAISE;        END;        event_id := 214748
3364942409  00000003975D6AB8  3648 + BITAND(event_id * 128, 2147483648-1)                     
3364942409  00000003975D6AB8            + resource_id;      END IF;      dbms_ha_alerts_prvt.p
3364942409  00000003975D6AB8  ost_ha_alert(        reason_id            => reason_id,        s
3364942409  00000003975D6AB8  ame_transaction     => FALSE,        clear_old_alert      => FAL
3364942409  00000003975D6AB8  SE,        database_unique_name => db_name,        instance_name
3364942409  00000003975D6AB8          => inst_name,        service_name         => :service_na
3364942409  00000003975D6AB8  me,        host_name            => :host_name,        incarnatio
3364942409  00000003975D6AB8  n          => :incarnation,        event_reason         => :even
3364942409  00000003975D6AB8  t_reason,        event_time           => event_time,        card
3364942409  00000003975D6AB8  inality          => :cardinality,        event_id             =>
3364942409  00000003975D6AB8   event_id,        timeout_seconds      => :alert_timeout_seconds
3364942409  00000003975D6AB8  ,        immediate_timeout    => :immed_timeout = 'Y',        du
3364942409  00000003975D6AB8  plicates_ok        => TRUE);    END;
3796581998  000000039ACC0F20  select s.sid, s.type, pid, spid, p.program, s.username, sw.event
3796581998  000000039ACC0F20  , sw.seq# seq,             sw.seconds_in_wait, sw.wait_time, sw.
3796581998  000000039ACC0F20  p1, sw.p2, sw.p3, sw.p1raw, sw.p2raw, sw.p3raw, ss.value,       
3796581998  000000039ACC0F20       s.sql_hash_value, s.sql_address, s.last_call_et from       
3796581998  000000039ACC0F20       v$session s, v$session_wait sw, v$process p, v$sesstat ss w
3796581998  000000039ACC0F20  here            (p.addr = s.paddr) and (s.sid = ss.sid)         
3796581998  000000039ACC0F20     and (s.sid = sw.sid) and (ss.statistic# = 12)             ord
3796581998  000000039ACC0F20  er by s.sid

Back to Contents

Legal Notices and Terms of Use
