Purpose
Procwatcher (PRW) is a tool that examines and monitors Oracle database and/or clusterware processes at a regular interval. It collects stack traces of these processes using Oracle tools such as oradebug short_stack and/or OS debuggers such as pstack, gdb, dbx, or ladebug, and collects SQL data if specified.
Requirements
- Must have /bin and /usr/bin in your $PATH
- Have your instance_name or db_name set in the oratab and/or set the $ORACLE_HOME environment variable. (PRW searches the oratab for the SID it finds; if it can't find the SID in the oratab, it defaults to $ORACLE_HOME.) Procwatcher cannot function properly if it cannot find an $ORACLE_HOME to use.
- Run Procwatcher as the oracle software owner if you are only troubleshooting homes/instances for that user. If you are troubleshooting clusterware processes (EXAMINE_CLUSTER=true) or troubleshooting for multiple oracle users, run as root.
- If you are monitoring the clusterware you must have the relevant OS debugger installed on your platform; PRW looks for:
Linux - /usr/bin/gdb
HP-UX and HP Itanium - /opt/langtools/bin/gdb64 or /usr/ccs/bin/gdb64
Sun - /usr/bin/pstack
IBM AIX - /bin/procstack or /bin/dbx
HP Tru64 - /bin/ladebug
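The requirements above can be checked before starting Procwatcher. The following is a minimal sketch, not part of Procwatcher itself: the function names (path_has_dir, oratab_home, debugger_candidates, preflight) are hypothetical, the uname -s strings used to pick a platform are assumptions, and the oratab lookup assumes the usual SID:ORACLE_HOME:Y|N line layout.

```shell
#!/bin/sh
# Preflight sketch for the requirements listed above (not part of Procwatcher).

# Return 0 if directory $1 is an entry in the colon-separated list $2.
path_has_dir() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the ORACLE_HOME recorded for SID $1 in oratab file $2.
# Assumes the usual oratab layout: SID:ORACLE_HOME:Y|N, with # comment lines.
oratab_home() {
    awk -F: -v sid="$1" '
        /^#/      { next }            # skip comment lines
        $1 == sid { print $2; exit }  # first matching entry wins
    ' "$2"
}

# Print the debugger paths from the list above for the OS name $1.
# The uname -s strings used as keys are assumptions of this sketch.
debugger_candidates() {
    case "$1" in
        Linux) echo "/usr/bin/gdb" ;;
        HP-UX) echo "/opt/langtools/bin/gdb64 /usr/ccs/bin/gdb64" ;;
        SunOS) echo "/usr/bin/pstack" ;;
        AIX)   echo "/bin/procstack /bin/dbx" ;;
        OSF1)  echo "/bin/ladebug" ;;
        *)     return 1 ;;
    esac
}

# Report which requirements are met on this host; prints only, changes nothing.
preflight() {
    for dir in /bin /usr/bin; do
        path_has_dir "$dir" "$PATH" || echo "missing from PATH: $dir"
    done
    [ -n "$ORACLE_HOME" ] || echo "ORACLE_HOME is not set (acceptable only if the oratab lists the SID)"
    for dbg in $(debugger_candidates "$(uname -s)"); do
        if [ -x "$dbg" ]; then
            echo "debugger found: $dbg"
            return 0
        fi
    done
    echo "no debugger from the list above was found"
}

preflight
```

To look up a home for a given SID, call the helper with the oratab location for your platform, for example oratab_home ORCL /etc/oratab (both the SID and the path here are placeholders).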
Procwatcher is Ideal for:
- Session level hangs or severe contention in the database/instance. See Note: 1352623.1
- Severe performance issues. See Note: 1352623.1
- Instance evictions and/or DRM timeouts.
- Clusterware or DB processes stuck or consuming high CPU (must set EXAMINE_CLUSTER=true and run as root for clusterware processes)
- ORA-4031 and SGA memory management issues. (Set sgamemwatch=diag or sgamemwatch=avoid4031 (not the default).) See Note: 1355030.1
- ORA-4030 and DB process memory issues. (Set USE_SQL=true and process_memory=y).
- RMAN slowness/contention during a backup. (Set USE_SQL=true and rmanclient=y).
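For quick reference, the settings named in the list above are gathered below, grouped by scenario. This is only an illustration in the NAME=value notation the list already uses; where the settings are actually made (a parameter file or the script itself) is not covered in this section and is left as an assumption.

```shell
# Illustrative settings only, in the NAME=value notation used above.
# Where they are set (parameter file or script) is an assumption of this sketch.

# Clusterware processes stuck or consuming high CPU (run as root)
EXAMINE_CLUSTER=true

# ORA-4031 and SGA memory management issues
sgamemwatch=diag          # alternative: sgamemwatch=avoid4031

# ORA-4030 and DB process memory issues
USE_SQL=true
process_memory=y

# RMAN slowness/contention during a backup (also needs USE_SQL=true)
rmanclient=y
```

In practice you would enable only the group that matches the problem being investigated rather than all of them at once.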
Procwatcher is Not Ideal for:
- Node evictions/reboots. To troubleshoot these you would have to enable Procwatcher for one or more processes that are capable of rebooting the machine. If the OS debugger suspends such a process for too long, *that* could cause a reboot of the machine. I would only use Procwatcher for a node eviction/reboot if the problem was reproducing on a test system and I didn't care if the node got rebooted. Even in that case the INTERVAL would need to be set low (30) and many options would have to be turned off to get the cycle time low enough (EXAMINE_BG=false, USE_SQL=false, and probably removing additional processes from the CLUSTERPROCS list).
- Non-severe database performance issues. AWR/ADDM/statspack are better options for this...
- Most installation or upgrade issues. Procwatcher collects no useful data for these unless the installation/upgrade has reached a stage where key processes are already started.
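If Procwatcher is used for a node eviction/reboot on a disposable test system anyway, the low-cycle-time settings described in the first item above would look roughly like this. The fragment is a sketch in the same NAME=value notation as the text; where the values are set is an assumption, and the contents of CLUSTERPROCS are not given in this section, so none are shown.

```shell
# Sketch of low-cycle-time settings for a test system that may be rebooted.
INTERVAL=30        # the low interval suggested in the text
EXAMINE_BG=false   # turned off to shorten each cycle
USE_SQL=false      # turned off to shorten each cycle
# CLUSTERPROCS: remove processes that are not under investigation
# (the list's contents are not shown here because the text does not give them)
```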
Procwatcher User Commands
To start Procwatcher:
Or if you want to start on all nodes in a clustered environment:
To stop Procwatcher:
Or if you want to stop on all nodes in a clustered environment:
To check the status of Procwatcher:
To package up Procwatcher files to upload to support:
All user syntax available: