Environment: Oracle 11g RAC with ASM on Linux AS 5.3, 64-bit.
The instance on node 2 of the Oracle 11g system shut down while the OS itself stayed healthy. The alert log shows the following errors:
Process J001 died, see its trace file
kkjcre1p: unable to spawn jobq slave process
Errors in file /u01/product/oracle/diag/rdbms/ccptdb/ccptdb2/trace/ccptdb2_cjq0_14800.trc:
Tue May 31 07:39:27 2011
Process m000 died, see its trace file
Tue May 31 07:41:25 2011
Process J001 died, see its trace file
kkjcre1p: unable to spawn jobq slave process
Errors in file /u01/product/oracle/diag/rdbms/ccptdb/ccptdb2/trace/ccptdb2_cjq0_14800.trc:
Tue May 31 07:56:28 2011
Process J000 died, see its trace file
kkjcre1p: unable to spawn jobq slave process
Errors in file /u01/product/oracle/diag/rdbms/ccptdb/ccptdb2/trace/ccptdb2_cjq0_14800.trc:
Process J000 died, see its trace file
kkjcre1p: unable to spawn jobq slave process
Errors in file /u01/product/oracle/diag/rdbms/ccptdb/ccptdb2/trace/ccptdb2_cjq0_14800.trc:
Tue May 31 07:58:15 2011
Process J000 died, see its trace file
kkjcre1p: unable to spawn jobq slave process
Errors in file /u01/product/oracle/diag/rdbms/ccptdb/ccptdb2/trace/ccptdb2_cjq0_14800.trc:
Process J000 died, see its trace file
kkjcre1p: unable to spawn jobq slave process
Errors in file /u01/product/oracle/diag/rdbms/ccptdb/ccptdb2/trace/ccptdb2_cjq0_14800.trc:
Tue May 31 07:59:29 2011
Process m000 died, see its trace file
Tue May 31 08:00:08 2011
Process m000 died, see its trace file
Tue May 31 08:00:31 2011
Process m000 died, see its trace file
Tue May 31 08:00:31 2011
Process m000 died, see its trace file
Tue May 31 08:00:32 2011
Process J000 died, see its trace file
kkjcre1p: unable to spawn jobq slave process
Errors in file /u01/product/oracle/diag/rdbms/ccptdb/ccptdb2/trace/ccptdb2_cjq0_14800.trc:
Process J000 died, see its trace file
kkjcre1p: unable to spawn jobq slave process
Errors in file /u01/product/oracle/diag/rdbms/ccptdb/ccptdb2/trace/ccptdb2_cjq0_14800.trc:
Tue May 31 08:01:49 2011
Process J000 died, see its trace file
kkjcre1p: unable to spawn jobq slave process
Errors in file /u01/product/oracle/diag/rdbms/ccptdb/ccptdb2/trace/ccptdb2_cjq0_14800.trc:
Process J000 died, see its trace file
kkjcre1p: unable to spawn jobq slave process
Errors in file /u01/product/oracle/diag/rdbms/ccptdb/ccptdb2/trace/ccptdb2_cjq0_14800.trc:
Tue May 31 08:02:49 2011
Process m000 died, see its trace file
Tue May 31 08:04:36 2011
Process PZ99 died, see its trace file
Tue May 31 08:05:01 2011
Process PZ99 died, see its trace file
Process PZ99 died, see its trace file
Process PZ99 died, see its trace file
Tue May 31 08:06:41 2011
Process PZ99 died, see its trace file
Process PZ99 died, see its trace file
Tue May 31 08:07:37 2011
Process J000 died, see its trace file
kkjcre1p: unable to spawn jobq slave process
Errors in file /u01/product/oracle/diag/rdbms/ccptdb/ccptdb2/trace/ccptdb2_cjq0_14800.trc:
Process J000 died, see its trace file
kkjcre1p: unable to spawn jobq slave process
Errors in file /u01/product/oracle/diag/rdbms/ccptdb/ccptdb2/trace/ccptdb2_cjq0_14800.trc:
Tue May 31 08:07:40 2011
Starting ORACLE instance (normal)
WARNING: You are trying to use the MEMORY_TARGET feature. This feature requires the /dev/shm file
system to be mounted for at least 26239565824 bytes. /dev/shm is either not mounted or is mounted
with available space less than this size. Please fix this so that MEMORY_TARGET can work as
expected. Current available is 23652892672 and used is 9901539328 bytes. Ensure that the mount point
is /dev/shm for this directory.
memory_target needs larger /dev/shm
Tue May 31 08:08:11 2011
Starting ORACLE instance (normal)
WARNING: You are trying to use the MEMORY_TARGET feature. This feature requires the /dev/shm file
system to be mounted for at least 26239565824 bytes. /dev/shm is either not mounted or is mounted
with available space less than this size. Please fix this so that MEMORY_TARGET can work as
expected. Current available is 23652892672 and used is 9901539328 bytes. Ensure that the mount point
is /dev/shm for this directory.
memory_target needs larger /dev/shm
Tue May 31 08:08:41 2011
Process PZ99 died, see its trace file
Tue May 31 08:09:03 2011
Process O000 died, see its trace file
Tue May 31 08:22:31 2011
Process W000 died, see its trace file
Process W000 died, see its trace file
Process W000 died, see its trace file
Tue May 31 08:22:43 2011
Process W000 died, see its trace file
Process W000 died, see its trace file
Process W000 died, see its trace file
Tue May 31 08:22:55 2011
Process W000 died, see its trace file
Process W000 died, see its trace file
Process W000 died, see its trace file
Tue May 31 08:23:07 2011
Process W000 died, see its trace file
Process W000 died, see its trace file
Process W000 died, see its trace file
Tue May 31 08:23:19 2011
Process W000 died, see its trace file
Process W000 died, see its trace file
Process W000 died, see its trace file
Tue May 31 08:23:31 2011
Process W000 died, see its trace file
Process W000 died, see its trace file
Process W000 died, see its trace file
Tue May 31 08:23:43 2011
Process W000 died, see its trace file
Process W000 died, see its trace file
Process W000 died, see its trace file
Tue May 31 08:23:55 2011
Process W000 died, see its trace file
Process W000 died, see its trace file
Process W000 died, see its trace file
Tue May 31 08:24:07 2011
Process W000 died, see its trace file
Process W000 died, see its trace file
Process W000 died, see its trace file
Tue May 31 08:24:36 2011
Error 29746: Cluster Synchronization Service is shutting down
Errors in file /u01/product/oracle/diag/rdbms/ccptdb/ccptdb2/trace/ccptdb2_lmon_14499.trc:
ORA-29746: Cluster Synchronization Service is being shut down.
LMON (ospid: 14499): terminating the instance due to error 29746
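The alert log above is repetitive, so it can help to tally which kinds of processes failed to start. The following is a minimal Python sketch, not part of the original post. The function name `tally_died_processes` is hypothetical, and the sample text is a short excerpt of the log lines shown above. It groups the "Process X died" messages by the process-name prefix (for example, J000 and J001 both count as J).

```python
import re
from collections import Counter

# Matches alert-log lines such as "Process J001 died, see its trace file".
DIED_RE = re.compile(r"^Process (\w+) died, see its trace file")


def tally_died_processes(alert_log_text):
    """Count 'Process X died' lines, grouped by process-name prefix.

    Trailing digits are stripped from the name, so J000/J001 are grouped
    as 'J', m000 as 'm', PZ99 as 'PZ', and W000 as 'W'.
    """
    counts = Counter()
    for line in alert_log_text.splitlines():
        match = DIED_RE.match(line.strip())
        if match:
            prefix = match.group(1).rstrip("0123456789")
            counts[prefix] += 1
    return counts


# Excerpt copied from the alert log above (hypothetical variable name).
sample = """\
Tue May 31 07:39:27 2011
Process m000 died, see its trace file
Tue May 31 07:41:25 2011
Process J001 died, see its trace file
kkjcre1p: unable to spawn jobq slave process
Tue May 31 08:04:36 2011
Process PZ99 died, see its trace file
Tue May 31 08:22:31 2011
Process W000 died, see its trace file
Process W000 died, see its trace file
"""

for prefix, count in sorted(tally_died_processes(sample).items()):
    print(prefix, count)
```

Run against the full log, a tally like this makes it easy to see that the failures were not limited to job queue slaves (Jnnn). Processes named m000, PZ99, O000 and W000 also died, which suggests a problem with spawning processes in general rather than with the job queue alone.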
Trace file:
Trace file /u01/product/oracle/diag/rdbms/ccptdb/ccptdb2/trace/ccptdb2_cjq0_14800.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, Oracle Label
Security,
OLAP, Data Mining, Oracle Database Vault and Real Application Testing option
ORACLE_HOME = /u01/product/oracle/11.2.0/db_1
System name: Linux
Node name: wmrac02
Release: 2.6.18-128.el5
Version: #1 SMP Wed Dec 17 11:41:38 EST 2008
Machine: x86_64
Instance name: ccptdb2
Redo thread mounted by this instance: 2
Oracle process number: 52
Unix process pid: 14800, image: oracle@wmrac02 (CJQ0)
*** 2011-05-27 22:00:00.212
*** SESSION ID:(1249.3) 2011-05-27 22:00:00.212
*** CLIENT ID:() 2011-05-27 22:00:00.212
*** SERVICE NAME:(SYS$BACKGROUND) 2011-05-27 22:00:00.212
*** MODULE NAME:() 2011-05-27 22:00:00.212
*** ACTION NAME:() 2011-05-27 22:00:00.212
*** TRACE FILE RECREATED AFTER BEING REMOVED ***
Setting Resource Manager plan SCHEDULER[0x3007]:DEFAULT_MAINTENANCE_PLAN via scheduler window
Setting Resource Manager plan DEFAULT_MAINTENANCE_PLAN via parameter
Setting Resource Manager plan SCHEDULER[0x3008]:DEFAULT_MAINTENANCE_PLAN via scheduler window
*** 2011-05-28 06:00:00.184
Setting Resource Manager plan DEFAULT_MAINTENANCE_PLAN via parameter
Setting Resource Manager plan SCHEDULER[0x3009]:DEFAULT_MAINTENANCE_PLAN via scheduler window
*** 2011-05-29 06:00:00.202
Setting Resource Manager plan DEFAULT_MAINTENANCE_PLAN via parameter
*** 2011-05-31 00:50:46.038
Process J001 is dead (pid=1752 req_ver=693 cur_ver=693 state=KSOSP_SPAWNED).
*** 2011-05-31 06:28:08.365
Process J002 is dead (pid=8632 req_ver=6101 cur_ver=6101 state=KSOSP_SPAWNED).
*** 2011-05-31 07:39:24.844
Process J001 is dead (pid=16773 req_ver=17167 cur_ver=17167 state=KSOSP_SPAWNED).
*** 2011-05-31 07:41:25.314
Process J001 is dead (pid=17000 req_ver=17168 cur_ver=17168 state=KSOSP_SPAWNED).
*** 2011-05-31 07:56:28.167
Process J000 is dead (pid=18573 req_ver=17132 cur_ver=17132 state=KSOSP_SPAWNED).
*** 2011-05-31 07:56:29.170
Process J000 is dead (pid=18575 req_ver=17170 cur_ver=17170 state=KSOSP_SPAWNED).
Contents of the other trace file:
Trace file /u01/product/oracle/diag/rdbms/ccptdb/ccptdb2/trace/ccptdb2_lmon_14499.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, Oracle Label
Security,
OLAP, Data Mining, Oracle Database Vault and Real Application Testing option
ORACLE_HOME = /u01/product/oracle/11.2.0/db_1
System name: Linux
Node name: wmrac02
Release: 2.6.18-128.el5
Version: #1 SMP Wed Dec 17 11:41:38 EST 2008
Machine: x86_64
Instance name: ccptdb2
Redo thread mounted by this instance: 2
Oracle process number: 11
Unix process pid: 14499, image: oracle@wmrac02 (LMON)
*** 2011-05-29 20:02:22.111
*** SESSION ID:(265.1) 2011-05-29 20:02:22.111
*** CLIENT ID:() 2011-05-29 20:02:22.111
*** SERVICE NAME:(SYS$BACKGROUND) 2011-05-29 20:02:22.111
*** MODULE NAME:() 2011-05-29 20:02:22.111
*** ACTION NAME:() 2011-05-29 20:02:22.111
*** TRACE FILE RECREATED AFTER BEING REMOVED ***
kjfc_TaskScheduler_Execute_wTime: timer wraps at 0xffffffe0 max 0xffffffdc
2011-05-31 08:24:36.817: [ CSSCLNT]clssgsGroupGetStatus: CSS shutting down.
*** 2011-05-31 08:24:36.817
2011-05-31 08:24:36.817: [ CSSCLNT]clssgsGroupGetStatus: returning 22
kgxgnpstat: error: CLSS service is shutting down
kjxgmcr: kgxgnpstat return 17
LMON caught an error 29746 in the main loop
error 29746 detected in background process
ORA-29746: Cluster Synchronization Service is being shut down.
*** 2011-05-31 08:24:36.818
LMON (ospid: 14499): terminating the instance due to error 29746
ksuitm: waiting up to [5] seconds before killing DIAG(14487)
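The message "kkjcre1p: unable to spawn jobq slave process" shows that the failure came from the instance being unable to spawn job-related processes. There are roughly four possible causes:
1. The job_queue_processes parameter is set too low.
2. The sessions and processes parameters are too low for the number of sessions and connections the workload needs.
3. The memory governed by pga_aggregate_target has been exhausted.
4. OS resources, such as virtual memory, have been exhausted.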
SQL> show parameter process
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
aq_tm_processes integer 0
cell_offload_processing boolean TRUE
db_writer_processes integer 8
gcs_server_processes integer 4
global_txn_processes integer 1
job_queue_processes integer 1000
log_archive_max_processes integer 4
processes integer 1000
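The values above are only the configured limits. Whether the instance actually came close to them is a separate question, and one place to look is the v$resource_limit view. The sketch below is an illustration under stated assumptions, not the author's method. The helper `near_limit` is hypothetical, and the example rows are made-up numbers, not values from this incident. It takes rows already fetched from the view and flags resources whose peak usage reached a chosen fraction of the limit.

```python
def near_limit(rows, threshold=0.9):
    """Return names of resources whose peak usage reached threshold * limit.

    rows: iterable of (resource_name, max_utilization, limit_value) tuples,
    for example as fetched with
        SELECT resource_name, max_utilization, limit_value
        FROM v$resource_limit;
    limit_value is a string in that view and may be 'UNLIMITED'.
    """
    flagged = []
    for name, max_used, limit in rows:
        limit_text = str(limit).strip()
        if not limit_text.isdigit():
            continue  # 'UNLIMITED' or otherwise non-numeric: nothing to compare
        limit_num = int(limit_text)
        if limit_num > 0 and max_used >= threshold * limit_num:
            flagged.append(name)
    return flagged


# Hypothetical rows for illustration only; NOT values from the incident.
example_rows = [
    ("processes", 950, "      1000"),
    ("sessions", 400, "      1536"),
    ("enqueue_resources", 20, " UNLIMITED"),
]
print(near_limit(example_rows))
```

With these illustrative rows only processes is flagged, because 950 is at least 90% of 1000. job_queue_processes does not appear in v$resource_limit, so cause 1 still has to be judged from the parameter value and the number of running ora_jNNN processes.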
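Note that Oracle 11g can manage the SGA and PGA as one shared pool of memory, sized by the single memory_target parameter. That is why the PGA-related value below is 0.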
SQL> show parameter pga
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
pga_aggregate_target big integer 0
SQL> show parameter memo
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
hi_shared_memory_address integer 0
memory_max_target big integer 25024M
memory_target big integer 25024M
shared_memory_address integer 0
SQL>
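The memory_target value ties back to the MEMORY_TARGET warning in the alert log. The following Python sketch works through the arithmetic using only the byte counts printed in that warning. The helper `shm_shortfall` is a hypothetical name, and treating "available + used" as the total size of /dev/shm is an assumption.

```python
MIB = 1024 * 1024


def shm_shortfall(required_bytes, available_bytes):
    """Bytes by which free /dev/shm space falls short (0 if there is enough)."""
    return max(0, required_bytes - available_bytes)


# Figures copied from the alert-log warning.
required = 26239565824   # "mounted for at least 26239565824 bytes"
available = 23652892672  # "Current available is 23652892672"
used = 9901539328        # "used is 9901539328 bytes"

# Required size in MiB; this equals memory_target (25024M).
print(required // MIB)

# Assumption: available + used is the total size of the /dev/shm mount.
print((available + used) // MIB)

# Space missing at the time of the warning, in MiB.
print(shm_shortfall(required, available) / MIB)
```

The three printed values are 25024, 32000 and 2466.84375. In other words, the startup attempt needed the full 25024 MiB of memory_target in /dev/shm, but roughly 2467 MiB of that was not free at the time, because part of the mount was already in use.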
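In addition, this is how the job processes appear in the process list (the example is from an instance named CRMDB1):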
$ ps -ef | grep ora_ | grep -v grep
... ...
oracle 712918 1 0 Dec 28 - 2:47 ora_cjq0_CRMDB1
oracle 13230162 1 0 16:29:18 - 0:04 ora_j000_CRMDB1
oracle 3182624 1 0 16:30:28 - 0:00 ora_j001_CRMDB1
... ...
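Some Oracle background processes are omitted from the listing above. Of those shown, ora_j000_xxx and ora_j001_xxx are slave processes spawned by the background process ora_cjq0. These ora_jNNN processes are the job processes, and the initialization parameter job_queue_processes caps how many of them can exist.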
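To compare the number of running job slaves with job_queue_processes, the ps listing can be counted per instance. This is a minimal sketch. The function `count_job_slaves` is a hypothetical helper, and the sample input is the three lines of ps output shown above.

```python
import re

# Matches the process name in `ps -ef` output, e.g. "ora_j000_CRMDB1".
# The coordinator "ora_cjq0_..." does not match, because "c" follows "ora_".
JOB_SLAVE_RE = re.compile(r"\bora_j(\d{3})_(\S+)")


def count_job_slaves(ps_output):
    """Return {instance_name: number of ora_jNNN processes} from `ps -ef` text."""
    counts = {}
    for line in ps_output.splitlines():
        match = JOB_SLAVE_RE.search(line)
        if match:
            instance = match.group(2)
            counts[instance] = counts.get(instance, 0) + 1
    return counts


# The ps lines quoted above (hypothetical variable name).
ps_sample = """\
oracle 712918 1 0 Dec 28 - 2:47 ora_cjq0_CRMDB1
oracle 13230162 1 0 16:29:18 - 0:04 ora_j000_CRMDB1
oracle 3182624 1 0 16:30:28 - 0:00 ora_j001_CRMDB1
"""
print(count_job_slaves(ps_sample))
```

For the listing above this reports two job slaves for CRMDB1. A count this far below a job_queue_processes setting of 1000 would make cause 1 unlikely.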
Source: ITPUB blog, link: http://blog.itpub.net/35489/viewspace-696765/. If you repost this article, please credit the source; otherwise legal action may be taken.