[Troubleshooting] ORA-3113 "end of file on communication channel"

Latest recommended article published 2023-12-20 11:07:47

六六子大顺1

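A friend's database reported this error:

ORA-3113 "end of file on communication channel"

Symptom: on a SUSE host, connecting locally with sqlplus sys/oracle@xxx:1521/orcl as sysdba works, but the same connection fails from a Windows client.

Resolution: after trying several approaches, creating a new listener on port 1522 allowed the connection. This suggests the problem lay with port 1521; the root cause was not analyzed.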

Master Note: Troubleshooting ORA-03113 (Doc ID 1506805.1)

In this Document

Purpose

Troubleshooting Steps

(A) ORA-3113 when attempting to STARTUP Oracle

A1) Errors connecting as SYSDBA or on startup nomount

A2) Errors Mounting the database

A3) Errors on RECOVER DATABASE

A4) Errors on ALTER DATABASE OPEN

(B) ORA-3113 attempting to connect to the database using Oracle Net

(C) Client sees ORA-3113 running SQL / PLSQL

(D) Server trace reports ORA-3113

(E) Additional Checks / Information

References

APPLIES TO:

Oracle Database - Enterprise Edition - Version 9.2.0.1 to 12.1.0.1 [Release 9.2 to 12.1]
Oracle Net Services - Version 12.1.0.2 to 12.1.0.2 [Release 12.1]
Information in this document applies to any platform.
***Checked for relevance on 11-Jan-2016***

PURPOSE

An ORA-3113 "end of file on communication channel" error is a general error usually reported by a client process connected to an Oracle database. The error basically means "I cannot communicate with the Oracle shadow process". Because it is such a general error, more information must be collected to determine what has happened; the error by itself does not indicate the cause of the problem. For example, ORA-3113 could be signalled in any of the following scenarios:

Server machine crashed
Your server process was killed at the O/S level
Network problems
Oracle internal errors (ORA-600 / ORA-7445) / aborts on the server
Client incorrectly handling multiple connections
and many other possible causes

This article describes what information to collect for an ORA-3113 error. It is common for this error to be accompanied by other errors such as:

ORA-1041 internal error. hostdef extension doesn't exist
ORA-3114 not connected to ORACLE (This error is reported before the ORA-3113 when PL/SQL is used)
ORA-1012 not logged on

These are all symptomatic of being disconnected from Oracle.
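
When reviewing a client log or alert log, it can help to pull out every line that mentions ORA-3113 or one of the companion errors listed above. The following is a minimal sketch, not part of the original note; the function name `find_disconnect_errors` and the log path argument are hypothetical.

```shell
# List lines in a log that mention ORA-3113 or one of the companion errors
# named above (ORA-1041, ORA-3114, ORA-1012), prefixed with line numbers.
# The pattern accepts both the short form (ORA-3113) and the zero-padded
# form (ORA-03113), and rejects longer codes such as ORA-31130.
find_disconnect_errors() {
    # $1: path to a client log or alert log (hypothetical argument)
    grep -n -E 'ORA-0*(3113|3114|1041|1012)([^0-9]|$)' "$1"
}
```

For example, `find_disconnect_errors alert_orcl.log` (file name assumed) prints the matching lines; the exit status is non-zero when nothing matches.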

NOTE:
This note is an updated version of archived
Note 17613.1 - ORA-03113 on Unix - What Information to Collect
For older information, please refer to Note 17613.1

TROUBLESHOOTING STEPS

Please collect as much information as possible from the items listed below and submit this information to Oracle Support. Where a step produces an output file / trace file, this may be needed by Oracle Support to help determine the cause of the error. The more information you can provide in one go the better your chance of a speedy solution. Note that some sections may not be applicable.

If viewing this article on a Web Browser you can follow the links, otherwise manually go to the relevant section.

In which scenario does the ORA-3113 occur?

  A. When attempting to startup Oracle ?                    -> Section A 
  B. When attempting to make a connection to the database ? -> Section B 
  C. Client gets the error running SQL / PLSQL ?            -> Section C 
  D. Server trace file reports ORA-3113 ?                   -> Section D

You may find it useful to scan the checklist in Section E at the end of this article. This covers some questions / issues relevant to all problem sections.

(A) ORA-3113 when attempting to STARTUP Oracle

  
  There are several phases involved in starting up a database. If ORA-3113
  occurs during startup, then abort the instance and start up using the 
  sequence below. If an error occurs at any step then see the related notes 
  below.
	a. Start any required services.		On error see A1
	   Eg: On Windows, start the OracleServiceSID
	b. Connect as a SYSDBA user.  		On error see A1
	   Eg:  sqlplus /nolog
		SQL> connect / as sysdba
	c. Startup nomount.  			On error see A1
	   Eg:
		SQL> startup nomount
	d. Mount the database. 			On error see A1 and A2 
	   Eg: 
		SQL> alter database mount;
	e. Recover the database 		On error see A3
	   Eg:
		SQL> recover database
	f. Open the database 			On error see A4
	   Eg: 	
		SQL> alter database open;
	g. Wait 3 minutes then issue a select.   On error see A4
	   Eg: 	
		SQL> select count(*) from DBA_OBJECTS;

A1) Errors connecting as SYSDBA or on startup nomount

	There is something fundamentally wrong with the software / environment
	if you cannot connect to SQL*Plus as a DBA user. 
	The steps here cover errors such as ORA-3113, ORA-12547: TNS:lost contact
	or similar errors connecting to Oracle or starting the instance NOMOUNT.  
	Check the following items:
	A1.1)	If possible reboot the server disabling any automatic
		startup of Oracle before you do so. This may seem drastic
		but helps make sure you are working from a consistent 
		starting point.
	A1.2) 	Check your environment points to the expected ORACLE_HOME
		and ORACLE_SID.
                Check the USER_DUMP_DEST and BACKGROUND_DUMP_DEST and default
                trace directories under this environment for any user trace 
                files or alert log entries generated. These may help indicate
                the cause of the problem. 
                Eg: ORA-600[SKGMINVALID] may indicate a problem with the
                    shared memory UNIX parameters on UNIX systems.
                Try to show that any trace file / alert log entry you 
                find is truly related to the "CONNECT" command by re-issuing
                the "connect" and checking for a new trace file / alert entry
                at the time of the error.
        A1.3)   UNIX only:
		Some UNIX platforms need LD_LIBRARY_PATH to be set 
                correctly to resolve any dynamically linked libraries.
                As the user with the problem:
                        % script /tmp/ldd.out
                        % id
                        % cd $ORACLE_HOME/bin
                        % ldd oracle
                        % exit
                If the 'ldd' command does not exist, go to the next step below.
                Check that all lines listed show a full library file. If there
                are any 'not found' lines reported contact Oracle Support
                with the output of /tmp/ldd.out .
        A1.4)   UNIX only:
		Your 'oracle' executable may be corrupt. Relink it via a script command, e.g.
                        Log in as the 'oracle' user.
                        % script /tmp/relink.out
                        % cd $ORACLE_HOME/rdbms/lib
			% mv $ORACLE_HOME/bin/oracle $ORACLE_HOME/bin/oracle.dd.mon.yy
                        % rm -f ./oracle        
                        % make -f ins_rdbms.mk ioracle
                        % exit
                Depending on the version, you may have to use the 'relink [all]' command
                instead of the above 'make' command
                Please also refer to Note 131321.1 - How to Relink Oracle Database Software on UNIX
                If this reports any errors Oracle Support will need to see the contents
                of the file /tmp/relink.out .
	
	A1.5)   If the error is on STARTUP NOMOUNT:
			Check the init.ora/spfile file used to start the database.
			This provides the configuration details used to
			configure the instance. To help isolate the problem,
			it may be useful to use a very basic init.ora/spfile
			when starting the instance. If this works then 
			parameters can be increased / introduced one at a 
			time to see if there is a problem with a particular
			setting.
                To correctly change the contents of an SPFILE, please refer to:
                Note 137483.1 - How to Modify the Content of an SPFILE Parameter File
	A1.6) 	Check for server side trace files which may give more 
		indication as to what the underlying problem is.
		See Section C for details on how to check
		for server trace files.
		
	A1.7) 	Ensure there is free disk space in:
		  a. Your USER_DUMP_DEST and BACKGROUND_DUMP_DEST locations
                      NOTE:
                      These parameters are ignored by the new diagnosability infrastructure 
                      introduced in Oracle Database 11g Release 1 (11.1), which places 
                      trace and core files in a location controlled by the DIAGNOSTIC_DEST 
                      initialization parameter.
                      Please refer to:
                      Note 422893.1 - 11g Understanding Automatic Diagnostic Repository (same for 12c)
		  b. Your AUDIT destination (Unix)
			The default is $ORACLE_HOME/rdbms/audit
        A1.8) Windows only:
              If the Server's sqlnet.ora file contains Authentication services 
              which are NOT reachable by Oracle, then an ORA-3113 error will
              result.
              For example, if the sqlnet.ora file contains the parameter:
              SQLNET.AUTHENTICATION_SERVICES = (NTS) and the Oracle database 
              is moved from a Windows NT Domain to an Active Directory one,
              or if a Domain Controller is introduced, then an error will result
              trying to start the database.
              Remove the sqlnet.authentication_services line so that Oracle 
              does not look for a non-existent KDC (Kerberos Key Distribution Center).
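
The library check in A1.3 can be partly automated once the 'ldd' output has been saved. The sketch below is an illustration only, assuming the output was captured to a file such as /tmp/ldd.out; the function name `check_ldd_output` is hypothetical.

```shell
# Report shared libraries that the dynamic linker could not resolve,
# given the saved output of 'ldd oracle' (for example /tmp/ldd.out).
# Exit status: 0 = no 'not found' lines, 1 = unresolved libraries were
# printed, 2 = the file could not be read.
check_ldd_output() {
    # $1: file holding the ldd output (hypothetical argument)
    if [ ! -r "$1" ]; then
        echo "cannot read $1" >&2
        return 2
    fi
    if grep -n 'not found' "$1"; then
        return 1
    fi
    return 0
}
```

A status of 1 corresponds to the case where the note asks you to contact Oracle Support with the contents of the output file.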

A2) Errors Mounting the database

	Check all the items in A1 first.
	If an error occurs when mounting the database there may be problems
	with the control files or datafiles, or with resources required to
	open these files.
	A2.1) 	The locations of the control files are specified in the 
		init.ora/spfile.  Try mounting using each control file in 
		turn.
		eg: "Shutdown abort", 
		    modify the init.ora/spfile to refer to ONE of the control files only, 
		    "startup nomount", 
		    "alter database mount"
		Repeat for each control file to see if any control file works.
		
	A2.2)	It is possible to re-create the control files if you know the 
		location of all datafiles and online logs, or to restore an old
		backup control file. Always back up the current control files before
		restoring any backup copies or issuing a CREATE CONTROLFILE
		command.
		The steps for this are not documented here.
        A2.3)   Linux/UNIX only: 
                The 'strace' command can be used to help trace how far Oracle 
                gets before the error occurs. UNIX platforms usually have a 'truss' 
                (or 'tusc') command.
                Eg:  
                        % truss -o /tmp/truss.out -f sqlplus
                Keep the file /tmp/truss.out safe - Oracle Support MAY need to see it.
                Please refer to:
                Note 110888.1 - How to Trace Unix System Calls

A3) Errors on RECOVER DATABASE

  	ORA-3113 during recover database is often related to a corruption on the
	database or redo stream which causes the shadow process to die. There should
	be a server side trace file produced for this sort of problem.
	See Section C for details on how to locate any trace files
	from both USER_DUMP_DEST and BACKGROUND_DUMP_DEST.
	A3.1)	If the "recover database" fails fairly quickly then it
		may help to collect the redo up to the point of failure as this
		may help identify where the problem is.
		Use the following commands just prior to the RECOVER DATABASE
		command:
		  SQL> alter session set max_dump_file_size=unlimited;
		  SQL> alter session set events
		  2> '10228 trace name context forever, level 10';
		  SQL> RECOVER DATABASE
		
		This causes redo information to be written to the user trace
		file. The last items of redo may help determine which file 
		has problems.
	A3.2)	If you do not have many datafiles in the database, it may be
		just as quick to recover each file in turn to see if this narrows
		down the problem.
		Eg: 
		  SQL> select name from v$datafile;
		
		and then for each file:
		  SQL> RECOVER DATAFILE 'full_file_name'
		If this gets to a problem file then back up the file and 
		use standard recovery options as if the file was lost.
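
For the file-by-file approach in A3.2, the list of RECOVER DATAFILE commands can be generated from the datafile names rather than typed by hand. This is a rough sketch under the assumption that the names from v$datafile have been saved one per line; the function name `gen_recover_commands` is hypothetical.

```shell
# Read datafile names on stdin (one per line, e.g. spooled from v$datafile)
# and print one RECOVER DATAFILE command per file for use in SQL*Plus.
# Blank lines are skipped. Names containing a single quote are not handled.
gen_recover_commands() {
    while IFS= read -r name; do
        [ -n "$name" ] || continue
        printf "RECOVER DATAFILE '%s'\n" "$name"
    done
}
```

For example, `gen_recover_commands < datafiles.txt` (file name assumed) prints the commands in the same order as the input, so the first command that fails points at the problem file.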

A4) Errors on ALTER DATABASE OPEN

	Database open performs very many operations, so it is necessary
	to collect any trace information before determining the next steps.
	However, the following may help isolate the problem more quickly:
	A4.1) 	Move files out of your USER_DUMP_DEST and BACKGROUND_DUMP_DEST
		directory as these steps will generate a lot of trace.
	A4.2) 	Edit the init.ora/spfile and add the following:
			event="10046 trace name context forever, level 12"
			event="10015 trace name context forever, level 1"
			event="10228 trace name context forever, level 1"
		If you already have "EVENT=" lines in the init.ora file
		this MUST go directly below the other "Event=" lines.
                For setting events in an spfile, please refer to Note 160178.1 - How To Set EVENTS In The SPFILE
		These lines will trace:
			SQL and BIND activity during startup
			REDO applied
			Information about transactional rollback required
	A4.3) 	Startup the instance as described at the top of this section.
		
		As soon as the error occurs REMOVE the above events from the init.ora/spfile
		file and shutdown. Collect together the trace files and alert logs
		as described in Section C
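
A4.3 asks for the diagnostic events to be removed as soon as the error has been captured. As a minimal sketch, assuming a text init.ora (a binary spfile must not be edited this way), the three event lines from A4.2 can be filtered out as follows; the function name `strip_startup_trace_events` is hypothetical.

```shell
# Print a text init.ora with the three diagnostic event lines from A4.2
# (10046, 10015, 10228) removed. Every other line is kept, including any
# other event= settings. The match on "event" is case-insensitive.
strip_startup_trace_events() {
    # $1: path to a text init.ora (hypothetical argument)
    grep -v -i -E '^[[:space:]]*event[[:space:]]*=[[:space:]]*"(10046|10015|10228) trace name context forever' "$1"
}
```

Writing the result to a new file and comparing it with the original before swapping it in keeps the change easy to review.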

ORA-3113 issues at startup

Note 422646.1 - ORA-03113: End-of-file on Communication Channel Upon Startup of Database
Note 466056.1 - Database Startup Fails With ORA - 3113
Note 311166.1 - ORA-3113 and ORA-7445 in ksihsmrini during STARTUP NOMOUNT
Note 811656.1 - Database startup fails with ORA-1041 or ORA-3113 When an Event Is Set
Note 435989.1 - STARTUP UPGRADE FAILS WITH ORA-03113 AND ORA-00600[KCCCHB_3] IS REPORTED IN ALERT
Note 360834.1 - Consecutive shutdown/startup results in Ora-3113 when non-default NLS is used
Note 810046.1 - Database Startup Fails With ORA-03113 While ORA-00600[4:kgstmLdiToMicroTs] Appears In Alert Log
Note 1498721.1 - Startup gives ORA-3113 and Alert Log file not being appended

(B) ORA-3113 attempting to connect to the database using Oracle Net

   Oracle Net (Net8 or SQL*Net2) should report network related errors if a problem
   occurs whilst establishing a connection to a remote Oracle shadow process. 
   An ORA-3113 implies that the connection itself has been established but then 
   is lost. As such, follow the steps in Section C.

(C) Client sees ORA-3113 running SQL / PLSQL

   
   If the ORA-3113 error occurs AFTER you have connected to Oracle then
   it is most likely that the 'oracle' executable has terminated unexpectedly.
        C1)     Determine which database you were connected to and 
                obtain the following init.ora/spfile parameter values:        
                        Parameter               Default
                        ~~~~~~~~~               ~~~~~~~
                        USER_DUMP_DEST          $ORACLE_HOME/rdbms/log
                        BACKGROUND_DUMP_DEST    $ORACLE_HOME/rdbms/log
                        CORE_DUMP_DEST          $ORACLE_HOME/dbs
                Eg: To find these log into SQL*Plus and issue:
                        SQL> show parameter dump
                NOTE:
                For changed locations in Oracle 11g and 12c, please refer to
                Note 422893.1 - 11g Understanding Automatic Diagnostic Repository (same for 12c).
        C2)     Check in your 'USER_DUMP_DEST' for any Oracle trace files.
                It is important to find the correct trace file. 
		On UNIX: 
		  Use the command 'ls -ltr' to list files in time order with the 
                  latest trace files appearing LAST. The trace file will typically 
		  be of the form '<SID>_ora_<PID>.trc'.
		On Windows:
		  Click on the "Modified" column in Windows Explorer to sort the 
		  files by their modified date. Files will typically be of the form
		  'ORA<PID>.TRC'.
		
		If you are not sure which trace file may be relevant MOVE all 
		the current trace files to a different directory and reproduce 
		the problem.
        C3)     Check in your 'BACKGROUND_DUMP_DEST' for your alert log and
		any other trace files produced close to the time of the error.
		It should be named 'alert_<SID>.log'.		
                For changed locations in Oracle 11g and 12c, please refer to
                Note 438148.1 - Finding alert.log file in 11g (same for 12c)

        C4)     UNIX only:
		If there is no trace file, check for a 'core' dump in the
                CORE_DUMP_DEST. Check as follows:
                        % cd $ORACLE_HOME/dbs   # Or your CORE_DUMP_DEST
                        % ls -l core*
                If there is a file called 'core' check that its time matches the 
                time of the problem. If there are directories called 
                'core_<PID>', check for core files in each of these. It is 
                IMPORTANT to get the correct core file. Now obtain a stack
                trace from this 'core' file. Check each of the sequences below
                to see how to do this - one of these should work for your 
                platform.
                Please refer to Note 1812.1 - TECH: Getting a Stack Trace from a CORE file on Unix
                    If you have dbx:
                        % script /tmp/core.stack
                        % dbx $ORACLE_HOME/bin/oracle core
                        (dbx) where
                        ...
                        (dbx) quit
                        % exit
                        
                    If you have sdb:
                        % script /tmp/core.stack
                        % sdb $ORACLE_HOME/bin/oracle core
                        * t
                        ...
                        * q
                        % exit
                        
                    If you have xdb:
                        % script /tmp/core.stack
                        % xdb $ORACLE_HOME/bin/oracle core
                        (xdb) t
                        ...
                        (xdb) q
                        % exit
                    If you have adb:
                        % script /tmp/core.stack
                        % adb $ORACLE_HOME/bin/oracle core
                        $c
                        ...
                        $q
                        % exit
        C5)     Try to isolate the SQL command that is executing when
                the error occurs. Eg: Is it a particular SQL statement
                or PL/SQL block that causes the error ?
		In many cases this will be listed in the trace file
		produced under the heading "Current SQL statement", or
	 	near the middle of the trace file under the cursor referred
		to by the "Current cursor NN" line.
		If the trace file does not show the failing statement 
		then SQL_TRACE may be used to help determine this, provided
	  	the problem reproduces. SQL_TRACE can be enabled in most
		client tools:
                Eg: Product     Action
                    ~~~~~~~     ~~~~~~
                    SQL*Plus    Issue 'ALTER SESSION SET SQL_TRACE = TRUE;'
                    Pro*        EXEC SQL ALTER SESSION SET SQL_TRACE = TRUE;
                This should force a server side SQL trace file as detailed
                in C2 above. The trace file should give a clue as to what
                SQL was being executed.
        C6)     If no trace file can be found and the problem is reproducible
		then a SQL*Net trace may help to show what the latest operation sent 
		to the 'oracle' process was. 
                For Oracle Net (SQL*Net V2) tracing please refer to
                Note 16564.1 - TECH: SQL*Net V2 on Unix - A Quick Guide to Setting Up Client Side Tracing
        C7)     Based on the information collected above try to put together a small
                test case which will reproduce the problem. This is important
                for two reasons:
                        a) It gives Oracle Support a small test case if the
                           problem does not look like a known problem.
                
                        b) It gives you a simple way to check if any patch
                           supplied will fix the problem.
	C8)	If a statement can be isolated which consistently raises an
		ORA-3113 error then it is worth spending some time collecting 
		additional information, such as:
			
			- An execution plan for the statement
			- Table definitions, column definitions
			- Information on constraints, triggers etc..
		ie: Any additional information about the statement which fails.
		    eg: If a SELECT fails then it may succeed if run under a 
			different optimizer mode.
	C9) 	Check if your server Administrator has any scripts which abort
		long running or CPU intensive processes. An ORA-3113 error
		can occur if someone kills your Oracle shadow process at the O/S
		level (Eg: kill -9 on UNIX).
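
Steps C2 to C4 depend on finding the trace, alert log, or core file written at the time of the error. The sketch below is one possible way to narrow the search on UNIX; it assumes a 'find' that supports -mmin (GNU and BSD do, but it is not in POSIX), and the function name `recent_diag_files` is hypothetical.

```shell
# List trace files, alert logs and core files under a dump directory that
# were modified within the last N minutes, so that they can be matched
# against the time of the ORA-3113.
recent_diag_files() {
    # $1: dump directory, e.g. the USER_DUMP_DEST value (hypothetical argument)
    # $2: look-back window in minutes (defaults to 60)
    find "$1" -type f \( -name '*.trc' -o -name 'alert_*.log' -o -name 'core*' \) \
        -mmin "-${2:-60}" -print
}
```

For example, `recent_diag_files "$ORACLE_HOME/rdbms/log" 15` lists candidates from the last quarter of an hour under the default dump location named in C1.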

Existing connections failing with ORA-3113

Note 1300824.1 - ORA-3113 ORA-3114 or ORA-12151 when using DCD.
Note 1104673.1 - Connections To Database Terminate With ORA-3113 end-of-file on communication channel
Note 578940.1 - ORA-03113 And ORA-03114 While Running Utlrp.sql
Note 413922.1 - ORA-03113 Error When Executing Utlrp.sql
Note 1125213.1 - ORA-03113, ORA-03114, ORA-01041 While Running "adadmin", "adpatch" or "adutlrcmp.sql"
Note 1096687.1 - RMAN Duplicate or Restore fails with ORA-3113 and ORA1403 on new Host
Note 603714.1 - 10.2.0.4 Catupgrd.sql Fails With ORA-03113 Creating SYS.KU$_XMLSCHEMA_VIEW
Note 1208712.1 - ORA-03113 ORA-07445 Errors When Trying To Create Materialized View

(D) Server trace reports ORA-3113

	D1) 	It is unusual for a server trace to report ORA-3113.
		However, this can occur if the server loses contact with 
		the client OR a database link connection.
		Treat this the same as an ORA-3113 in a client process
		and follow the steps in Section C.
	D2) 	The following event may be added to the init.ora/spfile
		to help collect maximum information when the error occurs:
			event="3113 trace name errorstack level 3"
		If you already have "EVENT=" lines in the init.ora file
		this MUST go directly below the other "Event=" lines.
                For setting events in an spfile, please refer to Note 160178.1 - How To Set EVENTS In The SPFILE

(E) Additional Checks / Information

        E1)     Is it just this one tool that encounters the error or 
                do you get ORA-3113 from any tool doing a similar operation ?
                If the problem reproduces in SQL*Plus use this in all tests 
                below.
        E2)     Linux/UNIX only:
		Check if the problem is just restricted to:
                        [ ] One particular Linux/Unix user,
                        [ ] Any Linux/Unix user 
                    or  [ ] Any Linux/Unix user EXCEPT the Oracle user.
        E3)     Check if the problem is just restricted to:
                        [ ] One particular ORACLE logon
                    or  [ ] Any ORACLE logon that has access to the 
                                relevant tables.
        E4)     If this is a client-server set up, does this occur from:
                        [ ] Any client 
                        [ ] Just one particular client 
                    or  [ ] Just one group of clients ?
                            If so what do these clients have in common ?
                            Eg: Software release .
                
        E5)     Do you have a second server or database version where the
                same operation works correctly ?
	E6) 	Ensure there is free disk space in:
		  a. Your USER_DUMP_DEST and BACKGROUND_DUMP_DEST locations
                      (or DIAGNOSTIC_DEST for 11g and 12c).
		  b. Your AUDIT destination (Linux/Unix)
			The default is $ORACLE_HOME/rdbms/audit
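
The free-space check in E6 can be scripted so that it is easy to repeat for each destination. The following is a sketch under stated assumptions: the function names `free_kb` and `has_free_space` are hypothetical, the threshold is whatever you consider sufficient, and filesystem names containing spaces are not handled.

```shell
# Print the free space, in kilobytes, on the filesystem holding a directory.
# 'df -P' selects the POSIX portable output format; -k reports 1024-byte
# blocks. The fourth column of the data line is the available space.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Return 0 if the directory has at least the requested free space, else 1.
has_free_space() {
    # $1: directory to check; $2: required free kilobytes (hypothetical threshold)
    avail=$(free_kb "$1")
    [ -n "$avail" ] && [ "$avail" -ge "$2" ]
}
```

For example, `has_free_space "$ORACLE_HOME/rdbms/audit" 102400 || echo "audit destination is low on space"` checks the default audit destination against an assumed 100 MB threshold.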

REFERENCES

NOTE:422893.1 - Understanding Automatic Diagnostic Repository.
NOTE:1812.1 - TECH: Getting a Stack Trace from a CORE file on Unix
NOTE:110888.1 - How to Trace Unix System Calls
NOTE:16564.1 - TECH: SQL*Net V2 on Unix - A Quick Guide to Setting Up Client Side Tracing
NOTE:137483.1 - How to Modify the Content of an SPFILE Parameter File
NOTE:438148.1 - How to Find the alert.log File (11g and Later)
NOTE:131321.1 - How to Relink Oracle Database Software on UNIX
NOTE:883299.1 - Oracle 11gR2 Relink New Feature

About Me

  ........................................................................................................................  
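  ● Author: 小麦苗. Some content is compiled from the web; if anything infringes your rights, please contact 小麦苗 to have it removed  
  ● This article is kept in sync on itpub (  http://blog.itpub.net/26736162  ), cnblogs (  http://www.cnblogs.com/lhrbest  ) and the author's weixin official account (  xiaomaimiaolhr  )  
  ● itpub address of this article:  http://blog.itpub.net/26736162   
  ● cnblogs address of this article:  http://www.cnblogs.com/lhrbest   
  ● PDF version of this article, author profile and 小麦苗 cloud drive address:  http://blog.itpub.net/26736162/viewspace-1624453/   
  ● Database written-test and interview question bank with answers:  http://blog.itpub.net/26736162/viewspace-2134706/   
  ● DBA宝典 Toutiao account:  http://www.toutiao.com/c/user/6401772890/#mid=1564638659405826   
  ........................................................................................................................  
  ● QQ groups:  230161599  (full),  618766405   
  ● weixin group: add me on weixin and I will invite you to the group; serious inquiries only  
  ● To contact me, add QQ  646634621  and state your reason for adding  
  ● Written in Shanghai between 2018-12-01 06:00 and 2018-12-31 24:00  
  ● Last modified: 2018-12-01 06:00 ~ 2018-12-31 24:00  
  ● The content comes from 小麦苗's study notes, partly compiled from the web; please excuse any infringement or inaccuracy  
  ● All rights reserved. Sharing is welcome; please keep the source attribution when reposting  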
  ........................................................................................................................  