[Release:
11.2 to 11.2]
Information in this document applies to any platform.
Oracle Server Enterprise Edition - Version: 10.1.0.3 to
10.1.0.5
Oracle Server Enterprise Edition - Version: 10.2.0.1 to
10.2.0.4
Oracle Server Enterprise Edition - Version: 11.1.0.6 to 11.1.0.7
Purpose
Streams capture process can exist in different states. The
different states of the capture process is explained in the
following note.
Note 471713.1 Different States of Capture &
Apply Process
This article provides troubleshooting steps when capture is
stuck in INITIALIZING/DICTIONARY INITIALIZATION, WAITING FOR
DICTIONARY REDO: FIRST SCN ;
WAITING FOR REDO: LAST SCN MINED
states.
Last Review
Date
March 5, 2012
Instructions for the Reader
A
Troubleshooting Guide is provided to assist in debugging a specific
issue. When possible, diagnostic tools are included in the document
to assist in troubleshooting.
Troubleshooting Details
Capture process does not advance
and appears stuck in one of the following states
:
select * from v$streams_capture shows :
1. INITALIZING / DICTIONARY INITIALIZATION (the state
alternates between these states), or
2. WAITING FOR DICTIONARY REDO: FIRST SCN , or
3. WAITING FOR REDO: LAST SCN MINED , or
4. WAITING FOR DICTIONARY REDO: FILE
The nature of the problem could be :
- a missing logfile; or
- a logfile is not registered; or
- a logfile is corrupted; or
- Capture is verifying/prechecking logfiles
The process will remain in this state until the log is located, it
is registered, the corruption is resolved in which case it will
also be necessary to reregister the log or the Capture process has
checked the logfiles on disk
CAUSE
Common issues which can cause Capture to stop in this state are
:
1.Incorrect use of RMAN backup command
or RMAN Issue
Only certain RMAN commands are Streams aware.
There are also a number of RMAN related issues where logfiles can
be deleted when Streams may still need to access these.
Refer to Note
421176.1 for more details.
2.Logs Stored in FRA - Flash Recovery
Area
It is recommended that the FRA should be avoided with
Streams.
Archive log files can be deleted and ORA-1291 reported when archive
logs are located in a Flash Recovery Area (FRA). More details can
be found with respect to FRA in Note 305648.1. This is a recognised behaviour and for this
reason it is recommended that archive logs not be located in a FRA
where a Capture process will mine from this location.
This is documented in the Oracle Streams Concepts and
Administration 10g Release 2 (10.2) guide under the section
Troublshooting Capture Problems subsection : Are Required Redo Log
Files Missing.
It should also be pointed out that archive logs may be deleted from
the FRA when the instance is shutdown and restarted. Those archive
logfiles deleted logs should be reported in the alert.log file.
3.Logs not registered
in dba_registered_archived_log
This issue is more likely at a downstream database. This could
be due to transport issues from upstream to downstream
database.
4.Logs marked as
corrupted in system.logmnr_log$
Manual intervention will likely be necessary to correct
this.
5.required_checkpoint_scn
issues
Logs may need to be manually deleted which can then affect
Capture restart if the required_checkpoint_scn (dba_capture view)
is not advancing. This is a key value for Capture as discussed
below.
6.Capture Prevalidation of Logs in
dba_registered_archived_log
The Capture process has to access all the logs in
dba_registered_archived_log before it can start mining changes.
Thus, in the case of message :WAITING FOR
DICTIONARY REDO: FILE, it may be that logs are being accessed but
the state of the capture process has not been updated to indicate
that it is prevalidating logs.
7.Logs deleted in error or damaged due
to OS/Storage issues
The following is useful background which should be used to help
understand what logs are relevant.
In a non RAC environment, all logs from the logfile containing the
required_checkpoint_scn have to be present on disk.
In a RAC environment, all logs from all threads have to be present
with respect to the required_checkpoint_scn; this is also
applicable for downstream Capture environments where the primary /
upstream database is RAC related but the downstream database may be
single instance.
These logfiles are also added to the dba_registered_archived_log
view.
Before Capture starts mining changes , all the logs in the
dba_registered_archived_log have to be located. On downstream
Capture environments, this can present confusion as the Capture
process may appear to be stuck / not moving (due to large numbers
of logs in the view) whilst in fact it is actually accessing log
files. This is more of an issue for Downstream Capture environments
where a Capture process may be down for some time but logs are
still being registered by the RFS processes associated with the
primary database.
The logfiles that a Capture process requires on restart as
specified by the required_checkpoint_scn can be located from the
Streams Healthcheck report output - Note 273674.1.
Refer to the section headed :
'++ Minimum Archive Log Necessary to Restart Capture
++'
Note: the same information can also be obtain
from
8.Capture can stuck due to missing
archive logs
Logs are registered in dba_registered_archived_log.
Logs does not exist in the archive location or damaged.
SOLUTION
In order to determine the problem
logfile(s) and understand what is going on please
provide to Oracle Suport Services the Streams Healthcheck output
(Note
273674.1)) and the alert.log file(s) from Upstream (if
relevant) and downstream databases.
Note: please provide details from all nodes.
Additionally, please collect the following:
1.SCN reported in the
Capture state related message - determine logs the
SCN relates to
connect / as sysdba
set pagesize 1000
col first_scn format 999999999999999999
col next_scn format 999999999999999999
alter session set nls_date_format='dd-mon-yyyy hh24:mi:ss';
select
source_database,thread#,sequence#,name,modified_time,first_scn,next_scn,dictionary_begin,dictionary_end
from dba_registered_archived_log where
reported> between first_scn and
next_scn;
Note : If the reported is the corresponds to
the next_scn SCN value for a thread then it is likely the problem
is with the next log in that thread; it may be missing or not
registered.
2.Check Capture
tracefile to understand if mising
logfile detailed
If no trace file is present,
identify the Capture and Reader processes from the following :
connect / as sysdba
set pagesize 1000
-- Capture
define capture_name=''
select p.spid Spid, 'C00'||c.capture#||' '||upper(lp.role) "Capture
Process ", c.capture_name "Capture Name", p.pga_used_mem "PGA
Memory Used", p.pga_alloc_mem "PGA Memory Allocated", p.pga_max_mem
"PGA Maximum Memory"
from v$streams_capture c, v$logmnr_process lp, v$session s,
v$process p
where c.logminer_id = lp.session_id
and lp.role in ('reader','preparer','builder')
and lp.sid = s.sid
and lp.serial# = s.serial#
and s.paddr = p.addr
and c.capture_name = '&capture_name'
union
select p.spid, 'C00'||c.capture#||' Coordinator', c.capture_name,
p.pga_used_mem, p.pga_alloc_mem, p.pga_max_mem
from v$streams_capture c, v$session s, v$process p
where c.sid = s.sid
and c.serial# = s.serial#
and s.paddr = p.addr
and c.capture_name = '&capture_name'
order by 6,5;
Use the SPID (OS Process or thread id) for both the Capture process
and Reader process as follows :
Note: it may be that a reader process does not
exist at the point a particular state is being reported.
connect / as sysdba
-- Capture BOTH Process and Reader Process :
oradebug setospid
-- event 1349 / level 32768 decimal
oradebug event 1349 trace name context forever, level 32768
oradebug setospid -- event 1349 / level 32768 decimal oradebug event 1349 trace name context forever, level 32768 -- wait 5 minutes, disable trace :
oradebug event 1349 trace name context off
oradebug setospid oradebug event 1349 trace name context off
Check the background_dump_dest location for trace files containing
the process ids (spids) from the above. The trace
files should identify the thread / sequence number and the
problematic logfile which Capture is looking for.
Note: ensure that you have the latest
patchset installed otherwise relevant trace information may not be
present.
If it is still not obvious what the file being accessed is specify
more complete event trace :
oradebug event 1349 trace name
context forever, level 26148863
-- wait 5 minutes, disable trace :
oradebug event 1349 trace name context off
For example, in a Capture trace file you may see something like the
following :
krvxalfs: Error 308 raised while opening log
/bugmnt21/em/celclnx8/tar20093554.6/app/oracle/product/10.2.0.3/dbs/arch1_35_677241985.dbf.
i.e ora-308 indicates : missing logfile
3.Determine if there are logfiles
unregistered
If there is a gap in the registered logfiles, the
Capture process will stop.
Specify the name of your capture process below as well as thread
number, i.e repeat for each thread used.
Note : The plsql block will need be be run
more than once for RAC configurations.
connect / as sysdba
set serveroutput on
declare
-- amend the following as required.
user_thread number := 1; -- change as necessary
user_capture varchar2(100) := '';
--change as necessary
rcscn number;
minseq number;
maxseq number;
rseq number;
cnt number;
cursor mlsns is select user_thread thread, sequence# from (
select (lvl + minseq) sequence# from (select * from (select level
lvl from dual connect by level <= maxseq -
minseq))
minus
select sequence# from dba_registered_archived_log where
thread#=user_thread)
order by 1;
begin
select required_checkpoint_scn into rcscn from dba_capture where
capture_name = user_capture;
select min(sequence#) into rseq from dba_registered_archived_log
where thread# = user_thread and consumer_name = user_capture and
rcscn between first_scn and next_scn;
select min(sequence#) into minseq from dba_registered_archived_log
where thread# = user_thread and consumer_name = user_capture and
sequence# >= rseq;
select max(sequence#) into maxseq from dba_registered_archived_log
where thread# = user_thread and consumer_name = user_capture;
select count(*) into cnt from (
select (lvl + minseq) sequence# from (select * from (select level
lvl from dual connect by level <= maxseq -
minseq))
minus
select sequence# from dba_registered_archived_log where
thread#=user_thread) ;
if (cnt is not null) then
for rec in mlsns loop
dbms_output.put_line('Log not registered ' ||rec.sequence#);
end loop;
else
dbms_output.put_line('There are no gaps in registered
logs.');
end if;
end;
/
For any logfile which are not registered, ensure that these are on
disk and register these using :
alter database register or replace
logical logfile '&fqfn' for
'&capture_name';
4.Determine if there
are logs marked as corrupt
A corrupt archive log file is indicated by the contents field of
system.logmnr_log$ having value 16.
connect / as sysdba
set pagesize 1000
select * from system.logmnr_log$ where contents = 16 order by
sequence#;
If any logs as marked as corrupted , proceed as follows :
- dump out the logfile; the dump pf the logfile will appear in the
user_dump_dest location.
connect / as sysdba
oradebug setmypid
oradebug unlimit
alter session set events '1354 trace name context forever, level
32768';
alter session set events '1348 trace name context forever, level
1032';
alter system dump logfile '&fqfn';
It may be that logfile will need to be restored or copied once
again from the primary.
One the above has been done the logfile can be reregistered as
follows :
- take a backup/copy of system.logmnr_log$ :
connect / as sysdba
create table system.logmnr_log$_bak as select * from
system.logmnr_log$;
- delete the entry in system.logmnr_log$ which is marked as
corrupted :
delete from system.logmnr_log$
where contents = 16 and thread# = and sequence# = ;
commit;
- reregister the logfile once again as detailed
above.
5.RMAN - Recovery Manager can be used to
identify archivelog files which may be missing
Note: this can only be used on upstream
environments - not downstream databases. This is due to the fact
that no controlfile entry is created for a registered logfile in
the downstream database.
For Oracle 10.2, logfiles being mined by a Capture process are
also logged to the alert.log. This information together with the
Capture restart information from the Streams Healthcheck as well as
logfile validation information reported by RMAN can be useful in
identifying those logfile(s) that are missing.
The RMAN command 'crosscheck archivelog' command can check the
physical presence of an archive log file.
Note: you do not have to be familiar with RMAN
for backup and restore or have any experience of RMAN to use the
commands detailed below.
The approach when using RMAN is as follows :
- First of all determine where a Capture process will start from it
were restarted. This would be a convenient point to specify where
to check the existence of logs from.
- Record the restart SCN , the Thread number and Log Sequence
number.
Note: The Thread number and the Log Sequence
number should typically be visible in the logfile name. This may
not be the case however and will depends on the setting of the
parameter : log_archive_format.
Use either the restart Capture SCN (YYYYYYYYY) or the Log Sequence
(ZZZZ) number from which to start the crosscheck validation. The
thread number can also be specified on some commands (X); in order
to understand if RAC or multiple threads are present issue :
select * from
gv$instance;
Having performed the above, start RMAN
:
Note :
1. If you specify a logfile you will receive no RMAN command output
to the screen. Therefore, whilst getting to grips with RMAN you
would be advised not to specify a logfile. Please also use set echo
on so that reevant commands can be seen in log file.
2. for problems relating to the running of RMAN , refer to the
Backup and Recovery Advanced User's Guide.
-- start rman without specifying a
logfile.
rman target /
-- start rman and specify a logfile to record command output
rman target / log /tmp/crosscheck.log
Useful commands to allow logs on disk to be crosschecked
on disk are as follows :
-- ensure command used is echoed to log file.
set echo on
- crosscheck archivelog all;
(crosscheck all logs on disk)
- crosscheck archivelog from sequence ZZZZ thread X;
(* crosscheck all the logs on disk from the sequence number)
- crosscheck archivelog from scn = YYYYYYYYYY;
(* crosscheck all the logs on disk from the scn)
- crosscheck archivelog like '%%';
(crosscheck all the logs on disk from a particular location)
- crosscheck archivelog like '%%';
(crosscheck that a specific logfile exists)
* above allow the starting point of a the crosscheck command to
be specified. Therefore , these should be used together with
restart information detailed in the Streams healthcheck.
Consider the following example where the Streams Healthcheck
reports that the capture process will restart from :
Capture will restart from SCN
2381906 in the following file:
/la2/1_228_657120968.dbf
(date/time)
/la1/1_227_657120968.dbf
(date/time)
The following commands can be used in order to determine whether
logs
are available on disk or otherwise.
rman target /
set echo on
- crosscheck archivelog all;
- crosscheck archivelog from sequence 227 thread 1;
- crosscheck archivelog from scn = 2381906;
- crosscheck archivelog like '%la1%';
- crosscheck archivelog like '%1_227_657120968.dbf%';
Checking RMAN output
If a logfile is not found, RMAN will report : 'validation failed
for archived log' error.
Note: This is really only significant if a
log is missing after the required_checkpoint_scn where a Capture
process may restart from. If a logfile is found, RMAN will report
:
'validation succeeded for archived log'
....
archive log filename=
/arch1_127_657120968.dbf recid=127 stamp=673799334
validation failed for archived log
archive log filename=
/arch1_128_657120968.dbf recid=128 stamp=673806193
validation succeeded for archived log
...
To exit RMAN, type :
exit
Search through the logfile for 'failed' keyword. This will
hopefully identify the problematic logfile.
i.e (on Unix) grep -i failed /tmp/crosscheck.log
If all logfiles are validated and RMAN performed the crosscheck
validation from the location that Capture would restart from , then
the logs missing may likely come after the last logs reported by
RMAN. Further inspection may be required.
6.Restore the archive
logs
Get all the archive logs required for the capture process to get
restarted :
A) When MISSING archive logs CAN be
restored.
Refer : Doc ID 290143.1 "Minimum Archived Log Necessary to Restart 10g and 11g
Streams Capture Process"
The output of the above query will give you
the name of the archive logs which are required for the capture
process.
Make sure all the archives are available to
the capture process from this archive sequence number.
Restore the archive logs from this sequence and register them for
the capture process to progress.
SQL>alter database register or replace logical
logfile '&fqfn' for
'&capture_name';
B) When MISSING archive logs CANNOT be
restored.
In case of restoration of the missing archive logs is not possible
due to unavailability of the backup of the old archive logs, Then
we need to recreate the capture process.
NOTE: There is no way that the capture process can skip these
missing archive logs and jump to the next available archive log.
Streams gets stuck when the archive logs are not available.
Refer Doc ID 471695.1 to recreate the capture process.
Still have questions ?
To discuss this information further with
Oracle experts and industry peers, we encourage you to review, join
or start a discussion via My Oracle Support Streams and Distributed Database Community
Enjoy a short Video about Oracle´s Support Communities - to quickly
understand it´s benefits for you right now (http://bcove.me/tlygjitz)
References
BUG:5770059
- REQUIRED_CHECKPOINT_SCN NOT MOVING FORWARD
BUG:6154377 - NO LOGMNR CHECKPOINTS ARE BEING EXECUTED
NOTE:290143.1 - Minimum Archived Log Necessary to Restart 10g
and 11g Streams Capture Process
NOTE:471695.1 - Required Steps to Recreate a Capture
Process
相关内容
产品
Oracle Database Products > Oracle Database
> Oracle Database > Oracle Server -
Enterprise Edition
Oracle Database Products > Oracle Database
> Oracle Database > Oracle Server -
Enterprise Edition
Oracle Database Products > Oracle Database
> Oracle Database > Oracle Server -
Enterprise Edition
Oracle Database Products > Oracle Database
> Oracle Database > Oracle Server -
Enterprise Edition
关键字
ARCHIVED LOGS; ARCHIVELOG; AREA; CAPTURE PROCESS;
DBA_REGISTERED_ARCHIVED_LOG; DICTIONARY INITIALIZATION; FIRST_SCN;
FLASH RECOVERY AREA; FRA; REDO; RMAN; STREAMS; WAITING FOR
DICTIONARY REDO; WAITING FOR REDO
错误
ORA-308; ORA-1291