impdp tuning parameters: a complete reference

DATA_OPTIONS = TRUST_EXISTING_TABLE_PARTITIONS

data_options=disable_append_hint

TABLES=<SCHEMA_NAME>.<TABLE_NAME>:<PARTITION_NAME>
PARTITION_OPTIONS=DEPARTITION
REMAP_TABLE=<SCHEMA_NAME>.<TABLE_NAME>:<PARTITION_NAME>:<TABLE_NAME>
CONTENT=ALL
SKIP_UNUSABLE_INDEXES=YES
DIRECTORY=EXPDP
DUMPFILE=<DUMP_NAME>_%U.dmp
LOGFILE=<LOG_NAME>.log
PARALLEL=8
DATA_OPTIONS=skip_constraint_errors

-- logtime=all: add a timestamp to every logged message
impdp test/test LOGTIME=ALL DIRECTORY=dpump1 DUMPFILE=export.dmp SCHEMAS=test

--- DIRECT_PATH is not effective for large tables because it cannot run in parallel

%expdp system/<password> ... ACCESS_METHOD=DIRECT_PATH 
%expdp system/<password> ... ACCESS_METHOD=EXTERNAL_TABLE 
%impdp system/<password> ... ACCESS_METHOD=DIRECT_PATH 
%impdp system/<password> ... ACCESS_METHOD=EXTERNAL_TABLE 

Both 11g and 12c recommend dropping the table before running impdp, but 12c adds the DATA_OPTIONS=TRUST_EXISTING_TABLE_PARTITIONS parameter.

----

Goal

How to limit the amount of UNDO generation during an Import Data Pump job?

Solution

1. The most common cause of high UNDO generation during an IMPDP job is that the table pre-exists in the target database and one or more indexes exist on that table. During the import job, those indexes will be maintained for every row inserted. To resolve that problem, ensure that the table that needs to be imported does not pre-exist in the target database, but let import create the table.

This means: do not use the TABLE_EXISTS_ACTION=APPEND or TABLE_EXISTS_ACTION=TRUNCATE parameter for the Import Data Pump job unless this is really required.

2. If a large amount of data needs to be imported into an existing table and you have to use the TABLE_EXISTS_ACTION=APPEND, then:
- run an export of the pre-existing table in the target database so you can restore any data and DDL of related objects if needed;
- check if there are triggers on the existing table in the target database; if there are any, consider dropping them if possible;
- ensure there is no primary key (index) on the table (if the objective is to avoid duplicate rows, then you must keep the primary key constraint - this will however cause undo generation for the related index during the import job);
- check if there are other indexes on the existing table in the target database; if there are any, consider dropping them if possible (note that impdp with EXCLUDE=INDEX won't skip index maintenance of any existing indexes during the import job);
- import the data into the pre-existing table in the target database;
- re-create the index(es), primary key and trigger(s) (a sketch of this whole flow follows below).
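A minimal sketch of that flow, using purely hypothetical names (schema SCOTT, table BIG_TAB, index BIG_TAB_IX1, directory DPDIR, dump files BIG_TAB_%U.dmp):

-- 1. Safety export of the pre-existing table in the target database
expdp system/<password> DIRECTORY=DPDIR DUMPFILE=big_tab_backup.dmp TABLES=SCOTT.BIG_TAB

-- 2. Disable triggers and drop secondary indexes so they are not maintained row by row
ALTER TABLE scott.big_tab DISABLE ALL TRIGGERS;
DROP INDEX scott.big_tab_ix1;
-- keep the primary key constraint only if duplicate rows must be prevented

-- 3. Import the data into the pre-existing table
impdp system/<password> DIRECTORY=DPDIR DUMPFILE=BIG_TAB_%U.dmp TABLES=SCOTT.BIG_TAB TABLE_EXISTS_ACTION=APPEND

-- 4. Re-create / re-enable what was removed
CREATE INDEX scott.big_tab_ix1 ON scott.big_tab (col1);
ALTER TABLE scott.big_tab ENABLE ALL TRIGGERS;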

------------------------- Parameter reference --------------

To provide a list of DataPump and RDBMS specific parameters that can significantly affect the speed of DataPump API based operations.

Solution

Data Pump Performance Affecting Parameters


At the Data Pump API level:

ACCESS_METHOD=...

In some cases the access method chosen by the Data Pump API is not the fastest one for your data set.
You will not know this unless you explicitly try each of the available access methods.
In other words, trial-and-error testing is the only way to know which method will work best for your data set.
The access methods currently available to choose from are DIRECT_PATH and EXTERNAL_TABLE.


COMPRESSION=ALL

The expdp job is slower when compression=all is used. Data is parsed twice when compression is enabled on DataPump.
Please refer to Document 2327569.1 - Why Does DataPump Export (EXPDP) With COMPRESSION=ALL Take Longer Than A DataPump Job With Default Compression?.


CLUSTER=N

In a RAC environment it can improve the speed of Data Pump API based operations.

Note that since this parameter is limited to Data Pump API operations only, in a RAC environment I would first recommend setting this parameter to N. Only if setting it to N has no effect would I also test setting the database parameter PARALLEL_FORCE_LOCAL to TRUE, since PARALLEL_FORCE_LOCAL has a wider scope of effect than just Data Pump API based operations.
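As a sketch (hypothetical directory, dump file and schema names):

impdp system/<password> DIRECTORY=DPDIR DUMPFILE=exp_%U.dmp SCHEMAS=SCOTT CLUSTER=N PARALLEL=4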


DATA_OPTIONS=DISABLE_APPEND_HINT

This impdp only parameter, under very specific conditions, can be safe to use to possibly reduce the time of importing data.

All of the conditions listed below must be met to be safely using the DISABLE_APPEND_HINT option of the DATA_OPTIONS parameter ...

Condition 1:

The import operation is importing data into pre-existing tables, partitions or sub-partitions.
The term "pre-existing" is to mean in this case that the objects already exist at the time that the import job is started.

Condition 2:

The number of pre-existing objects being imported into is very small (Presumed to be 10 or less in number).

Condition 3:
 
Only select statements will be issued by other RDBMS sessions against objects to be imported into while that import job is running.

   Note that the Data Pump specific import parameter value of DISABLE_APPEND_HINT for the parameter named DATA_OPTIONS is available only as of version 11.2.0.1 and higher of the Data Pump API.

   Note that the time saving benefit of using the DISABLE_APPEND_HINT value is only expected to be present if the wait times for locks to be released by other sessions are going to be significantly long.
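A sketch of such an import, with hypothetical names (the table pre-exists, hence TABLE_EXISTS_ACTION=APPEND):

impdp system/<password> DIRECTORY=DPDIR DUMPFILE=exp_%U.dmp TABLES=SCOTT.SMALL_TAB TABLE_EXISTS_ACTION=APPEND DATA_OPTIONS=DISABLE_APPEND_HINT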


ESTIMATE

The ESTIMATE parameter has two mutually exclusive possible values to choose from: BLOCKS and STATISTICS.
The BLOCKS approach to estimating the size of the data set being exported is expected to take much more time to complete than the STATISTICS approach.
Presumably, the BLOCKS approach to estimating the size of the data set is more accurate than the STATISTICS approach.
Prior to 11.2.0.4, or with unpatched 11.2.0.3 versions of the RDBMS, ESTIMATE=STATISTICS did perform the STATISTICS approach (which is expected to be much less time consuming than the BLOCKS approach), but due to a bug the BLOCKS approach was then executed in addition to the STATISTICS approach, even though only the STATISTICS approach had been chosen.
The end result was that choosing the STATISTICS approach actually took longer than choosing the BLOCKS approach, since with BLOCKS only the BLOCKS approach was performed.
As of 11.2.0.4, or with patched 11.2.0.3 versions of the RDBMS, this ESTIMATE=STATISTICS bug is fixed, so for faster Data Pump API based export operations I would advise using ESTIMATE=STATISTICS if the accuracy of the estimated size of the export dump file set is not a major concern.
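A sketch of an estimate-only run using the statistics based approach (hypothetical directory and schema names):

expdp system/<password> DIRECTORY=DPDIR LOGFILE=estimate.log SCHEMAS=SCOTT ESTIMATE=STATISTICS ESTIMATE_ONLY=YES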

ENCRYPTION=... / ENCRYPTION_ALGORITHM= ...

It is very important to take into consideration that the encrypt/decrypt operations are an overhead to the existing plans. Encryption and decryption are typically CPU intensive operations and would always require additional CPU resources. They are not generating specific wait events. The time needed to decrypt the data should not be compared to the time needed to execute a statement or read a block from disk.


EXCLUDE=COMMENT

In some cases, where the RDBMS client software used by the end users does not need to reference the comments on columns and object types, this information can be omitted. That reduces the amount of data to be processed by the Data Pump operation, which in turn reduces the time spent performing that operation.
If in doubt, then do not exclude COMMENT.


EXCLUDE=STATISTICS

If you are not using the mutually exclusive INCLUDE parameter, then excluding the generation and export of statistics at export time will shorten the time needed to perform any export operation. The DBMS_STATS.GATHER_DATABASE_STATS procedure can then be run at the target database once the import operation has completed.
Data Pump operations can hang, sometimes indefinitely, when concurrent statistics generation for a same table is being performed by both the Data Pump engine and any other RDBMS session.
For long running Data Pump operations (1 hour or more), consider disabling the database's automatic statistics gathering task.
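A sketch that combines the two EXCLUDE filters discussed above, with statistics gathered manually on the target after the import (hypothetical names):

expdp system/<password> DIRECTORY=DPDIR DUMPFILE=exp_%U.dmp SCHEMAS=SCOTT EXCLUDE=STATISTICS EXCLUDE=COMMENT

-- on the target database, once the import has completed
EXEC DBMS_STATS.GATHER_DATABASE_STATS;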


  To temporarily disable the 11g RDBMS automated statistics generating task so that the Data Pump operation does not encounter contention with this task ...

  As user SYS execute the following commands ...

  exec DBMS_AUTO_TASK_ADMIN.DISABLE(client_name => 'auto optimizer stats collection', operation => NULL, window_name => NULL);

  Then start the Data Pump operation and once the Data Pump operation has completed re-enable that task using the command below ...

  exec DBMS_AUTO_TASK_ADMIN.ENABLE(client_name => 'auto optimizer stats collection', operation => NULL, window_name => NULL);


  To temporarily disable the 10g RDBMS automated statistics generating task so that the Data Pump operation does not encounter contention with this task ...

  As user SYS execute the following commands ...

  exec sys.dbms_scheduler.disable ('GATHER_STATS_JOB');

  Then start the Data Pump operation and once the Data Pump operation has completed re-enable that task using the command below ...

  exec sys.dbms_scheduler.enable ('GATHER_STATS_JOB');


NETWORK_LINK=...

Performance wise, the use of this parameter further limits the effective degree of parallelism that can be used by the Data Pump API and unless your network throughput and network latency are better than a local drive, using NETWORK_LINK will be expected to be significantly slower than using export dump files for the purposes of moving data.

With respect to Data Pump API performance, because of its tendency to be significantly slower than a dump file based Data Pump API operation, I would only advise the use of NETWORK_LINK as a last resort. I would consider using a portable or shared drive to write dump files to before considering the use of NETWORK_LINK as a means of transferring data to and from databases.

   Note that the rules for distributed transactions apply when using a database link, so they apply to the use of NETWORK_LINK as well, which means that distributed transaction timeouts could be a concern for large data sets being moved.


PARALLEL=Number_of_CPUs

If there is more than one CPU available, the environment is not already CPU bound, disk I/O bound or memory bound, and multiple dump files are going to be used (ideally on different spindles) in the DUMPFILE parameter, then parallelism has the greatest potential of being used to positive effect, performance wise.
If the Data Pump parameter named PARALLEL is set to N where N > 1, then for best use of parallelism I would strongly recommend that the number of dump files defined by the DUMPFILE parameter be no less than N in quantity, as in the sketch below.
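For example, a sketch with hypothetical names, where the %U substitution variable lets each Worker write to its own file:

expdp system/<password> DIRECTORY=DPDIR DUMPFILE=exp_%U.dmp SCHEMAS=SCOTT PARALLEL=4 FILESIZE=5GB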

Please refer to
Note 365459.1 - Parallel Capabilities of Oracle Data Pump

NOTE:

The PARALLEL parameter is an upper limit on the number of concurrent Data Pump Worker processes that the Data Pump API based job can use, but the Data Pump API may choose to use a lower number of Data Pump Worker processes than the value assigned to that parameter.

Depending on which bottlenecks are in your host environment, it is possible that a Data Pump API based operation would be overall faster using a PARALLEL value that is less than the number of CPUs available to the Oracle RDBMS instance that is spawning the Data Pump Worker processes.

 
QUERY

The use of the QUERY parameter is expected to add significant overhead to any Data Pump API based operation.
The overhead is expected to be proportional to the number of rows in the table(s) queried.


REMAP_*

The use of any REMAP_* parameter is expected to add significant overhead to any Data Pump API based operation.
The overhead is expected to be proportional to the number of objects to be imported or exported.

NOTE:

REMAP_SCHEMA remaps only the ownership of an object for the CREATE object_type schema.object_name ... DDL section of a DDL statement. All other references to the old schema name within a DDL statement are left unchanged. Therefore, it is not unexpected that user defined PL/SQL code that explicitly references the old schema name within the body of the PL/SQL code will either fail to compile or may behave unexpectedly if the old schema name exists in the target database.

For PL/SQL routines, within the bodies of those routines the REMAP_* parameter will have no effect. All explicit references to the old name values of object types to be remapped will remain unchanged within a PL/SQL body.

At the Oracle RDBMS instance level:

_LM_SHARE_LOCK_RESTYPE

In 12.2 RAC environments, database lock code optimizations can result in degraded performance for Data Pump import operations for PL/SQL based objects such as functions, packages and procedures. This new (in 12.2) optimization can be disabled with the command: ALTER SYSTEM SET "_lm_share_lock_restype"='' SCOPE=SPFILE SID='*';


_MEMORY_BROKER_STAT_INTERVAL

If resize operations are a significant time consumer in your slow Data Pump environment, then setting this parameter will reduce the frequency of resize operations which in turn will reduce the overall amount of time that resize operations delay other operations over a given time span.
Data Pump operations are particularly vulnerable to being delayed because of frequent streams pool resize operations.
The default value for this parameter is 30 (30 seconds) and the maximum value for this parameter is 999 (999 seconds).
This is because the Data Pump API depends a lot on the streams feature of the Oracle RDBMS to help perform export and import operations.
I would recommend a value of 999 for this parameter if the parameter named STREAMS_POOL_SIZE is already explicitly set by the DBA (which gives the streams pool a minimum size) and frequent resize operations are occurring in your slow Data Pump environment.
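A sketch of setting it (underscore parameters must be double quoted in ALTER SYSTEM; I am assuming SCOPE=SPFILE plus an instance restart as the safe way to apply it):

ALTER SYSTEM SET "_memory_broker_stat_interval"=999 SCOPE=SPFILE SID='*';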


_OPTIMIZER_GATHER_STATS_ON_LOAD=TRUE

For import operations only, 12c adds this new parameter that can slow down import operations when left at its default setting of TRUE.
With 12c, if you intend on using EXCLUDE=STATISTICS for an import operation, then also set _OPTIMIZER_GATHER_STATS_ON_LOAD=FALSE at the database level.

   ALTER SYSTEM SET "_OPTIMIZER_GATHER_STATS_ON_LOAD"=FALSE;
   Perform the Data Pump based import operation with the EXCLUDE=STATISTICS setting.
   ALTER SYSTEM SET "_OPTIMIZER_GATHER_STATS_ON_LOAD"=TRUE;
   It would be recommended, given the above scenario, to manually gather database statistics after that import operation has completed successfully.
   EXEC DBMS_STATS.GATHER_DATABASE_STATS;


AQ_TM_PROCESSES=0

When set explicitly to a value of 0, it can negatively impact the speed of Advanced Queue operations, which in turn can negatively affect Data Pump engine based operations since Data Pump uses Advanced Queueing.
Either leave this parameter unset or set this parameter to a value that is greater than 0.
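A sketch of how to check whether the parameter was explicitly set to 0, and how to remove such an explicit setting from the spfile:

-- check the current setting and whether it is a default
SELECT value, isdefault FROM v$parameter WHERE name = 'aq_tm_processes';

-- remove an explicit aq_tm_processes=0 setting
ALTER SYSTEM RESET aq_tm_processes SCOPE=SPFILE SID='*';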


DEFERRED_SEGMENT_CREATION=TRUE

For import operations only, the time spent performing space allocation for empty tables is eliminated. There is no expected significant performance benefit from setting this parameter to TRUE from a Data Pump API based export operation perspective.
This parameter is most useful when creating tables for an 11.2.0.2 or higher versioned environment where it is likely that a significant number of tables to be created will never be used.

WARNING:

Setting this parameter to TRUE will break backward compatibility if you need to transfer table DDL metadata from an 11.2.0.2 or higher versioned Oracle RDBMS database to a pre-11.2.0.2 versioned Oracle RDBMS database, where this functionality was not supported!

Please refer to:
Note 1087325.1 - Error ORA-439 When Importing Tables Created With Enabled Deferred Segment Into Oracle 11g Standard Edition
Note 11812675.8 - Bug 11812675 - Corrupt .dmp file / IMP-9 from direct path "exp" of no-segment table (conventional export)
Unpublished bug 8833245  - EXPORT IS SILENTLY DISCARDING NON-MATERIALIZED TABLES IN SCHEMA/DB MODE

Pre-11.2.0.2 Classic engine based import utilities, if unpatched, will not properly handle the CREATE DDL of tables that use that functionality and will likely raise an unexpected error when importing an empty table that was created while DEFERRED_SEGMENT_CREATION=TRUE.

Note that the database initialization parameter named DEFERRED_SEGMENT_CREATION when set to TRUE is not supported for use by Oracle RDBMS versions below 11.2.0.2.

FILESYSTEMIO_OPTIONS=...

In the specific case where the RDBMS instance will be writing to an ACFS file system, given the nature of the types of write operations performed by the Data Pump API as part of an export operation, values other than NONE for this parameter can result in slow export operations.

Reference: Bug 18253632 "WRITING EXPDP TO ACFS IS 5X SLOWER THAN WRITING ASM DISKGROUP/LINUX NATIVE FS" which was closed as not a bug.


FIXED_DATE=...

Looking at note "Init.ora Parameter "FIXED_DATE" Reference Note" (Doc ID 30724.1) it looks to me like FIXED_DATE has never been completely supported by all features of the Oracle RDBMS, given that there are Oracle RDBMS feature compatibility bugs from 7.3.4.5 up to 11.2.

Whether they are called bugs or limitations, whether or not an RDBMS feature is fully compatible with the FIXED_DATE parameter being set or not appears to be on an RDBMS feature-by-feature basis.
In the case of FIXED_DATE and the Data Pump API or perhaps PL/SQL in general, the end result may or may not be a hang.

NLS_CHARACTERSET=... and NLS_NCHAR_CHARACTERSET=...

 Differences in character sets used by a source and a target database that is being imported into will result in character set conversion. That character set conversion does have an additional overhead that more or less maps to the proportion of character data being imported versus non-character data being imported.


NLS_COMP=... and NLS_SORT=...

In some rare cases, having both of the database initialization parameters set to a value of BINARY will result in a significantly faster Data Pump API based operation.
Whether or not these parameters being set to BINARY will help performance in your environment has to be tested in your environment as the root cause of performance increase has not been investigated.

The after logon trigger listed below sets these two above mentioned NLS parameters at the session level only for an RDBMS session that logged into the database instance via an RDBMS connection that was established by a Data Pump utility.

CREATE OR REPLACE TRIGGER sys.<NLS_TRIGGER_NAME> AFTER LOGON ON DATABASE
DECLARE
 V_MODULE VARCHAR2(60);
BEGIN
SELECT SYS_CONTEXT ('USERENV', 'MODULE') INTO V_MODULE FROM DUAL;
IF UPPER(V_MODULE) LIKE 'UDE%'
THEN
 BEGIN
 EXECUTE IMMEDIATE 'ALTER SESSION SET NLS_COMP=''BINARY''';
 EXECUTE IMMEDIATE 'ALTER SESSION SET NLS_SORT=''BINARY''';
 END;
END IF;
END;
/

NOTE:
Logon triggers are ignored by Data Pump for releases < 11.2.
Please refer to:
Bug 5724679 - LOGON TRIGGER NOT FIRED FOR DATA PUMP MASTER AND WORKER PROCESSES
A few interim patches are available (for 10.2.0.4 and 11.1.0.7).



PARALLEL_FORCE_LOCAL=TRUE

In a RAC environment it can improve the speed of Data Pump API based operations and avoid some Parallel DML bugs at the same time.

   Note that the database initialization parameter named PARALLEL_FORCE_LOCAL is available only as of version 11.2.0.2 and higher of the Oracle RDBMS software.
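A sketch of enabling it (the parameter is system modifiable, so no restart should be needed):

ALTER SYSTEM SET PARALLEL_FORCE_LOCAL=TRUE SCOPE=BOTH SID='*';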


STATISTICS_LEVEL=TYPICAL

Because the Data Pump API executes SQL queries, it depends on the database using good execution plans for those queries so as to minimize the time spent waiting for result sets. This setting enables, among other things, the automatic collection of Cost Based Optimizer statistics, which are used to generate execution plans for the queries issued by the API. Note that even with automatic Cost Based Optimizer statistics, the gathering is for the most part periodic in nature, and some types of transactions (table truncations for example) leave the Cost Based Optimizer statistics unchanged. Therefore, in some cases, because of out-of-date Cost Based Optimizer statistics, an export operation will be significantly slower to complete than it was in the past given a very similar data set. When troubleshooting a slow export operation, one step is to ensure that database, dictionary and fixed object statistics are all up to date prior to the next execution of the export operation that was previously much slower than usual.

 EXEC DBMS_STATS.GATHER_DATABASE_STATS;
 EXEC DBMS_STATS.GATHER_DICTIONARY_STATS;
 
 Additional objects that can affect Data Pump performance but which are not currently handled by the GATHER_DICTIONARY_STATS routine.
 
 EXEC DBMS_STATS.GATHER_INDEX_STATS('SYS','I_FILE#_BLOCK#');
 EXEC DBMS_STATS.GATHER_INDEX_STATS('SYS','I_OBJ2');
 EXEC DBMS_STATS.GATHER_INDEX_STATS('SYS','I_DEPENDENCY2');
 EXEC DBMS_STATS.GATHER_INDEX_STATS('SYS','I_OBJ1');
 EXEC DBMS_STATS.GATHER_INDEX_STATS('SYS','I_OBJ#');

 Only as a last resort should the fixed object statistics be updated.

 EXEC DBMS_STATS.GATHER_FIXED_OBJECTS_STATS;


STREAMS_POOL_SIZE

A minimum of 256MB-350MB for the streams pool should be guaranteed at startup, so that an initial pool is already allocated; otherwise a significant delay might be seen when using Data Pump.
Define a minimum streams_pool_size value at startup, e.g. set streams_pool_size=256M in the pfile/spfile.
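A sketch of setting that floor on a running instance (with automatic SGA management this acts as a minimum rather than a fixed size):

ALTER SYSTEM SET STREAMS_POOL_SIZE=256M SCOPE=BOTH SID='*';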
 

At the table DDL level:

The use of DEDUPLICATED LOB columns and Data Pump's PARALLEL parameter with a value greater than 1

Currently, as of 12.2, the use of the deduplication feature of LOB columns has a limitation on the use of parallelism to access LOB contents which results in a high risk of a deadlock or a hang while the Data Pump API is processing tables using that feature, if the Data Pump API's PARALLEL parameter is set to a value greater than 1.
Documentation bug 26321833 "DOCUMENT DEDUPLICATED-LOB RESTRICTIONS" which is not published.


NETWORK_LINK + SECUREFILES

When the NETWORK_LINK parameter is used to move tables with LOB columns that are configured to use SECUREFILES, the transfer of that LOB data is extremely slow compared to exporting the LOB-containing tables to a file, and it has been observed that using NETWORK_LINK to move tables with LOB columns defined to use SecureFiles generates a lot of undo data.
Part of the cause of the difference in speed, as of 11.2.0.3, is that for distributed transactions allocation requests are limited to one block at a time over a database link, which means more round trips to get larger data sets transferred.

SECUREFILES (Without NETWORK_LINK)

The storage of LOB column data using the SecureFiles storage format does not allow tables with LOB columns to be exported or imported using parallelism for RDBMS versions below 12.2.
The storage of LOB column data using the BasicFiles storage format does not allow tables with LOB columns to be exported or imported using parallelism.

At the table DML level:

Contention between the Data Pump operation and another RDBMS session for access to RDBMS objects (Typically for locks on table row data).

The Data Pump engine during an export operation will wait for most types of locks on rows and tables that are held by other RDBMS sessions to be released before it will proceed to export or import rows from the associated table.
The Data Pump engine during an export operation does wait for some types of locks on rows and tables that are held by other RDBMS sessions that an export performed by the classic export utility did not wait for.
Therefore exporting a table that is being actively updated will be expected to be slower to do than exporting that same table when it is not being updated.

Things that a Data Pump operation has no control over which affect the speed of Data Pump operations:

  • Compilation speed of Java Stored Procedures during an import operation.
  • Compilation speed of PL/SQL packages during an import operation.
  • Constraint creation speed (Foreign key and Primary key) during an import operation.
  • Index creation speed during an import operation.
  • The speed at which the creation DDL of an object is processed if that object's creation DDL has dependencies on remote objects.
  • An undersized SGA.
  • Any lack of operating system or hardware resources not due to an excessive use of said resources by the Data Pump operation.
  • Locks already held by other RDBMS sessions on RDBMS objects that the Data Pump operation also wants to have access to.
  • If using the NETWORK_LINK parameter, the Data Pump API has no control over the distributed transaction timeout value.

----------------------------------------

SYMPTOMS

Partitioned tables have been created prior to the Import with IMPDP and the PARALLEL option is being used but only one of the workers is active and the others are idle resulting in a slow import compared to the export time.

The tables involved will show in the trace:

INSERT /*+ APPEND PARALLEL */


Using "data_options=disable_append_hint" does not change the "Append" hint being used.
 

CAUSE

IMPDP is working as designed: in order to use PARALLEL, let IMPDP create the partitioned tables itself.
 
According to Oracle documentation
PARTITION_OPTIONS

"Parallel processing during import of partitioned tables is subject to the following:

  If a partitioned table is imported into an existing partitioned table, then Data Pump only processes one partition or subpartition at a time, regardless of any value that might be specified with the PARALLEL parameter.

  If the table into which you are importing does not already exist and Data Pump has to create it, then the import runs in parallel up to the parallelism specified on the PARALLEL parameter when the import is started."

SOLUTION

Let IMPDP create the Partitioned Tables (rather than creating them before the import) if you wish to use the Parallel option with IMPDP to speed up the Import process.
 

Or

Use the following parameter:

         DATA_OPTIONS = TRUST_EXISTING_TABLE_PARTITIONS

This flag allows the data to be loaded in parallel.
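A sketch of an import into pre-created partitioned tables using that flag (hypothetical names; the TRUST_EXISTING_TABLE_PARTITIONS value of DATA_OPTIONS is only available in newer releases, 12.2 onwards as far as I know):

impdp system/<password> DIRECTORY=DPDIR DUMPFILE=exp_%U.dmp TABLES=SCOTT.PART_TAB TABLE_EXISTS_ACTION=APPEND PARALLEL=8 DATA_OPTIONS=TRUST_EXISTING_TABLE_PARTITIONS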

APPLIES TO:

Oracle Database - Enterprise Edition - Version 10.2.0.4 to 11.2.0.1 [Release 10.2 to 11.2]
Oracle Database Cloud Schema Service - Version N/A and later
Oracle Database Exadata Cloud Machine - Version N/A and later
Oracle Cloud Infrastructure - Database Service - Version N/A and later
Oracle Database Cloud Exadata Service - Version N/A and later
Information in this document applies to any platform.

SYMPTOMS

When running IMPDP against an existing partitioned table with the TABLE_EXISTS_ACTION command line parameter set to TRUNCATE or APPEND and also using parallelism, the worker processes seem to be idle and waiting for the following event:

enq: TM - contention


The problem does not occur when the table does not yet exist in the database, or when the IMPDP operation is run in serial mode.

CHANGES

CAUSE

The cause of this problem has been identified in Bug 7197735, closed as duplicate of unpublished Bug 8448327. It is caused by multiple Datapump worker processes being assigned to a single partition, causing one worker process to be able to write to the partition, and the other worker processes waiting to get an exclusive lock on the same partition.

SOLUTION

This issue has been fixed in the Oracle 12c release and in the 11.2.0.2 patch set.

Workarounds for the other Oracle versions are:

  1. apply Patch 8448327 (for Generic platform) if available for your Oracle version

    - OR -
     
  2. run IMPDP in serial mode (so not using the PARALLEL command line parameter)

    - OR -
     
  3. drop the table prior to running IMPDP

SYMPTOMS

When importing partitioned tables into existing partitioned tables with PARALLEL > 1 there will be Worker processes waiting for an exclusive table lock which is indicated by event "enq: TM - contention".

To confirm the issue, execute the following queries while the impdp client session reports:

Processing object type TABLE_EXPORT/TABLE/TABLE_DATA

-- DW process which is holding the exclusive lock
select a.sid, a.type, a.id1, b.program
from   v$lock a, v$session b
where  a.sid = b.sid and
       a.type = 'TM' and
       a.lmode = 6 and
       b.program like '%DW%';

-- DW process(es) requesting an exclusive lock
select a.sid, a.type, a.id1, b.program, b.event
from   v$lock a, v$session b
where  a.sid = b.sid and
       a.type = 'TM' and
       a.request = 6 and
       b.program like '%DW%';

-- Affected table
select c.owner, c.object_name, b.program, b.event
from   v$lock a, v$session b, dba_objects c
where  a.sid = b.sid and
       a.type = 'TM' and
       a.id1 = c.object_id and
       a.request = 6 and
       b.program like '%DW%';

NOTE: In the testcase content below, the user information and metadata used represents fictitious data. Any similarity to actual persons, living or dead, is purely coincidental and not intended in any manner.

The following example will illustrate the scenario:

impdp tc1/****** DIRECTORY=test_dp DUMPFILE=test%U.dmp tables=tc1.t1 TABLE_EXISTS_ACTION=APPEND PARALLEL=10

Or:

impdp tc1/****** DIRECTORY=test_dp DUMPFILE=test%U.dmp tables=tc1.t1 TABLE_EXISTS_ACTION=TRUNCATE PARALLEL=10

-- There will be a single DW process which is holding an exclusive lock on the table

-- DW process which is holding the exclusive lock
select a.sid, a.type, a.id1, b.program
from   v$lock a, v$session b
where  a.sid = b.sid and
       a.type = 'TM' and
       a.lmode = 6 and
       b.program like '%DW%';

  SID TY ID1        PROGRAM
----- -- ---------- -------------------------
  115 TM     282361 ORACLE.EXE (DW02)

-- Several other worker processes are waiting until the exclusive lock is released.

-- DW process(es) requesting an exclusive lock
select a.sid, a.type, a.id1, b.program, b.event
from   v$lock a, v$session b
where  a.sid = b.sid and
       a.type = 'TM' and
       a.request = 6 and
       b.program like '%DW%';

  SID TY ID1        PROGRAM                EVENT
----- -- ---------- ---------------------- ------------------------
  116 TM     282361 ORACLE.EXE (DW07)      enq: TM - contention
  136 TM     282361 ORACLE.EXE (DW03)      enq: TM - contention
  138 TM     282361 ORACLE.EXE (DW09)      enq: TM - contention
  152 TM     282361 ORACLE.EXE (DW04)      enq: TM - contention

-- The query will report the affected table
select c.owner, c.object_name, b.program, b.event
from   v$lock a, v$session b, dba_objects c
where  a.sid = b.sid and
       a.type = 'TM' and
       a.id1 = c.object_id and
       a.request = 6 and
       b.program like '%DW%';

OWNER OBJECT_NAME          PROGRAM                 EVENT
----- -------------------- ----------------------- ---------------------------
TC1   T1                   ORACLE.EXE (DW04)       enq: TM - contention
TC1   T1                   ORACLE.EXE (DW09)       enq: TM - contention
TC1   T1                   ORACLE.EXE (DW03)       enq: TM - contention
TC1   T1                   ORACLE.EXE (DW07)       enq: TM - contention

Note: A TM table lock is usually acquired during the execution of a transaction when referencing a table with a DML statement to prevent the object from being dropped or altered during the execution of the transaction.

CHANGES

This issue occurs if the following is true:

- Imported table is a partitioned table
- Data Pump Import (Impdp) uses parameter PARALLEL > 1 and TABLE_EXISTS_ACTION=APPEND or TRUNCATE

CAUSE

This is produced by unpublished Bug 8448327.

SOLUTION

The issue is fixed in release 11.2.0.2 and 12.1. There is no workaround.

Please check, if patch 8448327 is available for your RDBMS version and platform.

Important

Please read the README file delivered with the patch. There are post-install steps to be completed, otherwise the fix won't work.

----------------

SYMPTOMS

  Extremely slow performance is observed while running an Import DataPump (IMPDP) job against a highly partitioned table with 143 partitions. The import parameter file looks like:

TABLES=<SCHEMA_NAME>.<TABLE_NAME>:<PARTITION_NAME>
PARTITION_OPTIONS=DEPARTITION
REMAP_TABLE=<SCHEMA_NAME>.<TABLE_NAME>:<PARTITION_NAME>:<TABLE_NAME>
CONTENT=ALL
SKIP_UNUSABLE_INDEXES=YES
DIRECTORY=EXPDP
DUMPFILE=<DUMP_NAME>_%U.dmp
LOGFILE=<LOG_NAME>.log
PARALLEL=8
DATA_OPTIONS=skip_constraint_errors

 
  The SQL found in DW00 generated trace file has the following form:

INSERT /*+ APPEND PARALLEL("<table name>",<parallel degree>)+*/ INTO RELATIONAL("<schema name>"."<table name>") ...PARTITION <partition_name>
FROM "ET$...." KU$


  The DataPump job was loading only approx 5% (or approx 5GB) per hour, on a 700GB table. The performance issue is introduced by the fact that the DML worker processes are not running in parallel even after setting up PARALLEL DML and PARALLEL=8.
Even though the parallel slaves are being launched:

PROGRAM    COMMAND    Username    QC/Slave    Slave Set    SID    SERIAL#    QC      SID    REQ_DOP    ACT_DOP
============================================================================================================
oracle <SERVICE_NAME> (P000)    2     - p000    (Slave)    1    787    10351    1044    8    8
oracle <SERVICE_NAME> (P001)    2     - p001    (Slave)    1    853    1203    1044    8    8
oracle <SERVICE_NAME> (P002)    2     - p002    (Slave)    1    915    50799    1044    8    8
oracle <SERVICE_NAME> (P003)    2     - p003    (Slave)    1    981    11441    1044    8    8
oracle <SERVICE_NAME> (DW00)    2       SYS     QC        1044    56901    1044            
oracle <SERVICE_NAME> (P000)    2     - p000    (Slave)    1    788    12009    1044    8    8
oracle <SERVICE_NAME> (P001)    2     - p001    (Slave)    1    850    50447    1044    8    8
oracle <SERVICE_NAME> (P002)    2     - p002    (Slave)    1    917    9857    1044    8    8
oracle <SERVICE_NAME> (P003)    2     - p003    (Slave)    1    980    39721    1044    8    8


the problem is that PARALLEL DML is disabled, thus all of those parallel slaves are doing nothing on the INSERT. Below you can very clearly see that the PDML_STATUS is DISABLED:

select * from gv$session;  -- check the PDML_STATUS column

INST_ID LOGON_TIME USERNAME STATUS PDML_STATUS PROGRAM SQL_ID
===================================================================================================
1 11/4/2014 16:17:03 A11P11 SYS ACTIVE ENABLED udi@<SERVICE_NAME> (TNS V1-V3) 7wn3wubg7gjds
2 11/4/2014 16:17:01 A11P11 SYS ACTIVE DISABLED oracle@<SERVICE_NAME> (P001) asmy11x7ya7ws
2 11/4/2014 16:17:01 A11P11 SYS ACTIVE DISABLED oracle@<SERVICE_NAME> (P000) asmy11x7ya7ws
2 11/4/2014 16:17:01 A11P11 SYS ACTIVE DISABLED oracle@<SERVICE_NAME> (P003) asmy11x7ya7ws
2 11/4/2014 16:17:01 A11P11 SYS ACTIVE DISABLED oracle@<SERVICE_NAME> (P002) asmy11x7ya7ws
1 11/4/2014 16:17:04 A11P11 SYS ACTIVE ENABLED oracle@<SERVICE_NAME> (DW00) asmy11x7ya7ws
1 11/4/2014 16:17:07 A11P11 SYS ACTIVE DISABLED oracle@<SERVICE_NAME> (P001) asmy11x7ya7ws
1 11/4/2014 16:17:07 A11P11 SYS ACTIVE DISABLED oracle@<SERVICE_NAME> (P000) asmy11x7ya7ws
1 11/4/2014 16:17:07 A11P11 SYS ACTIVE DISABLED oracle@<SERVICE_NAME> (P003) asmy11x7ya7ws
1 11/4/2014 16:17:07 A11P11 SYS ACTIVE DISABLED oracle@<SERVICE_NAME> (P002) asmy11x7ya7ws
1 11/4/2014 16:17:04 A11P11 SYS ACTIVE ENABLED oracle@<SERVICE_NAME> (DM00) bjf05cwcj5s6p
 

 
  As per the worker trace, it is confirmed that the insert operation is loading data serially:

This is due to the following two plan steps; the SELECT part, however, does run in parallel:

---------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name            | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
---------------------------------------------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT                 |                 |  8168 |  4817K|     4   (0)| 00:00:01 |        |      |            |
|   1 |  LOAD AS SELECT                  | <NAME>     |       |       |            |          |        |      |            |
|   2 |   ERROR LOGGING                  | <NAME>     |       |       |            |          |        |      |            |
|   3 |    PX COORDINATOR                |                 |       |       |            |          |        |      |            |
|   4 |     PX SEND QC (RANDOM)          | :TQ10000        |  8168 |  4817K|     4   (0)| 00:00:01 |  Q1,00 | P->S | QC (RAND)  |
|   5 |      ERROR LOGGING QC (RANDOM)   | :TQ10000        |       |       |     4   (0)| 00:00:01 |  Q1,00 | PCWP | QC (RAND)  |
|   6 |       PX BLOCK ITERATOR          |                 |  8168 |  4817K|     4   (0)| 00:00:01 |  Q1,00 | PCWC |            |
|   7 |        EXTERNAL TABLE ACCESS FULL| ET$03D4837F0001 |  8168 |  4817K|     4   (0)| 00:00:01 |  Q1,00 | PCWP |            |
---------------------------------------------------------------------------------------------------------------------------------

 
and the query coordinator(QC) is running alter session to enable the parallel dml:

KUPW:10:47:01.266: 0: ALTER SESSION ENABLE PARALLEL DML called.
KUPW:10:47:01.266: 0: ALTER SESSION ENABLE PARALLEL DML returned.


but as per PQ tracing the automatic degree of parallelism is disabled:

.....
  kkfdtParallel: parallel is possible (no statement type restrictions)
  kkfdPaPrm: use table level hint DOP(8) on table
  kkfdPaPrm:- The table : 154053
  kkfdPaPrm:DOP = 8 (computed from hint/dictionary/autodop)
  kkfdiPaPrm: dop:8 parallel(?)
          PDML disabled in session
  kkfdtpddldml: isParallel ? No

 
This information shows that the insert is being serialized as a result of PARALLEL DML being disabled and the performance issue is introduced by this behavior.

CHANGES

CAUSE

 This is an impdp performance issue where an import with the external table access method is not running the job in parallel even after starting the PQ slaves correctly.
The performance issue is only hit when running the job as a SYSDBA user. The root cause of this issue has been determined in Bug 19520061 - IMPDP: EXTREMELY SLOW IMPORT FOR A PARTITIONED TABLE.

SOLUTION

1. Upgrade the database to 12.2, when available, where the issue is fixed.

- OR -

2. For earlier database releases please check Patch 19520061, if available for your platform and RDBMS version.

- OR -

3. Run the DataPump import job with a user other than SYS.

---------------- ACCESS_METHOD = Direct Path mode: not effective for large tables because it cannot run in parallel

Applies to:

Oracle Database - Enterprise Edition - Version 10.1.0.2 to 12.2.0.1 [Release 10.1 to 12.2]
Oracle Database Cloud Schema Service - Version N/A and later
Oracle Database Exadata Express Cloud Service - Version N/A and later
Gen 1 Exadata Cloud at Customer (Oracle Exadata Database Cloud Machine) - Version N/A and later
Oracle Cloud Infrastructure - Database Service - Version N/A and later
Information in this document applies to any platform.

Goal

Starting with Oracle10g, Oracle Data Pump can be used to move data in and out of a database. Data Pump can make use of different methods to move the data, and will automatically choose the fastest method. It is possible though, to manually enforce a specific method. This document demonstrates how to specify the method with which data will be loaded or unloaded with Data Pump.

Solution

1. Introduction.

Data Pump can use four mechanisms to move data in and out of a database:

  • Data file copying;
  • Direct path;
  • External tables;
  • Network link import.

The two most commonly used methods to move data in and out of databases with Data Pump are the "Direct Path" method and the "External Tables" method.

1.1. Direct Path mode.
After data file copying, direct path is the fastest method of moving data. In this method, the SQL layer of the database is bypassed and rows are moved to and from the dump file with only minimal interpretation. Data Pump automatically uses the direct path method for loading and unloading data when the structure of a table allows it.

1.2. External Tables mode.
If data cannot be moved in direct path mode, or if there is a situation where parallel SQL can be used to speed up the data move even more, then the external tables mode is used. The external table mechanism creates an external table that maps the dump file data for the database table. The SQL engine is then used to move the data. If possible, the APPEND hint is used on import to speed the copying of the data into the database.
Note: When the Export NETWORK_LINK parameter is used to specify a network link for an export operation, a variant of the external tables method is used. In this case, data is selected from across the specified network link and inserted into the dump file using an external table.

1.3. Data File Copying mode.
This mode is used when a transport tablespace job is started, i.e.: the TRANSPORT_TABLESPACES parameter is specified for an Export Data Pump job. This is the fastest method of moving data because the data is not interpreted nor altered during the job, and Export Data Pump is used to unload only structural information (metadata) into the dump file.

1.4. Network Link Import mode.
This mode is used when the NETWORK_LINK parameter is specified during an Import Data Pump job. This is the slowest of the four access methods because this method makes use of an INSERT SELECT statement to move the data over a database link, and reading over a network is generally slower than reading from a disk.

The "Data File Copying" and "Network Link Import" methods to move data in and out of databases are outside the scope of this article, and therefore not discussed any further.

For details about the access methods of the classic export client (exp), see:
Note:155477.1 "Parameter DIRECT: Conventional Path Export Versus Direct Path Export"

2. Export Data Pump: unloading data in "Direct Path" mode.

Export Data Pump will use the "Direct Path" mode to unload data in the following situations:

EXPDP will use DIRECT_PATH mode if:

2.1. The structure of a table allows a Direct Path unload, i.e.:
     - The table does not have fine-grained access control enabled for SELECT.
     - The table is not a queue table.
     - The table does not contain one or more columns of type BFILE or opaque, or an object type containing opaque columns.
     - The table does not contain encrypted columns.
     - The table does not contain a column of an evolved type that needs upgrading.
     - If the table has a column of datatype LONG or LONG RAW, then this column is the last column.

2.2. The parameters QUERY, SAMPLE, or REMAP_DATA parameter were not used for the specified table in the Export Data Pump job.

2.3. The table or partition is relatively small (up to 250 Mb), or the table or partition is larger, but the job cannot run in parallel because the parameter PARALLEL was not specified (or was set to 1).

Note that with an unload of data in Direct Path mode, parallel I/O execution processes (PX processes) cannot be used to unload the data in parallel (parallel unload is not supported in Direct Path mode).

3. Export Data Pump: unloading data in "External Tables" mode.

Export Data Pump will use the "External Tables" mode to unload data in the following situations:

EXPDP will use EXTERNAL_TABLE mode if:

3.1. Data cannot be unloaded in Direct Path mode, because of the structure of the table, i.e.: 
     - Fine-grained access control for SELECT is enabled for the table.
     - The table is a queue table.
     - The table contains one or more columns of type BFILE or opaque, or an object type containing opaque columns.
     - The table contains encrypted columns.
     - The table contains a column of an evolved type that needs upgrading.
     - The table contains a column of type LONG or LONG RAW that is not last.

3.2. Data could also have been unloaded in "Direct Path" mode, but the parameters QUERY, SAMPLE, or REMAP_DATA were used for the specified table in the Export Data Pump job.

3.3. Data could also have been unloaded in "Direct Path" mode, but the table or partition is relatively large (> 250 Mb) and parallel SQL can be used to speed up the unload even more.

Note that with an unload of data in External Tables mode, parallel I/O execution Processes (PX processes) can be used to unload the data in parallel. In that case the Data Pump Worker process acts as the coordinator for the PX processes. However, this does not apply when the table has a LOB column: in that case the table parallelism will always be 1. This restriction has been lifted in Oracle12c 12.1.0.1. See also:
Bug:5943346 "PRODUCT ENHANCEMENT: PARALLELISM OF DATAPUMP JOB ON TABLE WITH LOB COLUMN"
Bug:12866324 "34708: DATA PUMP PREVENTS PARALLEL OPTS AND DPAPI LOADS THAT SHOULD BE ALLOWED" (not a public bug), fixed in 12.1.0.1 (cannot be backported to earlier releases)

4. Import Data Pump: loading data in "Direct Path" mode.

Import Data Pump will use the "Direct Path" mode to load data in the following situations:

IMPDP will use DIRECT_PATH if:

4.1. The structure of a table allows a Direct Path load, i.e.:
     - A global index does not exist on a multipartition table during a single-partition load. This includes object tables that are partitioned.
     - A domain index does not exist for a LOB column.
     - The table is not in a cluster.
     - The table does not have BFILE columns or columns of opaque types.
     - The table does not have VARRAY columns with an embedded opaque type.
     - The table does not have encrypted columns.
     - Supplemental logging is not enabled or supplemental logging is enabled and the table does not have a LOB column.
     - The table into which data is being imported is a pre-existing table and:
        – There is not an active trigger, and:
        – The table is not partitioned, and:
        – Fine-grained access control for INSERT mode is not enabled, and:
        – A constraint other than table check does not exist, and:
        – A unique index does not exist.

4.2. The parameters QUERY and REMAP_DATA were not used for the specified table in the Import Data Pump job.

4.3. The table or partition is relatively small (up to 250 Mb), or the table or partition is larger, but the job cannot run in parallel because the parameter PARALLEL was not specified (or was set to 1).

5. Import Data Pump: loading data in "External Tables" mode.

Import Data Pump will use the "External Tables" mode to load data in the following situations:

IMPDP will use EXTERNAL_TABLE if:

5.1. Data cannot be loaded in Direct Path mode, because at least one of the following conditions exists:
     - A global index on multipartition tables exists during a single-partition load. This includes object tables that are partitioned.
     - A domain index exists for a LOB column.
     - A table is in a cluster.
     - A table has BFILE columns or columns of opaque types.
     - A table has VARRAY columns with an embedded opaque type.
     - The table has encrypted columns.
     - Supplemental logging is enabled and the table has at least one LOB column.
     - The table into which data is being imported is a preexisting table and at least one of the following conditions exists:
       - There is an active trigger
       - The table is partitioned
       - Fine-grained access control is in insert mode
       - A referential integrity constraint exists
       - A unique index exists

5.2. Data could also have been loaded in "Direct Path" mode, but the parameters QUERY, or REMAP_DATA were used for the specified table in the Import Data Pump job.

5.3. Data could also have been loaded in "Direct Path" mode, but the table or partition is relatively large (> 250 Mb) and parallel SQL can be used to speed up the load even more.

Note that with a load of data in External Tables mode, parallel I/O execution Processes (PX processes) can be used to load the data in parallel. In that case the Data Pump Worker process acts as the coordinator for the PX processes. However, this does not apply when the table has a LOB column: in that case the table parallelism will always be 1. This restriction has been lifted in Oracle12c 12.1.0.1. See also:
Bug:5943346 "PRODUCT ENHANCEMENT: PARALLELISM OF DATAPUMP JOB ON TABLE WITH LOB COLUMN"
Bug:12866324 "34708: DATA PUMP PREVENTS PARALLEL OPTS AND DPAPI LOADS THAT SHOULD BE ALLOWED" (not a public bug), fixed in 12.1.0.1 (cannot be backported to earlier releases)

6. How to enforce a specific load/unload method ?

In very specific situations, the parameter ACCESS_METHOD can be used to enforce a specific method to unload or load the data. Example:

%expdp system/<password> ... ACCESS_METHOD=DIRECT_PATH 
%expdp system/<password> ... ACCESS_METHOD=EXTERNAL_TABLE 

or:

%impdp system/<password> ... ACCESS_METHOD=DIRECT_PATH 
%impdp system/<password> ... ACCESS_METHOD=EXTERNAL_TABLE 

Important Need-To-Know's when the parameter ACCESS_METHOD is specified for a job:

  • In Oracle 11gR1 and before, the parameter ACCESS_METHOD should only be used when requested by Oracle Support.
    Note:
    With Oracle 11gR2, the parameter ACCESS_METHOD has been documented.
      Please refer to Oracle® Database Utilities
      11g Release 2 (11.2)
      E22490-05
  • If the parameter is not specified, then Data Pump will automatically choose the best method to load or unload the data.
  • If import Data Pump cannot choose due to conflicting restrictions, an error will be reported:
    ORA-31696: unable to export/import TABLE_DATA:"<SCHEMA_NAME>"."<TABLE_NAME>" using client specified AUTOMATIC method
  • The parameter can only be specified when the Data Pump job is initially started (i.e. the parameter cannot be specified when the job is restarted).
  • If the parameter is specified, the method of loading or unloading the data is enforced on all tables that need to be loaded or unloaded with the job.
  • Enforcing a specific method may result in a slower performance of the overall Data Pump job, or errors such as:

...
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
ORA-31696: unable to export/import TABLE_DATA:"<SCHEMA_NAME>"."<TABLE_NAME>" using client specified DIRECT_PATH method
...

  • To determine which access method is used, a Worker trace file can be created, e.g.:

%expdp system/<PASSWORD> DIRECTORY=my_dir \
DUMPFILE=expdp_s.dmp LOGFILE=expdp_s.log \
TABLES="<SCHEMA_NAME>"."<TABLE_NAME>" TRACE=400300

The Worker trace file shows the method with which the data was loaded (or unloaded for Import Data Pump):

...
KUPW:14:57:14.289: 1: object: TABLE_DATA:"<SCHEMA_NAME>"."<TABLE_NAME>"
KUPW:14:57:14.289: 1: TABLE_DATA:"<SCHEMA_NAME>"."<TABLE_NAME>" external table, parallel: 1
...

For details, see also:
Note:286496.1 " Export/Import DataPump Parameter TRACE - How to Diagnose Oracle Data Pump"
