Calling sqlldr from Python: concurrent sqlldr loads

Latest recommended article published 2022-05-22 16:40:04

weixin_39797381


Article link: https://blog.csdn.net/weixin_39797381/article/details/111422697


Usage: SQLLDR keyword=value [,keyword=value,...]

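Selected keywords:

userid -- ORACLE username/password

control -- control file

log -- log file

bad -- bad file (rejected records)

data -- data file

discard -- discard file

discardmax -- maximum number of discarded records allowed (default: all)

skip -- number of logical records to skip (default: 0)

load -- number of logical records to load (default: all)

errors -- number of error records allowed (default: 50)

rows -- number of records per commit (default: 64 for conventional path, all records for direct path, which is one reason direct path loads are so much faster than conventional ones)

bindsize -- size in bytes of the buffer used for each commit (default: 256000)

silent -- suppress output messages (header,feedback,errors,discards,partitions)

direct -- load using the direct path (default: FALSE)

parfile -- parameter file: name of file that contains parameter specifications

parallel -- parallel load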
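Since every option follows the same keyword=value form, a Python caller can build the command line from a dictionary of options. The following is a minimal sketch only: `build_sqlldr_args` is a hypothetical helper name, and the file names are placeholders.

```python
def build_sqlldr_args(**params):
    """Turn keyword arguments into a sqlldr argument list (keyword=value)."""
    args = ["sqlldr"]
    for key, value in params.items():
        if value is None:
            # Leave unset options out so sqlldr falls back to its defaults.
            continue
        if isinstance(value, bool):
            # sqlldr expects true/false rather than Python's True/False.
            value = "true" if value else "false"
        args.append(f"{key}={value}")
    return args


args = build_sqlldr_args(
    userid="scott/tiger",   # placeholder credentials
    control="result.ctl",   # placeholder control file
    log="result.log",
    direct=True,
    errors=50,
)
print(" ".join(args))
# sqlldr userid=scott/tiger control=result.ctl log=result.log direct=true errors=50
```

The resulting list can be handed to `subprocess.run(args)` without going through a shell.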

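1) ROWS defaults to 64. Set ROWS to a value that suits your data to control how many records go into each commit. (Ever run a large batch of individual insert statements in one go in PL/SQL Developer?)

2) A conventional load inserts the data with INSERT statements. A direct load (DIRECT=TRUE) bypasses that database logic and writes the data straight into the data files, which improves load performance. It cannot be used in every case, though: for example, duplicate primary key values leave the index in the UNUSABLE state.

3) The UNRECOVERABLE option turns off redo logging for the load (whether alter table table1 nologging is also needed is an open question). This option can only be used together with direct.

4) Very large data files call for concurrent loading, that is, running several load jobs at the same time: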

sqlldr userid=/ control=result1.ctl direct=true parallel=true

sqlldr userid=/ control=result2.ctl direct=true parallel=true
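From Python, the two commands above can be started at the same time with a thread pool, where each thread simply waits for its own sqlldr process. This is a sketch under assumptions: sqlldr is on the PATH, the control files exist, and the function names are hypothetical. The `runner` parameter exists only so that the process launch can be swapped out in tests.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def sqlldr_command(control_file):
    """Build the same command line as the shell examples above."""
    return ["sqlldr", "userid=/", f"control={control_file}",
            "direct=true", "parallel=true"]


def run_load(control_file, runner=subprocess.run):
    """Run one sqlldr session and return (control_file, exit_code)."""
    result = runner(sqlldr_command(control_file))
    return control_file, result.returncode


def run_loads_concurrently(control_files, max_workers=2, runner=subprocess.run):
    """Start one sqlldr session per control file, at most max_workers at a time."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map keeps the results in the same order as control_files.
        return list(pool.map(lambda c: run_load(c, runner), control_files))
```

Calling `run_loads_concurrently(["result1.ctl", "result2.ctl"])` returns one `(control_file, exit_code)` pair per session. Threads are enough here because the actual work happens in the sqlldr processes, not in Python.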

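When loading a large amount of data (roughly more than 10 GB), it is best to suppress logging:

SQL>ALTER TABLE RESULTXT nologging;

This keeps the load from generating REDO LOG and improves throughput. Then add the line unrecoverable above load data in the CONTROL file; this option must be used together with DIRECT.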

Maximizing SQL*Loader Performance

SQL*Loader is flexible and offers many options that should be considered to maximize the speed of data loads. These include many permutations of the SQL*Loader control file parameters:

OPTIONS (DIRECT=TRUE, ERRORS=50, rows=500000)

UNRECOVERABLE LOAD DATA

- Use Direct Path Loads. The conventional path loader essentially loads the data by using standard insert statements. The direct path loader (direct=true) loads directly into the Oracle data files and creates blocks in Oracle database block format. To prepare the database for direct path loads, the script $ORACLE_HOME/rdbms/admin/catldr.sql must be executed.

- Disable Indexes and Constraints. For conventional data loads only, the disabling of indexes and constraints can greatly enhance the performance of SQL*Loader. The skip_index_maintenance SQL*Loader parameter allows you to bypass index maintenance when performing parallel data loads into Oracle, but only when using the sqlldr direct=y direct load option.

According to Dave Moore in his book 'Oracle Utilities', using skip_index_maintenance=true means 'don't rebuild indexes', and it will greatly speed up sqlldr data loads when using parallel processes with sqlldr.

Also, according to Oracle expert Jonathan Gennick, the skip_index_maintenance SQL*Loader parameter "controls whether or not index maintenance is done for a direct path load. This parameter does not apply to conventional path loads. A value of TRUE causes index maintenance to be skipped."

- Use a Larger Bind Array. For conventional data loads only, larger bind arrays limit the number of calls to the database and increase performance. The size of the bind array is specified using the bindsize parameter. The bind array's size is equivalent to the number of rows it contains (rows=) times the maximum length of each row.

- Increase the Input Data Buffer. The sqlldr readsize parameter determines the input data buffer size used by SQL*Loader.

- Use ROWS=n to Commit Less Frequently. For conventional data loads only, rows specifies the number of rows per commit. Issuing fewer commits will enhance performance.

- Use Parallel Loads. Available with direct path data loads only, this option allows multiple SQL*Loader jobs to execute concurrently. Note: You must be on an SMP server (cpu_count > 2 at least) to successfully employ parallelism, and you must also employ the append option, else you may get this error: "SQL*Loader-279: Only APPEND mode allowed when parallel load specified."

Note that you can also run several SQL*Loader sessions at the same time, each with parallel=true:

$ sqlldr control=first.ctl parallel=true direct=true

$ sqlldr control=second.ctl parallel=true direct=true

- Use Fixed Width Data. Fixed width data format saves Oracle some processing when parsing the data.

- Disable Archiving During Load. While this may not be feasible in certain environments, disabling database archiving can increase performance considerably.

- Use unrecoverable. The unrecoverable option (unrecoverable load data) disables the writing of the data to the redo logs. This option is available for direct path loads only.
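The bind array rule from the list above (rows times the maximum row length) is simple arithmetic, and working it through helps when choosing rows and bindsize together. The sketch below uses hypothetical helper names and assumed row lengths (4000 and 2000 bytes); it ignores any per-field overhead SQL*Loader may add, so treat the results as rough estimates.

```python
def bind_array_bytes(rows, max_row_bytes):
    """Approximate bind array size: rows times the maximum row length."""
    return rows * max_row_bytes


def rows_that_fit(bindsize, max_row_bytes):
    """Largest rows value whose bind array still fits into bindsize."""
    return bindsize // max_row_bytes


# With a 256000-byte bind array and rows of up to 4000 bytes (assumed),
# only 64 rows fit into one array.
print(rows_that_fit(256000, 4000))    # 64

# rows=5000 with rows of up to 2000 bytes (assumed) needs bindsize=10000000.
print(bind_array_bytes(5000, 2000))   # 10000000
```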


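One error you may run into:

SQL*Loader-951: Error calling once/load initialization

ORA-00604: error occurred at recursive SQL level 1

ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired

Removing direct=true resolves it.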
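A Python wrapper can apply this workaround automatically: check the sqlldr log for ORA-00054 and, if it is present, retry without the direct path. The sketch below only computes the fallback argument list; the function names are hypothetical. It also drops parallel=, on the assumption that a parallel load makes no sense once the direct path is gone, since parallel loads are available with direct path only.

```python
def log_reports_busy(log_text):
    """True if the sqlldr log contains the 'resource busy' error."""
    return "ORA-00054" in log_text


def fallback_args(args, log_text):
    """Return the arguments for a retry: unchanged unless ORA-00054 was logged."""
    if not log_reports_busy(log_text):
        return list(args)
    # Drop direct= and parallel= so the retry uses the conventional path.
    return [a for a in args
            if not a.lower().startswith(("direct=", "parallel="))]


first_try = ["sqlldr", "userid=/", "control=result1.ctl",
             "direct=true", "parallel=true"]
log_text = "SQL*Loader-951: ...\nORA-00604: ...\nORA-00054: ...\n"
print(fallback_args(first_try, log_text))
# ['sqlldr', 'userid=/', 'control=result1.ctl']
```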

sqlldr scott/tiger control=multiplefile.ctl log=multiplefile.log bindsize=10000000 readsize=10000000 rows=5000

bindsize and readsize set the buffer sizes.

Control file (ctl) template for the questionnaire table

options(skip=1)

unrecoverable

load data

characterset utf8

append into table MI_QUESTIONNAIRE_20161011

FIELDS TERMINATED BY "," TRAILING NULLCOLS

(

Q_DATE date "yyyy_mm_dd" nullif (Q_DATE="null"),

RECORD_DATE date "yyyy-mm-dd hh24:mi:ss" nullif (RECORD_DATE="null"),

DBNAME,

ACOUNT_ID,

ROLE_ID,

Q_TYPE

)
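When the target table name changes per load (the date suffix in the template above suggests it might), the control file can be generated from Python instead of being edited by hand. This is a sketch only: `render_control_file` is a hypothetical helper, and the column list is a shortened version of the template.

```python
def render_control_file(table, columns, skip=1, charset="utf8"):
    """Render a control file following the layout of the template above."""
    lines = [
        f"options(skip={skip})",
        "unrecoverable",
        "load data",
        f"characterset {charset}",
        f"append into table {table}",
        'FIELDS TERMINATED BY "," TRAILING NULLCOLS',
        "(",
        ",\n".join(columns),   # one column specification per line
        ")",
    ]
    return "\n".join(lines) + "\n"


columns = [
    'RECORD_DATE date "yyyy-mm-dd hh24:mi:ss" nullif (RECORD_DATE="null")',
    "DBNAME",
    "ACOUNT_ID",
    "ROLE_ID",
    "Q_TYPE",
]
print(render_control_file("MI_QUESTIONNAIRE_20161011", columns))
```

The returned text can be written to a .ctl file and passed to sqlldr through the control= keyword.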

sqlldr username/password@server control=control_file.ctl data=data_file errors=10000000 direct=true parallel=true