SQL*Loader 笔记 (二) 性能优化

最近客户有个测试项目,需要将大量的文本文件用sqlloder加载到oracle数据库中。特此做出如下实验,演练一下sqlloader的几个重要参数。

主要参考官方文档上关于redo产生的控制
Minimize use of the redo log 

One way to speed a direct load dramatically is to minimize use of the redo log. There are three ways to do this. You can disable archiving , you can specify that the load is unrecoverable , or you can set the SQL  NOLOGGING  parameter for the objects being loaded. This section discusses all methods. 


关闭归档日志模式

关闭所有实例
# srvctl stop instance -d devdb -i "devdb1,devdb2”
mount节点1
# srvctl start instance -d devdb -n node1 -o mount
关闭归档
SYS@devdb1 >alter database noarchivelog;
关闭所有实例
# srvctl stop instance -d devdb -i "devdb1,devdb2”
启动所有实例
# srvctl start instance -d devdb -i "devdb1,devdb2"



查看sqlloader所需的文件及大小

$ du -sh *
4.0K    c_stock.ctl                 <==控制文件
1021M   c_stock_large.txt   <==数据文件为大小为 1021MB将近1GB,同等大小的文件还有N个
4.0K    c_stock.log               <==输入日志文件

控制文件
$ cat c_stock.ctl
load data
into table tpcc.c_stock
append
FIELDS TERMINATED BY '|'
(
s_i_id,
s_w_id,
s_quantity,
s_dist_01,
s_dist_02,
s_dist_03,
s_dist_04,
s_dist_05,
s_dist_06,
s_dist_07,
s_dist_08,
s_dist_09,
s_dist_10,
s_data,
s_ytd,
s_order_cnt,
s_remote_cnt
)

数据文件
$ cat c_stock.txt
1|1|80|RfIBeBlKPqdPouSIcIdvFmh3|3gZUHn1oqzPvY03Df4iroZZo|x2J0ePT5VJPsUGH93ksRbjie|drvpekDBabpcQ85BFKUqepn5|zFKyOSAP7V4b3xA1wP3cMWEh|wmvyBIrYbZnNINsDxkDoG3g2|H75jUxoeJKB8gSUiI5Te
JcH4|m8iQ0mJxkEE2ZS4wqvsxccDK|8mMFmhzxgFd6QMrYeTQ1BKnQ|4Fb6GEEEOg9ZnwmtZnnFYEt0|7SLI5rAzsw5U0zy6wnhQxBFLM9zfspqq71W1imqyIjiwGHq|0|0|0
2|1|71|SxGgW9Rz6eCprSDpKneQzsf5|znr5SFVB2pHPmmcjoEwFmXSU|aWzxcEsBPHuycfX2ssFDCHMQ|96lkT22ROrePTuez1zV1EGRV|YnoqSY4SUf2B7UiKaknTErgt|QZkjv12hcgvV2pDKttbobhWb|r9VVrZe5Pme8b6geazX2
PoC8|HBmIG7IX6rGlgIhTSjSSeWWc|lJ3YVvWsV7Yr4wdYNHaSefBW|mhEq3ordVk1GFMYoHLGB5HpJ|npisQ5FWxTNMaQvkG2OiO2Pnwej3rS6F5eXL|0|0|0
…….
……. 略去剩余部分

观察当前的log序列号
SYS@devdb1 >select * from v$log;

    GROUP#    THREAD#  SEQUENCE#      BYTES  BLOCKSIZE    MEMBERS ARCHIVED  STATUS     FIRST_CHANGE# FIRST_TIME          NEXT_CHANGE# NEXT_TIME
---------- ---------- ---------- ---------- ---------- ---------- --------- ---------- ------------- ------------------- ------------ -------------------
         1          1        133    52428800        512          2 NO        CURRENT          1935103 2014/07/23 19:16:57   2.8147E+14
         2          1        132    52428800        512          2 NO        INACTIVE         1929910 2014/07/23 19:16:45      1935103 2014/07/23 19:16:57
         3          2         51    52428800        512          2 NO        CURRENT          1696072 2014/07/23 18:39:50   2.8147E+14
         4          2         50   52428800        512          2 NO        INACTIVE         1660718 2014/07/23 13:53:34      1696072 2014/07/23 18:39:50 


在oracle用户下执行sqlloader,参数silent=all 不输出屏幕信息

$ sqlldr userid=tpcc/tpcc control=c_stock.ctl data=c_stock_large.txt silent=all


查看日志输出
$ cat c_stock.log

SQL*Loader: Release 11.2.0.3.0 - Production on Wed Jul 23 19:31:46 2014

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

Control File:   c_stock.ctl
Data File:      c_stock_large.txt
Bad File:     c_stock_large.bad
Discard File:  none specified
 
 (Allow all discards)

Number to load: ALL
Number to skip: 0
Errors allowed: 50
Bind array:     64 rows, maximum of 256000 bytes
Continuation:    none specified
Path used:      Conventional
Silent options: FEEDBACK, ERRORS and DISCARDS

Table TPCC.C_STOCK, loaded from every logical record.
Insert option in effect for this table: APPEND

   Column Name                  Position   Len  Term Encl Datatype
------------------------------ ---------- ----- ---- ---- ---------------------
S_I_ID                              FIRST     *   |       CHARACTER           
S_W_ID                               NEXT     *   |       CHARACTER           
S_QUANTITY                           NEXT     *   |       CHARACTER           
S_DIST_01                            NEXT     *   |       CHARACTER           
S_DIST_02                            NEXT     *   |       CHARACTER           
S_DIST_03                            NEXT     *   |       CHARACTER           
S_DIST_04                            NEXT     *   |       CHARACTER           
S_DIST_05                            NEXT     *   |       CHARACTER           
S_DIST_06                            NEXT     *   |       CHARACTER           
S_DIST_07                            NEXT     *   |       CHARACTER           
S_DIST_08                            NEXT     *   |       CHARACTER           
S_DIST_09                            NEXT     *   |       CHARACTER           
S_DIST_10                            NEXT     *   |       CHARACTER           
S_DATA                               NEXT     *   |       CHARACTER           
S_YTD                                NEXT     *   |       CHARACTER           
S_ORDER_CNT                          NEXT     *   |       CHARACTER           
S_REMOTE_CNT                         NEXT     *   |       CHARACTER           

value used for ROWS parameter changed from 64 to 58

Table TPCC.C_STOCK:
  3500000 Rows successfully loaded.       <==成功读取350万条记录
  0 Rows not loaded due to data errors.   <==没有错误
  0 Rows not loaded because all WHEN clauses were failed.
  0 Rows not loaded because all fields were null.


Space allocated for bind array:                 254388 bytes(58 rows)
Read   buffer bytes: 1048576

Total logical records skipped:          0
Total logical records read:       3500000
Total logical records rejected:         0
Total logical records discarded:        0

Run began on Wed Jul 23 19:31:46 2014
Run ended on Wed Jul 23 19:36:11 2014

Elapsed time was:     00:04:24.27     <==执行时间4分钟24秒27
CPU time was:         00:00:50.02

查看执行sqlloader后log的变化,测试环境没有其他应用,可以认为log的变化全部来自于sqlloder加载数据产生。

SYS@devdb1 >select * from v$log;

    GROUP#    THREAD#  SEQUENCE#      BYTES  BLOCKSIZE    MEMBERS ARCHIVED  STATUS     FIRST_CHANGE# FIRST_TIME          NEXT_CHANGE# NEXT_TIME
---------- ---------- ---------- ---------- ---------- ---------- --------- ---------- ------------- ------------------- ------------ -------------------
         1          1        159   52428800        512          2 NO        CURRENT          2153818 2014/07/23 19:36:03   2.8147E+14
         2          1        158   52428800        512          2 NO        INACTIVE         2148648 2014/07/23 19:35:53      2153818 2014/07/23 19:36:03
         3          2         51   52428800        512          2 NO        CURRENT          1696072 2014/07/23 18:39:50   2.8147E+14
         4          2         50   52428800        512          2 NO        INACTIVE         1660718 2014/07/23 13:53:34      1696072 2014/07/23 18:39:50

本机是rac双节点环境,sqlloader在节点一的服务器上执行,可以看出日志的序列从刚才的133,132变化到了159,158。 节点二的日志不发生变化。


下面开始优化sqlloader

一,优化:直接路径加载

SYS@devdb1 >truncate table tpcc.c_stock;     <==清理环境

$ sqlldr userid=tpcc/tpcc control=c_stock.ctl data=c_stock_large.txt direct=true

$ cat c_stock.log
…...
Number to load: ALL
Number to skip: 0
Errors allowed: 50
Continuation:    none specified
Path used:      Direct     <==直接路径加载
…..
  3500000 Rows successfully loaded.      <==成功读取350万条记录
…...
Run began on Wed Jul 23 19:54:40 2014
Run ended on Wed Jul 23 19:55:10 2014

Elapsed time was:     00:00:30.31    <==执行时间0分钟30秒31, 效果显著!
CPU time was:         00:00:14.94


SYS@devdb1 >select * from v$log;

    GROUP#    THREAD#  SEQUENCE#      BYTES  BLOCKSIZE    MEMBERS ARCHIVED  STATUS     FIRST_CHANGE# FIRST_TIME          NEXT_CHANGE# NEXT_TIME
---------- ---------- ---------- ---------- ---------- ---------- --------- ---------- ------------- ------------------- ------------ -------------------
         1          1        159   52428800        512          2 NO        INACTIVE         2153818 2014/07/23 19:36:03      2161051 2014/07/23 19:54:58
         2          1        160   52428800        512          2 NO        CURRENT          2161051 2014/07/23 19:54:58   2.8147E+14
         3          2         51   52428800        512          2 NO        CURRENT          1696072 2014/07/23 18:39:50   2.8147E+14
         4          2         50   52428800        512          2 NO        INACTIVE         1660718 2014/07/23 13:53:34      1696072 2014/07/23 18:39:50

虽然直接加载的方式不经过sql引擎,不会产生redo和undo,但是我们看到还是会有微乎其微的日志增长。

二,优化:nologging
SYS@devdb1 >truncate table tpcc.c_stock;     <==清理环境

SYS@devdb1 >alter table tpcc.c_stock nologging;   <==更改表属性,强制不生成日志

$ cat c_stock.log
…..
Path used:      Direct
Run began on Wed Jul 23 20:07:54 2014
Run ended on Wed Jul 23 20:08:30 2014

Elapsed time was:     00:00:35.18   <==执行时间0分钟35秒18, 比刚才慢了将近5秒
CPU time was:         00:00:17.18

SYS@devdb1 >select * from v$log;

    GROUP#    THREAD#  SEQUENCE#      BYTES  BLOCKSIZE    MEMBERS ARCHIVED  STATUS     FIRST_CHANGE# FIRST_TIME          NEXT_CHANGE# NEXT_TIME
---------- ---------- ---------- ---------- ---------- ---------- --------- ---------- ------------- ------------------- ------------ -------------------
         1          1        159   52428800        512          2 NO        INACTIVE         2153818 2014/07/23 19:36:03      2161051 2014/07/23 19:54:58
         2          1        160   52428800        512          2 NO        CURRENT          2161051 2014/07/23 19:54:58   2.8147E+14
         3          2         51   52428800        512          2 NO        CURRENT          1696072 2014/07/23 18:39:50   2.8147E+14
         4          2         50   52428800        512          2 NO        INACTIVE         1660718 2014/07/23 13:53:34      1696072 2014/07/23 18:39:50

日志序列已经不再变化了,可是并没有使速度变得更快,反而慢了5秒

三,优化:unrecoverable   
SYS@devdb1 >truncate table tpcc.c_stock;

unrecoverable  <==在控制文件头加入该参数,不再向数据库控制文件中写入SCN
load data
into table tpcc.c_stock
append
FIELDS TERMINATED BY '|'
(
s_i_id,
s_w_id,
s_quantity,
s_dist_01,
s_dist_02,
s_dist_03,
s_dist_04,
s_dist_05,
s_dist_06,
s_dist_07,
s_dist_08,
s_dist_09,
s_dist_10,
s_data,
s_ytd,
s_order_cnt,
s_remote_cnt



$ sqlldr userid=tpcc/tpcc control=c_stock.ctl data=c_stock_large.txt direct=true silent=all
$ cat c_stock.log
…..
Path used:      Direct
,,,
  3500000 Rows successfully loaded.
 ….
Run began on Wed Jul 23 20:17:57 2014
Run ended on Wed Jul 23 20:18:29 2014

Elapsed time was:     00:00:32.90   <==执行时间0分钟32秒90
CPU time was:         00:00:16.33
效果平平,没有什么变化


四,优化 — parallel
SYS@devdb1 >truncate table tpcc.c_stock;

$ sqlldr userid=tpcc/tpcc control=c_stock.ctl data=c_stock_large.txt direct=true parallel=true
cat c_stock.log
…..
Path used:      Direct - with parallel option.   <==直接路径加载,并行全开启了
…..
Run began on Wed Jul 23 20:23:36 2014
Run ended on Wed Jul 23 20:24:12 2014
...
Elapsed time was:     00:00:35.53   <==执行时间0分钟35秒53, 最慢的一次。
CPU time was:         00:00:19.03


实验总结:
0,如果是测试环境做数据加载,一定要先关闭归档日志(alter database no archivelog)。如果归档日志不关闭,后面的所有优化都没有效果。
1, sqlloder的direct 加载方式效果最突出。 
2,设置表为nologing虽然可以彻底避免日志的生成,但是优化的效果已经不明显了。
3,parallel需要在同时执行多个控制文件,访问多个数据文件是才会有效果。例如:
sqlldr USERID=scott CONTROL=load1.ctl DIRECT=TRUE PARALLEL=true
sqlldr USERID=scott CONTROL=load2.ctl DIRECT=TRUE PARALLEL=true
sqlldr USERID=scott CONTROL=load3.ctl DIRECT=TRUE PARALLEL=true

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/29047826/viewspace-1225734/,如需转载,请注明出处,否则将追究法律责任。

转载于:http://blog.itpub.net/29047826/viewspace-1225734/

  • 0
    点赞
  • 2
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值