上面的三篇文章中,介绍了SQL*Loader的使用方法及使用案例,本篇将根据实际的使用案例来更深层次的来解读SQL*Loader
Oracle SQL*Loader使用案例(一)
Oracle SQL*Loader使用案例(二)
Oracle SQL*Loader使用案例(三)
目录
SQL*Loader对不同文件及格式的处理方法
4.1大字段(LOB类型)的导入
LOB作为大字段数据类型,是Oracle新增的数据类型,用来替代long和long raw类型,一般sqlldr操作中不会涉及大字段类型的操作,如果你遇到了这种需求,一般分以下两种情况处理:
4.1.1数据保存在数据文件中
以Manager表为例,修改Remark字段为lob类型
SCOTT@seiang11g>alter table tb_manager drop column remark;
Table altered.
SCOTT@seiang11g>alter table tb_manager add remark clob;
Table altered.
创建数据文件和控制文件如下:
--数据文件
[oracle@wjq SQL*Loader]$ vim wjq_test12_1.dat
10,SMITH,SALES MANAGER,This is SMITH.
He is a Sales Manager.|
11,ALLEN.W,TECH MANAGER,This is ALLEN.W.
He is a Tech Manager.|
16, BLAKE, HK MANAGER,"This is BLAKE.
He is a Hr Manager. The jobs responsibilities are in the following:
1. Ensure the effective local implementation of corporate level HR initiatives and new programs.
2. Take initiatives in defining HR strategy on attracting, hiring, integrating, developing, managing......
3. Oversee standard recruiting and procedures to ensure the conpany's staffing requirements ......
4. Provide employees with fair and appropriate compensation and benefit, to ensure market competitiveness.
5. Develop, implement and oversee the training and development programs to upgrade the skills......"|
--控制文件
[oracle@wjq SQL*Loader]$ vim wjq_test12_1.ctl
LOAD DATA
INFILE '/u01/app/oracle/SQL*Loader/wjq_test12_1.dat' "str '|\n'"
TRUNCATE INTO TABLE tb_manager
FIELDS TERMINATED BY "," OPTIONALLY ENCLOSED BY '"'
(MGRNO,MNAME,JOB,REMARK char(100000))
注意这里REMARK显式指定char(100000),因为Oracle默认所有输入字段都是char(255),如不显式指定类型和长度,一旦加载列的实际长度超出255,则数据加载时自动将该行忽略,并在对应的log日志就会报错:Field in data file exceeds maximum length.
执行sqlldr命令,并查看结果
--执行sqlldr命令
[oracle@wjq SQL*Loader]$ sqlldr scott/tiger control=/u01/app/oracle/SQL*Loader/wjq_test12_1.ctl
SQL*Loader: Release 11.2.0.4.0 - Production on Wed Nov 1 11:27:16 2017
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
Commit point reached - logical record count 2
Commit point reached - logical record count 3
--查看结果
SCOTT@seiang11g>select * from tb_manager;
MGRNO MNAME JOB REMARK
---------- -------------------- ------------------------------ ----------------------------------------
10 SMITH SALES MANAGER This is SMITH.
He is a Sales Manager.
11 ALLEN.W TECH MANAGER This is ALLEN.W.
He is a Tech Manager.
16 BLAKE HK MANAGER This is BLAKE.
He is a Hr Manager. The jobs responsibil
ities are in the followin
数据成功加载。这种方式最关键的地方是必须保证REMARK列定义的长度大于数据文件中文本块的大小
4.1.2数据保存在独立的文件中
这种数据相对于第一种更加常见,相应处理也更简单一些(跳过了换行符的处理),sqlldr中提供了LOBFILE关键字,直接支持加载文件到LOB类型中,这实在是帮了我们大忙,下面演示一下其用法。
首先在数据库,创建一个新表:
SCOTT@seiang11g>create table tb_lob(fileowner varchar2(30),filename varchar2(200),filesize number,filedata clob,create_date date);
Table created.
表中共有5列,分别表示文件属主、文件名、文件大小、文件内容和文件创建时间。建数据文件,内容如下:
--数据文件
[oracle@wjq SQL*Loader]$ vim wjq_test12_2.dat
oracle 208 2017-10-31 16:11 wjq_test10.ctl
oracle 187 2017-11-01 16:18 wjq_test11_1.dat
oracle 1801 2017-11-01 16:21 wjq_test11_1.log
oracle 68 2017-11-01 10:35 wjq_test11_2.bad
oracle 1784 2017-11-01 10:40 wjq_test11_4.log
oracle 204 2017-11-01 11:30 wjq_test12_1.ctl
oracle 749 2017-11-01 11:27 wjq_test12_1.dat
oracle 1870 2017-11-01 11:29 wjq_test12_1.log
oracle 180 2017-10-31 14:41 wjq_test1.ctl
oracle 1648 2017-10-31 14:43 wjq_test1.log
oracle 163 2017-10-31 14:54 wjq_test2.ctl
oracle 82 2017-10-31 14:51 wjq_test2.dat
oracle 1650 2017-10-31 14:56 wjq_test2.log
oracle 170 2017-10-31 15:03 wjq_test3.ctl
oracle 113 2017-10-31 15:01 wjq_test3.dat
oracle 1650 2017-10-31 15:04 wjq_test3.log
oracle 186 2017-10-31 15:08 wjq_test4.ctl
oracle 1762 2017-10-31 15:08 wjq_test4.log
oracle 201 2017-10-31 15:11 wjq_test5.ctl
oracle 1777 2017-10-31 15:12 wjq_test5.log
oracle 201 2017-10-31 15:24 wjq_test6.ctl
oracle 546 2017-10-31 15:16 wjq_test6.dat
oracle 1748 2017-10-31 15:32 wjq_test6.log
oracle 155 2017-10-31 15:32 wjq_test7.ctl
oracle 404 2017-10-31 15:28 wjq_test7.dat
oracle 1748 2017-10-31 15:32 wjq_test7.log
oracle 93 2017-10-31 15:42 wjq_test8_1.dat
oracle 75 2017-10-31 15:43 wjq_test8_2.dat
oracle 20 2017-10-31 15:43 wjq_test8_3.dat
oracle 254 2017-10-31 15:45 wjq_test8.ctl
oracle 2029 2017-10-31 15:45 wjq_test8.log
oracle 425 2017-10-31 15:55 wjq_test9.ctl
oracle 354 2017-10-31 15:51 wjq_test9.dat
oracle 139 2017-10-31 15:55 wjq_test9.dsc
oracle 2766 2017-10-31 15:55 wjq_test9.log
--控制文件
[oracle@wjq SQL*Loader]$ vim wjq_test12_2.ctl
LOAD DATA
INFILE '/u01/app/oracle/SQL*Loader/wjq_test12_2.dat'
TRUNCATE INTO TABLE tb_lob
(
FILEOWNER position(1:6),
FILESIZE position(*+1:11),
CREATE_DATE position(*+1:28) date 'yyyy-mm-dd hh24:mi',
FILENAME position(*+1:45) "substr(:FILENAME,instr(:FILENAME,'/',-1)+1)",
FILEDATA LOBFILE(FILENAME) TERMINATED BY EOF
)
这个控制文件是之前介绍示例应用的集大成者,又有定长处理,又有函数转换,唯一陌生的就是最后一行:LOBFILE(FILENAME)TERMINATED BY EOF,这就是前面提到的LOBFILE 关键字,只需要指定FILENAME列,其他都是固定格式,调用时直接按此指定即可。
执行sqlldr命令,并查看结果
--执行sqlldr命令
[oracle@wjq SQL*Loader]$ sqlldr scott/tiger control=/u01/app/oracle/SQL*Loader/wjq_test12_2.ctl
SQL*Loader: Release 11.2.0.4.0 - Production on Wed Nov 1 12:06:28 2017
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
Commit point reached - logical record count 47
--查看结果
SCOTT@seiang11g>select * from tb_lob;
FILEOWNER FILENAME FILESIZE FILEDATA CREATE_DA
--------------- -------------------- ---------- -------------------------------------------------- ---------
oracle wjq_test11_2.log 1782 01-NOV-17
SQL*Loader: Release 11.2.0.4.0 - Production on Wed
Nov 1 10:37:45 2017
Copyrig
oracle wjq_test11_3.log 1781 01-NOV-17
SQL*Loader: Release 11.2.0.4.0 - Production on Wed
Nov 1 10:40:46 2017
Copyrig
oracle wjq_test11_4.dat 182 10,SMITH,SALES MANAGER,This is SMITH. 01-NOV-17
He is a Sales Manager.|
11,ALLEN.W,TECH MA
oracle wjq_test10.ctl 208 LOAD DATA 31-OCT-17
INFILE '/u01/app/oracle/SQL*Loader/wjq_test10.dat'
TRUNCATE INTO T
oracle wjq_test10.dat 628 #This is data of emp 31-OCT-17
ENAME MGR JOB SAL
---------- ----- ---
oracle wjq_test11_1.ctl 206 LOAD DATA 01-NOV-17
INFILE '/u01/app/oracle/SQL*Loader/wjq_test11_1.da
t'
TRUNCATE INTO
oracle wjq_test11_1.dat 187 10,SMITH,SALES MANAGER,This is SMITH.\nHe is a Sal 01-NOV-17
es Manager.
11,ALLEN.W,TECH
……(此处省略N行)
因为篇幅原因不全部显示,这里注意的一点是在dat文件中指定的路径下必须存在相应的文件,即/home/oracle/sqlldr/下存在dat中指定的文件,否则加载会报错
4.2字段无值导致加载报错
在大多数情况下,项目在实际的实施过程中,能够顺利执行的机率通常都是与项目的复杂程度成反比,越是复杂的需求,实际实施过程中出现问题的机率就越高,因此在真正实施前,是否能够充分考虑到意外出现的情况,也是考验实施者技术实力的一个重要方面。
SQL*Loader工具的应用非常简单,前面的多项示例能够说明这一点,不过这不代表执 行SQL*Loader就不会遇到错误,毕竟大多数情况下SQL*Loader中极重要的一环:数据文件的生成过程并不由你掌控,因此编写的控制文件是否能够适应数据文件中数据的复杂多样性,就是对DBA技术实力的一项综合考验了。比如某天你拿到了一个这样的数据文件:
--数据文件
[oracle@wjq SQL*Loader]$ vim wjq_test13.dat
SMITH,CLEAR,3904
ALLEN,SALESMAN,
WARD,SALESMAN,3128
KING,PRESIDENT,2523
看起来和前面的某个例子很相似,根据此数据文件创建控制文件如下:
--控制文件
[oracle@wjq SQL*Loader]$ vim wjq_test13.ctl
LOAD DATA
INFILE '/u01/app/oracle/SQL*Loader/wjq_test13.dat'
TRUNCATE INTO TABLE tb_loader
FIELDS TERMINATED BY ","
(ENAME, JOB, SAL)
执行sqlldr命令,并查看结果
--执行sqlldr命令
[oracle@wjq SQL*Loader]$ sqlldr scott/tiger control=/u01/app/oracle/SQL*Loader/wjq_test13.ctl
SQL*Loader: Release 11.2.0.4.0 - Production on Wed Nov 1 13:47:42 2017
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
Commit point reached - logical record count 4
--查看结果
SCOTT@seiang11g>select * from tb_loader;
ENAME JOB SAL COMM
---------- ------------------------------ ---------- ----------
SMITH CLEAR 3904
WARD SALESMAN 3128
KING PRESIDENT 2523
很奇怪的发现明明4条记录却只导入了3条,通过log日志我们发现有如下提示:
Record 2: Rejected - Error on table TB_LOADER, column SAL.
Column not found before end of logical record (use TRAILING NULLCOLS)
Table TB_LOADER:
3 Rows successfully loaded.
1 Row not loaded due to data errors.
0 Rows not loaded because all WHEN clauses were failed.
0 Rows not loaded because all fields were null.
在此想首先阐述一个观点:出现错误虽然不是什么好消息,但是错误本身并不可怕,最可怕的恰恰是没有错误,程序看起来执行得好好的,四处査看也没有提 示信息,但就是得不到想要的正确结果,这才是最头痛的,你想处理都无从着手。
就本例中的错误信息来说,sqlldr提示己经非常清楚:直到行结束也没发现适当的列值。这是因为本例中数据文件的第2行没有提供适当的值(这一点都不稀奇,不管数据 量庞大与否,DBA绝对不能期望数据文件完全满足要求,因此在编写控制文件时,也要考虑到对意外情况的处理)
针对这一错误,sqlldr甚至连解决方案也一并提供:使用TRAILING NULLCOLS。TRAILING NULLCOLS的作用是当某行没有对应的列值时,sqlldr就会自动将其值陚为NULL,而不是报错。
接下来我们尝试修改控制文件,增加的正是sqlldr的日志文件中提示的,文件修改如下所示:
[oracle@wjq SQL*Loader]$ vim wjq_test13.ctl
LOAD DATA
INFILE '/u01/app/oracle/SQL*Loader/wjq_test13.dat'
TRUNCATE INTO TABLE tb_loader
FIELDS TERMINATED BY "," TRAILING NULLCOLS
(ENAME, JOB, SAL)
这时候执行sqlldr命令时,结果就正常了
SCOTT@seiang11g>select * from tb_loader;
ENAME JOB SAL COMM
---------- --------- ---------- ----------
SMITH CLEAR 3904
ALLEN SALESMAN
WARD SALESMAN 3128
KING PRESIDENT 2523
通过这个例子,我们可以得到如下结论:
(1)执行完操作后一定要验证。就本例来说,从sqlldr命令的执行来看一切正常,如果不是到SQL*Plus环境中査看导入的数据,恐怕都不知道有记录未被成功导入。
(2)—定要注意看日志,sqlldr虽然算不上智能,但是也不傻,有时候造成错误 的原因只是它不知道怎么办好,不过日志文件中一定会留下处理痕迹,不管sqlldr命令执行是否成功,日志文件总是能告诉我们其执行的更多细节。
4.3百万级记录数据的加载
前面己经讲了很多示例,但数据量都较小,只能描述功能,实战参考意义不大,因为在实际工作中应用sqlldr执行加载,多数情况数据量都达到一定级别,因此这里 构建了一个百万记录级的加载,看看实际加载效率如何,以及如何进行加载优化。
本次演示的第一小节也是数据UNLOAD的过程,只稍加修改即可以保存为生成.csv 格式文件的脚本,希望也能对你有帮助。
4.3.1生成百万数据文件
要用到的sql代码如下:
--第一条sql
[oracle@wjq SQL*Loader]$ cat getobject.sql
select a.owner||',"'||a.object_name||'",'||a.object_id||','||to_char(a.created,'yyyy-mm-dd hh24:mi:ss')|| ','||a.status from dba_objects a,(select rownum rn from dual connect by rownum<=20) b;
--第二条sql
[oracle@wjq SQL*Loader]$ cat exe.sql
set echo off
set term off
set line 100 pages 0
set feedback off
set heading off
spool /u01/app/oracle/SQL*Loader/wjq_object.csv
@/u01/app/oracle/SQL*Loader/getobject.sql
spool off
set heading on
set feedback on
set term on
set echo on
登陆到SQL*Plus中执行exe.sql
SCOTT@seiang11g>@/u01/app/oracle/SQL*Loader/exe.sql
然后用vim打开生成的wjq_object.csv,执行:%s/\s\+$//来消除行尾的空格,生成的csv文件107M左右
[oracle@wjq SQL*Loader]$ du -sh wjq_object.csv
107M wjq_object.csv
4.3.2初始化环境
--创建演示表
SCOTT@seiang11g>create table tb_objects(owner varchar2(30),object_name varchar2(50),object_id number,status varchar2(10),created date);
Table created.
--创建索引
SCOTT@seiang11g>create index idx_wjq_obj_owner_name on tb_objects(owner,object_name);
Index created.
4.3.3执行导入
创建控制文件
--控制文件
[oracle@wjq SQL*Loader]$ vim wjq_objects.ctl
LOAD DATA
INFILE '/u01/app/oracle/SQL*Loader/wjq_object.csv
TRUNCATE INTO TABLE tb_objects
FIELDS TERMINATED BY "," OPTIONALLY ENCLOSED BY '"'
(
owner,
object_name,
object_id,
created date 'yyyy-mm-dd hh24:mi:ss',
status
)
这里注意,对于CREATED列,我们指定了日期格式,并进行了转换,这个格式一定要与数据文件中日期格式相符,不然日期格式转换时会报错并导致数据加载失败。
按照默认参数执行sqlldr,看看需要多长时间,同时指定ERRORS参数值为10,明确指定出现10次错误即中止加载。
--执行sqlldr命令
[oracle@wjq SQL*Loader]$ sqlldr scott/tiger control=/u01/app/oracle/SQL*Loader/wjq_objects.ctl errors=10
SQL*Loader: Release 11.2.0.4.0 - Production on Wed Nov 1 17:44:37 2017
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
Commit point reached - logical record count 64
Commit point reached - logical record count 128
Commit point reached - logical record count 192
Commit point reached - logical record count 256
……(吃出省略N行记录)
Commit point reached - logical record count 1740073
Commit point reached - logical record count 1740137
Commit point reached - logical record count 1740160
等待加载完成后,我们观察对应的日志文件:
[oracle@wjq SQL*Loader]$ cat wjq_objects.log
SQL*Loader: Release 11.2.0.4.0 - Production on Wed Nov 1 17:44:37 2017
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
Control File: /u01/app/oracle/SQL*Loader/wjq_objects.ctl
Data File: /u01/app/oracle/SQL*Loader/wjq_object.csv
Bad File: /u01/app/oracle/SQL*Loader/wjq_object.bad
Discard File: none specified
(Allow all discards)
Number to load: ALL
Number to skip: 0
Errors allowed: 10
Bind array: 64 rows, maximum of 256000 bytes
Continuation: none specified
Path used: Conventional
Table TB_OBJECTS, loaded from every logical record.
Insert option in effect for this table: TRUNCATE
Column Name Position Len Term Encl Datatype
------------------------------ ---------- ----- ---- ---- ---------------------
OWNER FIRST * , O(") CHARACTER
OBJECT_NAME NEXT * , O(") CHARACTER
OBJECT_ID NEXT * , O(") CHARACTER
CREATED NEXT * , O(") DATE yyyy-mm-dd hh24:mi:ss
STATUS NEXT * , O(") CHARACTER
Table TB_OBJECTS:
1740160 Rows successfully loaded.
0 Rows not loaded due to data errors.
0 Rows not loaded because all WHEN clauses were failed.
0 Rows not loaded because all fields were null.
Space allocated for bind array: 82560 bytes(64 rows)
Read buffer bytes: 1048576
Total logical records skipped: 0
Total logical records read: 1740160
Total logical records rejected: 0
Total logical records discarded: 0
Run began on Wed Nov 01 17:44:37 2017
Run ended on Wed Nov 01 17:46:42 2017
Elapsed time was: 00:02:04.73
CPU time was: 00:00:05.14
日志文件中得知,1740160条数据全部成功导入,没有失败数据,共用时2分钟左右…(因为用的是虚拟机测试,性能可能稍微差一点)
4.3.4导入提速
sqlldr常规路径导入时默认一次加载64行,现在要加载的总行数已经达到百万级,十位数显然太小,我们首先尝试修改该值,先直接在后面加个0好了,看看能对效率起到多大的提升:
log日志中第一句value used for ROWS parameter changed from 640 to 198,该信息是提示由于640行所占用的空间己经超出了参数BINDSIZE的默认值,因此自动修改到最大可承受的198行,这说明BINDSIZE参数默认值偏小,不过即使是这样,我们看到实际执行时间也被缩短到了5分钟,提高了一倍以上, 再进一步调整BINDSIZE参数值,默认为256K,我们将其修改为10M(1024KB*1024*10=10485760),同时将一次加载的行数提高到5000。
--执行sqlldr命令
[oracle@wjq ~]$ sqlldr scott/tiger control=/u01/app/oracle/SQL*Loader/wjq_objects.ctl errors=10 rows=5000 bindsize=10485760
SQL*Loader: Release 11.2.0.4.0 - Production on Thu Nov 2 09:11:55 2017
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
specified value for readsize(1048576) less than bindsize(10485760)
Commit point reached - logical record count 5000
Commit point reached - logical record count 10000
Commit point reached - logical record count 15000
Commit point reached - logical record count 20000
(……此处省略N行记录)
Commit point reached - logical record count 1734196
Commit point reached - logical record count 1739196
Commit point reached - logical record count 1740160
加载数据完成后,再次查看日志文件,日志信息如下:
Table TB_OBJECTS:
1740160 Rows successfully loaded.
0 Rows not loaded due to data errors.
0 Rows not loaded because all WHEN clauses were failed.
0 Rows not loaded because all fields were null.
Space allocated for bind array: 6450000 bytes(5000 rows)
Read buffer bytes:10485760
Total logical records skipped: 0
Total logical records read: 1740160
Total logical records rejected: 0
Total logical records discarded: 0
Run began on Thu Nov 02 09:19:27 2017
Run ended on Thu Nov 02 09:20:53 2017
Elapsed time was: 00:01:26.38
CPU time was: 00:00:04.19
由日志信息可以看到,时间被缩短到1分钟半多,几乎提高了一倍,而此时绑定数组才占用了6MB左右的空间,ROWS的参数值还可以继续提高。不过因为我们这里记录量和数 据量都稍小,继续再提高这两个参数的值,效率提升也不明显了(仅针对这百万记录量的 导入而言,如果是千万级数据量加载,缩短的时间应该还是很明显的)。
4.3.5使用Direct参数,让速度更快一点
前面的参数都是基于常规路径加载,下面通过直接路径加载,所有参数默认,只打开直接路径加载参数:
--执行sqlldr命令
[oracle@wjq ~]$ sqlldr scott/tiger control=/u01/app/oracle/SQL*Loader/wjq_objects.ctl direct=true log=/u01/app/oracle/SQL*Loader/wjq_objects.log
SQL*Loader: Release 11.2.0.4.0 - Production on Thu Nov 2 09:31:04 2017
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
Load completed - logical record count 1740160.
加载完成后,打开log日志,查看节选的日志信息如下:
[oracle@wjq SQL*Loader]$ cat wjq_objects.log
SQL*Loader: Release 11.2.0.4.0 - Production on Thu Nov 2 09:31:04 2017
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
Control File: /u01/app/oracle/SQL*Loader/wjq_objects.ctl
Data File: /u01/app/oracle/SQL*Loader/wjq_object.csv
Bad File: /u01/app/oracle/SQL*Loader/wjq_object.bad
Discard File: none specified
(Allow all discards)
Number to load: ALL
Number to skip: 0
Errors allowed: 50
Continuation: none specified
Path used: Direct
Table TB_OBJECTS, loaded from every logical record.
Insert option in effect for this table: TRUNCATE
Column Name Position Len Term Encl Datatype
------------------------------ ---------- ----- ---- ---- ---------------------
OWNER FIRST * , O(") CHARACTER
OBJECT_NAME NEXT * , O(") CHARACTER
OBJECT_ID NEXT * , O(") CHARACTER
CREATED NEXT * , O(") DATE yyyy-mm-dd hh24:mi:ss
STATUS NEXT * , O(") CHARACTER
The following index(es) on table TB_OBJECTS were processed:
index SCOTT.IDX_WJQ_OBJ_OWNER_NAME loaded successfully with 1740160 keys
Table TB_OBJECTS:
1740160 Rows successfully loaded.
0 Rows not loaded due to data errors.
0 Rows not loaded because all WHEN clauses were failed.
0 Rows not loaded because all fields were null.
Date cache:
Max Size: 1000
Entries : 937
Hits : 1739223
Misses : 0
Bind array size not used in direct path.
Column array rows : 5000
Stream buffer bytes: 256000
Read buffer bytes: 1048576
Total logical records skipped: 0
Total logical records read: 1740160
Total logical records rejected: 0
Total logical records discarded: 0
Total stream buffers loaded by SQL*Loader main thread: 427
Total stream buffers loaded by SQL*Loader load thread: 262
Run began on Thu Nov 02 09:31:04 2017
Run ended on Thu Nov 02 09:31:21 2017
Elapsed time was: 00:00:16.16
CPU time was: 00:00:01.60
发现加载时间已经降到16s的时间,性能还是相当可以的
4.3.6有没有可能更快
直接路径导入可用的参数也有不少,不过我们这里总数据量不大,因此实际能够起到效率提升的不多,我准备主要从以下两个参数着手:
①STREAMSIZE:直接路径加载默认读取全部记录,因此不需要设置ROWS参数,读取到的数据处理后存入流缓存区,即STREAMSIZE参数。该参数默认值为256KB,这里加大到10MB。
②DATE_CACHE:该参数指定一个转换后日期格式的缓存区,以条为单位,默认值1000条,即保存1000条转换后的日期格式,由于我们要导入的数据中有日期列, 因此加大该参数值到3000,以降低日期转换操作带来的开销。
修改参数后执行命令最终形式如下所示:
--执行sqlldr命令
[oracle@wjq ~]$ sqlldr scott/tiger control=/u01/app/oracle/SQL*Loader/wjq_objects.ctl log=/u01/app/oracle/SQL*Loader/wjq_objects.log direct=true streamsize=10485760 date_cache=3000
SQL*Loader: Release 11.2.0.4.0 - Production on Thu Nov 2 09:37:56 2017
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
Load completed - logical record count 1740160.
加载完成后,再次查看命令,节选日志信息如下:
The following index(es) on table TB_OBJECTS were processed:
index SCOTT.IDX_WJQ_OBJ_OWNER_NAME loaded successfully with 1740160 keys
Table TB_OBJECTS:
1740160 Rows successfully loaded.
0 Rows not loaded due to data errors.
0 Rows not loaded because all WHEN clauses were failed.
0 Rows not loaded because all fields were null.
Date cache:
Max Size: 3000
Entries : 937
Hits : 1739223
Misses : 0
Bind array size not used in direct path.
Column array rows : 5000
Stream buffer bytes:10485760
Read buffer bytes: 1048576
Total logical records skipped: 0
Total logical records read: 1740160
Total logical records rejected: 0
Total logical records discarded: 0
Total stream buffers loaded by SQL*Loader main thread: 427
Total stream buffers loaded by SQL*Loader load thread: 0
Run began on Thu Nov 02 09:37:56 2017
Run ended on Thu Nov 02 09:38:09 2017
Elapsed time was: 00:00:13.78
CPU time was: 00:00:01.60
170万条的数据加载的时间大概在13秒左右,考虑到测试环境只是一台低配的虚拟机,这个效率也已经相当快了。
再引申说几句,关于优化涉及层面太多,并非单单sqlldr调整好,效率就一定最高。上述演示建立在假设数据库层面己经最优的情况下,通过合理配置sqlldr的参数来提高 加载效率,但是不是能够更快?我觉着回答是肯定的,优化并不是简单地设置一个FAST=TRUE的参数,而是一个综合考量下的结果。举个例子,在前面例子中的控制文件 基本都没有指定数据类型,这样可能会导致产生隐式的类型转换(也影响效率),如果全部显式指定数据类型,并且改成定长格式,导入效率还能得到一定提升。再比如说上 述表中还创建了索引,如果单纯希望数据加载效率提高,只需将表中的索引Disable,效率立刻又能提高一个数量级,但是Disable索引和约束是否符合你的业务需求,这就得看你的 实际情况了。如果你理解得足够深刻,就会发现所谓的调优,不过是将各种因素摆在一起,取一个中间值,保持相互平衡罢了。
小结
1.迁移、备份和恢复数据的又一个有效手段
2.不同数据库之间进行数据迁移的非常方便而且通用的工具,避免类似EXP(EXPDP)/IMP(IMPDP)工具导致乱码问题;
3.从文本文件向数据库迁移的超级有效的手段;
4.速度快,尤其结合使用直接路径加载技术,这个技术可以跳过整个SQL引擎,同时避免undo和redo的生成,有效的提高数据的加载效率;
5.与外部表技术结合紧密。
算这一篇,总共四篇关于SQL*Loader的使用案例,下面关于使用SQL*Loader导入CLOB和BLOB数据使用案例将在下一篇博文中进行分享,请大家多多关注!
作者:SEian.G(苦练七十二变,笑对八十一难)
来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/31015730/viewspace-2147215/,如需转载,请注明出处,否则将追究法律责任。
转载于:http://blog.itpub.net/31015730/viewspace-2147215/