外部表使用浅析

--外部表浅析


外部表较之SQLLDR几个关键特性:
可以使用复杂的WHERE条件有选择地加载数据。。
能够合并(merge)数据。可以取一个填满数据的操作系统文件,并由它更新现有的数据库记录。
能执行高效的关联查询。可以将一个外部表联结到另一个数据库表作为加载过程的一部分。
可以通过CREATE TABLE或INSERT语句中的ORDER BY子句来加载排序好的数据。
外部表使用INSERT语句时,多表插入会更方便一些。
可以在查询外部表数据之前做预处理,比如执行一个或多个操作系统命令。
对于开发新手来说,外部表学习线路更短。


一.建立外部表


~ 在建表语句中把EXTERNAL_TABLE参数设定为SQLLDR;
~ 从12c开始,可以使用模式来运行SQLLDR。


1.通过EXTERNAL_TABLE参数执行SQLLDR

[oracle@ocm1 ~]$ cat dept.ctl   --SQLLDR的控制文件
LOAD DATA
INFILE *
INTO TABLE DEPT
FIELDS TERMINATED BY ','
(DEPTNO, DNAME, LOC )
BEGINDATA
10,Sales,Virginia
20,Accounting,Virginia
30,Consulting,Virginia
40,Finance,Virginia

EODA@PROD1> create table dept as select * from scott.dept;

Table created.

[oracle@ocm1 ~]$ sqlldr eoda/foo dept.ctl external_table=generate_only  --生成外部表定义,external_table=generate_only使得SQLLDR并不具体加载任何数据,只是生成SQL语句放在日志文件中。

SQL*Loader: Release 11.2.0.3.0 - Production on Sat Dec 3 13:22:29 2016

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

[oracle@ocm1 ~]$ cat dept.log 

SQL*Loader: Release 11.2.0.3.0 - Production on Sat Dec 3 13:22:29 2016

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

Control File:   dept.ctl
Data File:      dept.ctl
  Bad File:     dept.bad
  Discard File:  none specified
 
 (Allow all discards)

Number to load: ALL
Number to skip: 0
Errors allowed: 50
Continuation:    none specified
Path used:      External Table

Table DEPT, loaded from every logical record.
Insert option in effect for this table: INSERT

   Column Name                  Position   Len  Term Encl Datatype
------------------------------ ---------- ----- ---- ---- ---------------------
DEPTNO                              FIRST     *   ,       CHARACTER            
DNAME                                NEXT     *   ,       CHARACTER            
LOC                                  NEXT     *   ,       CHARACTER            



CREATE DIRECTORY statements needed for files   --创建工作目录
------------------------------------------------------------------------
CREATE DIRECTORY SYS_SQLLDR_XT_TMPDIR_00000 AS '/home/oracle'  


CREATE TABLE statement for external table:
------------------------------------------------------------------------
CREATE TABLE "SYS_SQLLDR_X_EXT_DEPT" 
(
  "DEPTNO" NUMBER(2),
  "DNAME" VARCHAR2(14),
  "LOC" VARCHAR2(13)
)
ORGANIZATION external --外部表开始定义
(
  TYPE oracle_loader
  DEFAULT DIRECTORY SYS_SQLLDR_XT_TMPDIR_00000
  ACCESS PARAMETERS   --在此告诉数据库如何处理这个输入文件
  (
    RECORDS DELIMITED BY NEWLINE CHARACTERSET US7ASCII   --说明一行数据默认以换行符结尾
    BADFILE 'SYS_SQLLDR_XT_TMPDIR_00000':'dept.bad'    --在创建目录中建立一个异常数据文件
    LOGFILE 'dept.log_xt'   --在目录中建立一个日志文件
    READSIZE 1048576  --Oracle读取输入数据文件中所用的默认缓冲区,这个例子中是1MB。如果采用专用服务器模式内存来自PGA。
    SKIP 6   --确定应该跳过输入文件中的多少记录,此处因为dept.ctl中数据前有6行记录
    FIELDS TERMINATED BY "," LDRTRIM   --指定逗号分割数据,LDRTRIM是默认的截断模式
    REJECT ROWS WITH ALL NULL FIELDS   --这导致外部表会在坏文件中记录所有全空的行,而且不加载。
    (
      "DEPTNO" CHAR(255)
        TERMINATED BY ",",
      "DNAME" CHAR(255)
        TERMINATED BY ",",
      "LOC" CHAR(255)
        TERMINATED BY ","
    )
  )
  location  --告诉oracle所加载的文件名
  (
    'dept.ctl'
  )
)REJECT LIMIT UNLIMITED


INSERT statements used to load internal tables:  
------------------------------------------------------------------------
INSERT /*+ append */ INTO DEPT 
(
  DEPTNO,
  DNAME,
  LOC
)
SELECT 
  "DEPTNO",
  "DNAME",
  "LOC"
FROM "SYS_SQLLDR_X_EXT_DEPT"


statements to cleanup objects created by previous statements:
------------------------------------------------------------------------
DROP TABLE "SYS_SQLLDR_X_EXT_DEPT"
DROP DIRECTORY SYS_SQLLDR_XT_TMPDIR_00000



Run began on Sat Dec 03 13:22:29 2016
Run ended on Sat Dec 03 13:22:29 2016

Elapsed time was:     00:00:00.33
CPU time was:         00:00:00.01

2.采用Express模式来运行SQLLDR
从Oracle12c开始,SQLLDR express模式可以快速地把CSV文件数据加载到表中。
如果你是用的数据库SCHEMA具有CREATE ANY DIRECTORY权限,那么express模式支持你采用外部表加载数据。
否则,SQLLDR会采用DIRECORY PATH直接路径加载的模式。

--实验
SQL> select banner from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
PL/SQL Release 12.1.0.2.0 - Production
CORE	12.1.0.2.0	Production
TNS for Linux: Version 12.1.0.2.0 - Production
NLSRTL Version 12.1.0.2.0 - Production

[oracle@luo ~]$ cat dept.dat --假设有一个dept.dat的CSV文件
10,Sales,Virginia
20,Accounting,Virginia
30,Consulting,Virginia
40,Finance,Virginia

[oracle@luo ~]$ rlwrap sqlplus eoda/foo@pdbwyzc  --有create any directory权限

SQL*Plus: Release 12.1.0.2.0 Production on Sat Dec 3 14:39:49 2016

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Last Successful login time: Sat Dec 03 2016 14:38:36 +08:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Advanced Analytics
and Real Application Testing options

SQL> create table dept( deptno  number(2) constraint dept_pk primary key, dname   varchar2(14), loc     varchar2(13));

Table created.

[oracle@luo ~]$ sqlldr eoda@pdbwyzc table=dept  
Password:

SQL*Loader: Release 12.1.0.2.0 - Production on Sat Dec 3 14:40:42 2016

Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.

Express Mode Load, Table: DEPT
Path used:      External Table, DEGREE_OF_PARALLELISM=AUTO

Table DEPT:
  4 Rows successfully loaded.

Check the log files:
  dept.log
  dept_%p.log_xt
for more information about the load.

二.错误分析

SQL> truncate table dept;  --清楚表数据

Table truncated.

[oracle@luo ~]$ cat dept.dat  --更改数据源加一行不可添加的数据
10,Sales,Virginia
20,Accounting,Virginia
30,Consulting,Virginia
40,Finance,Virginia
c,C,C,C

[oracle@luo ~]$ sqlldr eoda@pdbwyzc table=dept
Password:

SQL*Loader: Release 12.1.0.2.0 - Production on Sat Dec 3 15:09:27 2016

Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.

Express Mode Load, Table: DEPT
Path used:      External Table, DEGREE_OF_PARALLELISM=AUTO

Table DEPT:
  4 Rows successfully loaded.

Check the log files:
  dept.log
  dept_%p.log_xt
for more information about the load.
[oracle@luo ~]$ cat dept
dept_6833.bad     dept_6833.log_xt  dept.dat          dept.log

[oracle@luo ~]$ cat dept_6833.bad   --bad文件记录未插入的数据

c,C,C,C

[oracle@luo ~]$ cat dept_6833.log_xt  --查看报错信息


 LOG file opened at 12/03/16 15:09:29

KUP-05004:   Warning: Intra source concurrency disabled because parallel select was not requested.

Field Definitions for table SYS_SQLLDR_X_EXT_DEPT
  Record format DELIMITED BY NEWLINE
  Data in file has same endianness as the platform
  Reject rows with all null fields

  Fields in Data Source: 

    DEPTNO                          CHAR (255)
      Terminated by ","
      Trim whitespace from left and right
    DNAME                           CHAR (255)
      Terminated by ","
      Trim whitespace from left and right
    LOC                             CHAR (255)
      Terminated by ","
      Trim whitespace from left and right
KUP-04021: field formatting error for field DEPTNO
KUP-04023: field start is after end of record
KUP-04101: record 6 rejected in file /home/oracle/dept.dat
error processing column DEPTNO in row 5 for datafile /home/oracle/dept.dat
ORA-01722: invalid number

--使用以下脚本可以计算出有多少记录没有被插入
[oracle@luo ~]$ cat d_bad.sql 
drop table et_bad;

create table et_bad
  ( text1 varchar2(4000) ,
    text2 varchar2(4000) ,
    text3 varchar2(4000)
  )
  organization external
  (type oracle_loader
   default directory SYS_SQLLDR_XT_TMPDIR_00000
   access parameters
   (
     records delimited by newline
     fields
     missing field values are null
     ( text1 position(1:4000),
       text2 position(4001:8000),
       text3 position(8001:12000)
     )
   )
   location ('dept_6833.bad')
  )
/

三.预处理
预处理是一个外部表的特性,允许你在从外部表查询数据之前执行一个或多个操作系统命令。
可以通过PREPROCESSOR子句来调用预处理,可以在该子句中调用一个操作系统命令或shell脚本。


使用场景:
通过查询外部表,动态回显一个操作系统命令的输出;
在显示数据前查找文件并且进行行列过滤;
在返回数据前处理和修改文件内容。

--参考来源《Oracle编程艺术深入理解数据库体系结构(第三版)》

  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值