The ORACLE_LOADER Access Driver

本文详细介绍了 Oracle 数据库中使用 ORACLE_LOADER 访问驱动器加载外部表的语法和参数,包括固定长度、可变长度、由分隔符界定的记录格式,以及日期格式、字符集、预处理、错误处理等配置。内容涵盖了记录定义、字段定义、数据类型转换等多个方面,为高效处理大量数据提供了指导。
摘要由CSDN通过智能技术生成

  1. access_parameters Clause


1) Comments:Comments are lines that begin with two hyphens followed by text. Comments must be placed before any access parameters

--This is a comment.

  1. record_format_info
  2. field_definitions
  3. column_transforms

  1. record_format_info Clause

The record_format_info clause is optional. The default value, whether the clause is specified or not, is RECORDS DELIMITED BY NEWLINE.


et_record_spec_options语法:

et_output_files语法如下:

2.1 FIXED length

The FIXED clause is used to identify the records as all having a fixed size of length bytes. 指定的值要算上行分隔符的大小The size specified for FIXED records must include any record termination characters, such as newlines.

CREATE TABLE emp_load (first_name CHAR(15), last_name CHAR(20), year_of_birth CHAR(4))

  ORGANIZATION EXTERNAL (TYPE ORACLE_LOADER DEFAULT DIRECTORY ext_tab_dir

                         ACCESS PARAMETERS (

RECORDS FIXED 20 FIELDS 

(first_name CHAR(7),

last_name CHAR(8),

year_of_birth CHAR(4)))

LOCATION ('info.dat'));

15.2.2 VARIABLE size

同上你设置variable的值时也要包含记录中的分隔符,单位同样为byte

The count at the beginning of the record must include any record termination characters,

but it does not include the size of the count field itself.

CREATE TABLE emp_load (first_name CHAR(15), last_name CHAR(20), year_of_birth CHAR(4))

  ORGANIZATION EXTERNAL (TYPE ORACLE_LOADER DEFAULT DIRECTORY ext_tab_dir

                         ACCESS PARAMETERS (

RECORDS VARIABLE 2 FIELDS TERMINATED BY ','

                                   (first_name CHAR(7),

                                   last_name CHAR(8),

                                   year_of_birth CHAR(4)))

                         LOCATION ('info.dat'));

数据文件:

21Alvin,Tolliver,1976,

19Kenneth,Baer,1963,

16Mary,Dube,1973,

15.2.3 DELIMITED BY

If DELIMITED BY NEWLINE is specified, then the actual value used is platform-specific. On UNIX platforms, NEWLINE is assumed to be "\n". On Windows operating systems, NEWLINE is assumed to be "\r\n".

If DELIMITED BY string is specified, then string can be either text or a series of hexadecimal digits enclosed within quotation marks and prefixed by OX or X. If it is text, then the text is converted to the character set of the data file and the result is used for identifying record boundaries.

If the following conditions are true, then you must use hexadecimal digits to identify the delimiter:

  1. The character set of the access parameters is different from the character set of the data file.
  2. Some characters in the delimiter string cannot be translated into the character set of the data file.

The hexadecimal digits are converted into bytes, and there is no character set translation performed on the hexadecimal string.

CREATE TABLE emp_load (first_name CHAR(15), last_name CHAR(20), year_of_birth CHAR(4))

  ORGANIZATION EXTERNAL (TYPE ORACLE_LOADER DEFAULT DIRECTORY ext_tab_dir

                         ACCESS PARAMETERS (

RECORDS DELIMITED BY '|' FIELDS TERMINATED BY ','

                                (first_name CHAR(7),

                                last_name CHAR(8),

                                year_of_birth CHAR(4)))

                         LOCATION ('info.dat'));

数据文件

Alvin,Tolliver,1976|Kenneth,Baer,1963|Mary,Dube,1973

2.4 XMLTAG    --用时再查

2.5 CHARACTERSET

The CHARACTERSET string clause identifies the character set of the data file. If a character set is not specified, then the data is assumed to be in the default character set for the database. The settings of NLS environment variables on the client have no effect on the character set used for the database.

2.6 EXTERNAL VARIABLE DATA --用时再查

The EXTERNAL VARIABLE DATA clause is valid only for use with the Oracle SQL Connector for Hadoop Distributed File System (HDFS). See Oracle Big Data Connectors User's Guide for more information about the Oracle SQL Connector for HDFS.

2.7 PREPROCESSOR  --用时再查

If the file you want to load contains data records that are not in a format supported by the ORACLE_LOADER access driver, then use the PREPROCESSOR clause to specify a user-supplied preprocessor program that will execute for every data file. Note that the program specification must be enclosed in a shell script if it uses arguments

The preprocessor program converts the data to a record format supported by the access driver and then writes the converted record data to standard output (stdout), which the access driver reads as input.

2.8 LANGUAGE   --用时再查

The LANGUAGE clause allows you to specify a language name (for example, FRENCH), from which locale-sensitive information about the data can be derived.

2.9 TERRITORY --用时再查

The TERRITORY clause allows you to specify a territory name to further determine input data characteristics. For example, in some countries a decimal point is used in numbers rather than a comma (for example, 531.298 instead of 531,298).

2.10 DATA IS...ENDIAN

The DATA IS...ENDIAN clause indicates the endianness of data whose byte order may vary depending on the platform that generated the data file. Fields of the following types are affected by this clause:INTEGER, UNSIGNED INTEGER, FLOAT, BINARY_FLOAT, DOUBLE, BINARY_DOUBLE, VARCHAR (numeric count only), VARRAW (numeric count only), Any character data type in the UTF16 character set, Any string specified by RECORDS DELIMITED BY string and in the UTF16 character set

Windows-based platforms generate little-endian data. Big-endian platforms include Sun Solaris and IBM MVS. If the DATA IS...ENDIAN clause is not specified, then the data is assumed to have the same endianness as the platform where the access driver is running. UTF-16 data files may have a mark at the beginning of the file indicating the endianness of the data. This mark will override the DATA IS...ENDIAN clause.

2.11 BYTEORDERMARK (CHECK | NOCHECK)

The BYTEORDERMARK clause is used to specify whether the data file should be checked for the presence of a byte-order mark (BOM). This clause is meaningful only when the character set is Unicode.

BYTEORDERMARK NOCHECK indicates that the data file should not be checked for a BOM and that all the data in the data file should be read as data.

BYTEORDERMARK CHECK indicates that the data file should be checked for a BOM. This is the default behavior for a data file in a Unicode character set.

The following are examples of some possible scenarios:

  1. If the data is specified as being little or big-endian and CHECK is specified and it is determined that the specified endianness does not match the data file, then an error is returned. For example, suppose you specify the following:

DATA IS LITTLE ENDIAN

BYTEORDERMARK CHECK

  1. If the BOM is checked in the Unicode data file and the data is actually big-endian, then an error is returned because you specified little-endian.
  2. If a BOM is not found and no endianness is specified with the DATA IS...ENDIAN parameter, then the endianness of the platform is used.
  3. If BYTEORDERMARK NOCHECK is specified and the DATA IS...ENDIAN parameter specified an endianness, then that value is used. Otherwise, the endianness of the platform is used.

2.12 STRING SIZES ARE IN

The STRING SIZES ARE IN clause is used to indicate whether the lengths specified for character strings are in bytes or characters. If this clause is not specified, then the access driver uses the mode that the database uses. Character types with embedded lengths (such as VARCHAR) are also affected by this clause. If this clause is specified, then the embedded lengths are a character count, not a byte count. Specifying STRING SIZES ARE IN CHARACTERS is needed only when loading multibyte character sets, such as UTF16.

2.13 LOAD WHEN

The LOAD WHEN condition_spec clause is used to identify the records that should be passed to the database. The evaluation method varies:

  1. If the condition_spec references a field in the record, then the clause is evaluated only after all fields have been parsed from the record, but before any NULLIF or DEFAULTIF clauses have been evaluated.
  2. If the condition specification references only ranges (and no field names), then the clause is evaluated before the fields are parsed. This is useful for cases where the records in the file that are not to be loaded cannot be parsed into the current record definition without errors.

LOAD WHEN (empid != BLANKS)

LOAD WHEN ((dept_id = "SPORTING GOODS" OR dept_id = "SHOES") AND total_sales != 0)

2.14 BADFILE | NOBADFILE

The BADFILE clause names the file to which records are written when they cannot be loaded because of errors.

If neither BADFILE nor NOBADFILE is specified, then the default is to create a bad file if at least one record is rejected. The name of the file is the table name followed by _%p, where %p is replaced with the PID of the process creating the file. The file is given an extension of .bad.

If the table name contains any characters that could be interpreted as directory navigation (for example, %, /, or *), then those characters are not included in the output file name.

Records that fail the LOAD WHEN clause are not written to the bad file but are written to the discard file instead. Also, any errors in using a record from an external table (such as a constraint violation when using INSERT INTO...AS SELECT... from an external table) will not cause the record to be written to the bad file.

2.15 DISCARDFILE | NODISCARDFILE

The DISCARDFILE clause names the file to which records are written that fail the condition in the LOAD WHEN clause. The discard file is created when the first record to be discarded is encountered. If the same external table is accessed multiple times, then the discard file is rewritten each time.

2.16 LOGFILE | NOLOGFILE

If a log file already exists by the same name, then the access driver reopens that log file and appends new log information to the end. This is different from bad files and discard files, which overwrite any existing file.

2.17 SKIP

It can be specified only when nonparallel access is being made to the data. If there is more than one data file in the same location for the same table, then the SKIP parameter skips the specified number of records in the first data file only.

2.18 FIELD NAMES

You can use the FIELD NAMES clause to specify field order. The syntax is as follows:

FIELD NAMES {FIRST FILE | FIRST IGNORE | ALL FILES | ALL IGNORE| NONE}

  1. FIRST FILE — Indicates that the first data file contains a list of field names for the data in the first record. This list uses the same delimiter as the data in the data file. This record is read and used to set up the mapping between the fields in the data file and the columns in the target table. This record is skipped when the data is processed. This can be useful if the order of the fields in the data file is different from the order of the columns in the table, or if the number of fields in the data file is different from the number of columns in the target table.
  2. FIRST IGNORE — Indicates that the first data file contains a list of field names for the data in the first record, but that the information should be ignored. This record is skipped when the data is processed, but is not used for setting up the fields.
  3. ALL FILES — Indicates that all data files contain the list of column names for the data in the first record. The first record is skipped in each data file when the data is processed. It is assumed that the list is the same in each data file. If that is not the case, then the load terminates when a mismatch is found on a data file.
  4. ALL IGNORE — Indicates that all data files contain a list of field names for the data in the first record, but that the information should be ignored. This record is skipped when the data is processed in every data file, but it is not used for setting up the fields.
  5. NONE — Indicates that the data file contains normal data in the first record. This is the default option.

2.19 READSIZE

The READSIZE parameter specifies the size of the read buffer used to process records. The size of the read buffer must be at least as big as the largest input record the access driver will encounter. The size is specified with an integer indicating the number of bytes. The default value is 512 KB (524288 bytes). You must specify a larger value if any of the records in the data file are larger than 512 KB. There is no limit on how large READSIZE can be, but practically, it is limited by the largest amount of memory that can be allocated by the access driver.

The amount of memory available for allocation is another limit because additional buffers might be allocated. The additional buffer is used to correctly complete the processing of any records that may have been split (either in the data; at the delimiter; or if multi character/byte delimiters are used, in the delimiter itself).

2.20 DISABLE_DIRECTORY_LINK_CHECK

By default, the ORACLE_LOADER access driver checks before opening data and log files to ensure that the directory being used is not a symbolic link.

The DISABLE_DIRECTORY_LINK_CHECK parameter (which takes no arguments) directs the access driver to bypass this check, allowing you to use files for which the parent directory may be a symbolic link.

Note: Use of this parameter involves security risks because symbolic links can potentially be used to redirect the input/output of the external table load operation.

2.21 DATE_CACHE

By default, the date cache feature is enabled (for 1000 elements). To completely disable the date cache feature, set it to 0.

DATE_CACHE specifies the date cache size (in entries). For example, DATE_CACHE=5000 specifies that each date cache created can contain a maximum of 5000 unique date entries. Every table has its own date cache, if one is needed. A date cache is created only if at least one date or timestamp value is loaded that requires data type conversion in order to be stored in the table.

If the default size is used and the number of unique input values loaded exceeds 1000, then the date cache feature is automatically disabled for that table. However, if you override the default and specify a nonzero date cache size and that size is exceeded, then the cache is not disabled.

You can use the date cache statistics (entries, hits, and misses) contained in the log file to tune the size of the cache for future similar loads.

2.22 string

A string is a quoted series of characters or hexadecimal digits. If it is a series of characters, then those characters will be converted into the character set of the data file. If it is a series of hexadecimal digits, then there must be an even number of hexadecimal digits. The hexadecimal digits are converted into their binary translation, and the translation is treated as a character string in the character set of the data file. This means that once the hexadecimal digits have been converted into their binary translation, there is no other character set translation that occurs. The syntax for a string is as follows:

2.23 condition_spec

The condition_spec is an expression that evaluates to either true or false. It specifies one or more conditions that are joined by Boolean operators. The conditions and Boolean operators are evaluated from left to right. (Boolean operators are applied after the conditions are evaluated.) Parentheses can be used to override the default order of evaluation of Boolean operators. The evaluation of condition_spec clauses slows record processing, so these clauses should be used sparingly. The syntax for condition_spec is as follows:


Note that if the condition specification contains any conditions that reference field names, then the condition specifications are evaluated only after all fields have been found in the record and after blank trimming has been done. It is not useful to compare a field to BLANKS if blanks have been trimmed from the field.

The following are some examples of using condition_spec:

empid = BLANKS OR last_name = BLANKS

(dept_id = SPORTING GOODS OR dept_id = SHOES) AND total_sales != 0

2.24 [directory object name:] [filename]

If the directory object name is omitted, then the value specified for the DEFAULT DIRECTORY clause in the CREATE TABLE...ORGANIZATIONEXTERNAL statement is used.

The filename parameter is the name of the file to create in the directory object. The access driver does some symbol substitution to help make file names unique in parallel loads. The symbol substitutions supported for the UNIX and Windows operating systems are as follows (other platforms may have different symbols):

  1. %p is replaced by the process ID of the current process. For example, if the process ID of the access driver is 12345, then exttab_%p.logbecomes exttab_12345.log.
  2. %a is replaced by the agent number of the current process. The agent number is the unique number assigned to each parallel process accessing the external table. This number is padded to the left with zeros to fill three characters. For example, if the third parallel agent is creating a file and bad_data_%a.bad was specified as the file name, then the agent would create a file named bad_data_003.bad.
  3. %% is replaced by %. If there is a need to have a percent sign in the file name, then this symbol substitution is used.

If the % character is encountered followed by anything other than one of the preceding characters, then an error is returned.

If %p or %a is not used to create unique file names for output files and an external table is being accessed in parallel, then output files may be corrupted or agents may be unable to write to the files.

2.25 condition

A condition compares a range of bytes or a field from the record against a constant string. The source of the comparison can be either a field in the record or a byte range in the record. The comparison is done on a byte-by-byte basis. If a string is specified as the target of the comparison, then it will be translated into the character set of the data file. If the field has a noncharacter data type, then no data type conversion is performed on either the field value or the string. The syntax for a condition is as follows:

This clause describes a range of bytes or characters in the record to use for a condition.

The value used for the STRING SIZES ARE clause determines whether range refers to bytes or characters. If the range refers to parts of the record that do not exist, then the record is rejected when an attempt is made to reference the range.

The data file should not mix binary data (including data types with binary counts, such as VARCHAR) and character data that is in a varying-width character set or more than one byte wide. In these cases, the access driver may not find the correct start for the field, because it treats the binary data as character data when trying to find the start.

LOAD WHEN empid != BLANKS

LOAD WHEN (10:13) = 0x'00000830'

LOAD WHEN PRODUCT_COUNT = "MISSING"

2.26 IO_OPTIONS clause

The IO_OPTIONS clause allows you to specify I/O options used by the operating system for reading the data files.

The only options available for specification are DIRECTIO (the default) and NODIRECTIO.

The DIRECTIO option is used by default, so an attempt is made to open the data file and read it using direct I/O. If successful, then the operating system and NFS server (if the file is on an NFS server) do not cache the data read from the file. This can improve the read performance for the data file, especially if the file is large. If direct I/O is not supported for the data file being read, then the file is opened and read but the DIRECTIO option is ignored.

If the IO_OPTIONS clause is specified with the NODIRECTIO option, then direct I/O is not used to read the data files.

2.27 DNFS_DISABLE | DNFS_ENABLE   --用时查

2.28 DNFS_READBUFFERS    --用时查

  1. field_definitions Clause

In the field_definitions clause you use the FIELDS parameter to name the fields in the data file and specify how to find them in records.

If the field_definitions clause is omitted, then the following is assumed:

  1. The fields are delimited by ','
  2. The fields are of data type CHAR
  3. The maximum length of the field is 255
  4. The order of the fields in the data file is the order in which the fields were defined in the external table
  5. No blanks are trimmed from the field

The syntax for the field_definitions clause is as follows:

IGNORE_CHARS_AFTER_EOR

用于排除某字符This optional parameter specifies that if extraneous characters are found after the last end-of-record but before the end of the file that do not satisfy the record definition, they will be ignored.

Error messages are written to the external tables log file if all four of the following conditions apply:

  1. The IGNORE_CHARS_AFTER_EOR parameter is set or the field allows free formatting. (Free formatting means either that the field is variable length or the field is specified by a delimiter or enclosure characters and is also variable length).
  2. Characters remain after the last end-of-record in the file.
  3. The access parameter MISSING FIELD VALUES ARE NULL is not set.
  4. The field does not have absolute positioning.

The error messages that get written to the external tables log file are as follows:

KUP-04021: field formatting error for field Col1

KUP-04023: field start is after end of record

KUP-04101: record 2 rejected in file /home/oracle/datafiles/example.dat

CSV

To direct external tables to access the data files as comma-separated-values format files, use the FIELDS CSV clause. This assumes that the file is a stream record format file with the normal carriage return string (for example, \n on UNIX or Linux operating systems and either \n or \r\n on Windows operating systems). Record terminators can be included (embedded) in data values. The syntax for the FIELDS CSV clause is as follows:

FIELDS CSV [WITH EMBEDDED | WITHOUT EMBEDDED] [TERMINATED BY ','] [OPTIONALLY ENCLOSED BY '"']

The following are key points regarding the FIELDS CSV clause:

  1. The default is to not use the FIELDS CSV clause.
  2. The WITH EMBEDDED and WITHOUT EMBEDDED options specify whether record terminators are included (embedded) in the data. The WITHEMBEDDED option is the default.
  3. If WITH EMBEDDED is used, then embedded record terminators must be enclosed, and intra-datafile parallelism is disabled for external table loads.
  4. The TERMINATED BY ',' and OPTIONALLY ENCLOSED BY '"' options are the defaults and do not have to be specified. You can override them with different termination and enclosure characters.
  5. When the CSV clause is used, a delimiter specification is not allowed at the field level and only delimitable data types are allowed. Delimitable data types include CHAR, datetime, interval, and numeric EXTERNAL.
  6. The TERMINATED BY and ENCLOSED BY clauses cannot be used at the field level when the CSV clause is specified.
  7. When the CSV clause is specified, the default trimming behavior is LDRTRIM. You can override this by specifying one of the other external table trim options (NOTRIM, LRTRIM, LTRIM, or RTRIM).
  8. The CSV clause must be specified after the IGNORE_CHARS_AFTER_EOR clause and before the delim_spec clause.

ALL FIELDS OVERRIDE

The ALL FIELDS OVERRIDE clause tells the access driver that all fields are present and that they are in the same order as the columns in the external table. You only need to specify fields that have a special definition. This clause must be specified after the optional trim_spec clause and before the optional MISSING FIELD VALUES ARE NULL clause.

The following is a sample use of thee ALL FIELDS OVERRIDE clause. The only field that had to be specified was the hiredate, which required a data format mask. All the other fields took default values.

FIELDS TERMINATED BY "," OPTIONALLY ENCLOSED BY '"' LDRTRIM

ALL FIELDS OVERRIDE

REJECT ROWS WITH ALL NULL FIELDS

(

 HIREDATE CHAR(20) DATE_FORMAT DATE MASK "DD-Month-YYYY"

)

REJECT ROWS WITH ALL NULL FIELDS

REJECT ROWS WITH ALL NULL FIELDS indicates that a row will not be loaded into the external table if all referenced fields in the row are null. If this parameter is not specified, then the default value is to accept rows with all null fields. The setting of this parameter is written to the log file either as "reject rows with all null fields" or as "rows with all null fields are accepted."

DATE_FORMAT

The DATE_FORMAT clause allows you to specify a datetime format mask once at the fields level.

The datetime format mask must be specified after the optional REJECT ROWS WITH ALL NULL FIELDS clause and before the fields_list clause.

The DATE_FORMAT can be specified for the following datetime types: DATE, TIME, TIME WITH TIME ZONE, TIMESTAMP, and TIMESTAMP WITH TIMEZONE.

FIELDS TERMINATED BY "," OPTIONALLY ENCLOSED BY '"' LDRTRIM

REJECT ROWS WITH ALL NULL FIELDS

DATE_FORMAT DATE MASK "DD-Month-YYYY"

    (

       EMPNO,

       ENAME,

       JOB,

       MGR,

       HIREDATE CHAR(20)

    )

NULLIF | NO NULLIF

The NULLIF clause applies to all character fields (for example, CHAR, VARCHAR, VARCHARC, external NUMBER, and datetime).

NULLIF {=|!=}{"char_string"|x'hex_string'|BLANKS}

If there is a field to which you do not want the NULLIF clause to apply, you can specify NO NULLIF at the field level (as shown in the following example).

The NULLIF clause must be specified after the optional REJECT ROWS WITH ALL NULL FIELDS clause and before the fields_list clause.

FIELDS TERMINATED BY "," OPTIONALLY ENCLOSED BY '"' LDRTRIM

REJECT ROWS WITH ALL NULL FIELDS

NULLIF = "NONE"

(

  EMPNO,

  ENAME,

  JOB,

  MGR

 )

field_list Clause

The field_list clause identifies the fields in the data file and their data types:

3.1 delim_spec

The delim_spec clause is used to find the end (and if ENCLOSED BY is specified, the start) of a field. Its syntax is as follows:


跟sqlldr相关语句没区别

Example: External Table with Terminating Delimiters

CREATE TABLE emp_load (first_name CHAR(15), last_name CHAR(20), year_of_birth CHAR(4))

  ORGANIZATION EXTERNAL (TYPE ORACLE_LOADER DEFAULT DIRECTORY ext_tab_dir

                         ACCESS PARAMETERS (FIELDS TERMINATED BY WHITESPACE)

                         LOCATION ('info.dat'));

Alvin Tolliver 1976

Kenneth Baer 1963

Mary Dube 1973

Example: External Table with Enclosure and Terminator Delimiters

CREATE TABLE emp_load (first_name CHAR(15), last_name CHAR(20), year_of_birth CHAR(4))

  ORGANIZATION EXTERNAL (TYPE ORACLE_LOADER DEFAULT DIRECTORY ext_tab_dir

                        ACCESS PARAMETERS

(FIELDS TERMINATED BY "," ENCLOSED BY "("  AND ")")

                        LOCATION ('info.dat'));

(Alvin) ,   (Tolliver),(1976)

(Kenneth),  (Baer) ,(1963)

(Mary),(Dube) ,   (1973)

Example: External Table with Optional Enclosure Delimiters

CREATE TABLE emp_load (first_name CHAR(15), last_name CHAR(20), year_of_birth CHAR(4))

  ORGANIZATION EXTERNAL (TYPE ORACLE_LOADER DEFAULT DIRECTORY ext_tab_dir

                         ACCESS PARAMETERS (FIELDS TERMINATED BY ','

                                            OPTIONALLY ENCLOSED BY '(' and ')'

                                            LRTRIM)

                         LOCATION ('info.dat'));

Alvin ,   Tolliver , 1976

(Kenneth),  (Baer), (1963)

( Mary ), Dube ,    (1973)

3.2 trim_spec

The trim_spec clause is used to specify that spaces should be trimmed from the beginning of a text field, the end of a text field, or both. Spaces include blanks and other nonprinting characters such as tabs, line feeds, and carriage returns. The syntax for the trim_spec clause is as follows:

NOTRIM indicates that no characters will be trimmed from the field.

LRTRIM, LTRIM, and RTRIM are used to indicate that characters should be trimmed from the field. LRTRIM means that both leading and trailing spaces are trimmed. LTRIM means that leading spaces will be trimmed. RTRIM means trailing spaces are trimmed.

LDRTRIM is used to provide compatibility with SQL*Loader trim features. It is the same as NOTRIM except in the following cases:

If the field is not a delimited field, then spaces will be trimmed from the right.

If the field is a delimited field with OPTIONALLY ENCLOSED BY specified, and the optional enclosures are missing for a particular instance, then spaces will be trimmed from the left.

The default is LDRTRIM. Specifying NOTRIM yields the fastest performance.

The trim_spec clause can be specified before the field list to set the default trimming for all fields.  If trimming is specified for a field that is all spaces, then the field will be set to NULL.

In the following example, all data is fixed-length; however, the character data will not be loaded with leading spaces:

CopyCREATE TABLE emp_load

(first_name CHAR(15), last_name CHAR(20),year_of_birth CHAR(4))

  ORGANIZATION EXTERNAL (TYPE ORACLE_LOADER DEFAULT DIRECTORY ext_tab_dir

                         ACCESS PARAMETERS (FIELDS LTRIM)

                         LOCATION ('info.dat'));

Alvin,           Tolliver,1976

Kenneth,         Baer,    1963

Mary,            Dube,    1973

3.3 MISSING FIELD VALUES ARE NULL

The effect of MISSING FIELD VALUES ARE NULL depends on whether POSITION is used to explicitly state field positions:

  1. The default behavior is that if field position is not explicitly stated and there is not enough data in a record for all fields, then the record is rejected. You can override this behavior by using MISSING FIELD VALUES ARE NULL to define as NULL any fields for which there is no data available.
  2. If field position is explicitly stated, then fields for which there are no values are always defined as NULL, regardless of whether MISSING FIELDVALUES ARE NULL is used.

In the following example, the second record is stored with a NULL set for the year_of_birth column, even though the data for the year of birth is missing from the data file. If the MISSING FIELD VALUES ARE NULL clause were omitted from the access parameters, then the second row would be rejected because it did not have a value for the year_of_birth column. The example is followed by a sample of the data file that can be used to load it.

CREATE TABLE emp_load (first_name CHAR(15), last_name CHAR(20), year_of_birth INT)

  ORGANIZATION EXTERNAL (TYPE ORACLE_LOADER DEFAULT DIRECTORY ext_tab_dir

                         ACCESS PARAMETERS (FIELDS TERMINATED BY ","

                                            MISSING FIELD VALUES ARE NULL)

                         LOCATION ('info.dat'));

 

Alvin,Tolliver,1976

Baer,Kenneth

Mary,Dube,1973

3.4 field_list

The field_list clause identifies the fields in the data file and their data types. Evaluation criteria for the field_list clause are as follows:

  1. If no data type is specified for a field, then it is assumed to be CHAR(1) for a nondelimited field, and CHAR(255)for a delimited field.
  2. If no field list is specified, then the fields in the data file are assumed to be in the same order as the fields in the external table. The data type for all fields is CHAR(255) unless the column in the database is CHAR or VARCHAR. If the column in the database is CHAR or VARCHAR, then the data type for the field is still CHAR but the length is either 255 or the length of the column, whichever is greater.
  3. If no field list is specified and no delim_spec clause is specified, then the fields in the data file are assumed to be in the same order as fields in the external table. All fields are assumed to be CHAR(255) and terminated by a comma.

This example shows the definition for an external table with no field_list and a delim_spec. It is followed by a sample of the data file that can be used to load it.

CREATE TABLE emp_load (first_name CHAR(15), last_name CHAR(20), year_of_birth INT)

  ORGANIZATION EXTERNAL (TYPE ORACLE_LOADER DEFAULT DIRECTORY ext_tab_dir

                         ACCESS PARAMETERS (FIELDS TERMINATED BY "|")

                         LOCATION ('info.dat'));

Alvin|Tolliver|1976

Kenneth|Baer|1963

Mary|Dube|1973

The syntax for the field_list clause is as follows:


The init_spec clause indicates when a field is NULL or has a default value.

When LLS is specified for a field, ORACLE_LOADER does not load the value of the field into the corresponding column. Instead, it use the information in the value to determine where to find the value of the field.

3.5 pos_spec Clause

The pos_spec clause indicates the position of the column within the record.

CREATE TABLE emp_load (first_name CHAR(15),

                      last_name CHAR(20),

                      year_of_birth INT,

                      phone CHAR(12),

                      area_code CHAR(3),

                      exchange CHAR(3),

                      extension CHAR(4))

  ORGANIZATION EXTERNAL

  (TYPE ORACLE_LOADER

   DEFAULT DIRECTORY ext_tab_dir

   ACCESS PARAMETERS

     (FIELDS RTRIM

            (first_name (1:15) CHAR(15),

             last_name (*:+20),

             year_of_birth (36:39),

             phone (40:52),

             area_code (*-12: +3),

             exchange (*+1: +3),

             extension (*+1: +4)))

   LOCATION ('info.dat'));

Alvin          Tolliver            1976415-922-1982

Kenneth        Baer                1963212-341-7912

Mary           Dube                1973309-672-2341

3.6 datatype_spec Clause

The data type of the field can be different than the data type of a corresponding column in the external table. The access driver handles the necessary conversions. The syntax for the datatype_spec clause is as follows:

If the number of bytes or characters in any field is 0, then the field is assumed to be NULL.

3.6.1 [UNSIGNED] INTEGER [EXTERNAL] [(len)]

This clause defines a field as an integer. If EXTERNAL is specified, then the number is a character string. If EXTERNAL is not specified, then the number is a binary field. The valid values for len in binary integer fields are 1, 2, 4, and 8. If len is omitted for binary integers, then the default value is whatever the value of sizeof(int) is on the platform where the access driver is running. Use of the DATA IS {BIG | LITTLE} ENDIAN clause may cause the data to be byte-swapped before it is stored.

If EXTERNAL is specified, then the value of len is the number of bytes or characters in the number (depending on the setting of the STRING SIZESARE IN BYTES or CHARACTERS clause). If no length is specified, then the default value is 255.

The default value of the [UNSIGNED] INTEGER [EXTERNAL] [(len)] data type is determined as follows:

  1. If no length specified, then the default length is 1.
  2. If no length is specified and the field is delimited with a DELIMITED BY NEWLINE clause, then the default length is 1.
  3. If no length is specified and the field is delimited with a DELIMITED BY clause, then the default length is 255 (unless the delimiter is NEWLINE, as stated above).

3.6.2 DECIMAL [EXTERNAL] and ZONED [EXTERNAL]

The DECIMAL clause is used to indicate that the field is a packed decimal number. The ZONED clause is used to indicate that the field is a zoned decimal number. The precision field indicates the number of digits in the number. The scale field is used to specify the location of the decimal point in the number. It is the number of digits to the right of the decimal point. If scale is omitted, then a value of 0 is assumed.

Note that there are different encoding formats of zoned decimal numbers depending on whether the character set being used is EBCDIC-based or ASCII-based. If the language of the source data is EBCDIC, then the zoned decimal numbers in that file must match the EBCDIC encoding. If the language is ASCII-based, then the numbers must match the ASCII encoding.

If the EXTERNAL parameter is specified, then the data field is a character string whose length matches the precision of the field.

3.6.3 ORACLE_DATE

ORACLE_DATE is a field containing a date in the Oracle binary date format. This is the format used by the DTYDAT data type in Oracle Call Interface (OCI) programs. The field is a fixed length of 7.

3.6.4 ORACLE_NUMBER

ORACLE_NUMBER is a field containing a number in the Oracle number format. The field is a fixed length (the maximum size of an Oracle number field) unless COUNTED is specified, in which case the first byte of the field contains the number of bytes in the rest of the field.

ORACLE_NUMBER is a fixed-length 22-byte field. The length of an ORACLE_NUMBER COUNTED field is one for the count byte, plus the number of bytes specified in the count byte.

3.6.5 Floating-Point Numbers

The following four data types, DOUBLE, FLOAT, BINARY_DOUBLE, and BINARY_FLOAT are floating-point numbers.

DOUBLE and FLOAT are the floating-point formats used natively on the platform in use. They are the same data types used by default for the DOUBLEand FLOAT data types in a C program on that platform. BINARY_FLOAT and BINARY_DOUBLE are floating-point numbers that conform substantially with the Institute for Electrical and Electronics Engineers (IEEE) Standard for Binary Floating-Point Arithmetic, IEEE Standard 754-1985. Because most platforms use the IEEE standard as their native floating-point format, FLOAT and BINARY_FLOAT are the same on those platforms and DOUBLE and BINARY_DOUBLE are also the same.

3.6.6 DOUBLE

The DOUBLE clause indicates that the field is the same format as the C language DOUBLE data type on the platform where the access driver is executing. Use of the DATA IS {BIG | LITTLE} ENDIAN clause may cause the data to be byte-swapped before it is stored. This data type may not be portable between certain platforms.

3.6.7 FLOAT [EXTERNAL]

The FLOAT clause indicates that the field is the same format as the C language FLOAT data type on the platform where the access driver is executing. Use of the DATA IS {BIG | LITTLE} ENDIAN clause may cause the data to be byte-swapped before it is stored. This data type may not be portable between certain platforms.

If the EXTERNAL parameter is specified, then the field is a character string whose maximum length is 255.

3.6.8 BINARY_DOUBLE

BINARY_DOUBLE is a 64-bit, double-precision, floating-point number data type. Each BINARY_DOUBLE value requires 9 bytes, including a length byte. See the information in the note provided for the FLOAT data type for more details about floating-point numbers.

15.3.6.9 BINARY_FLOAT

BINARY_FLOAT is a 32-bit, single-precision, floating-point number data type. Each BINARY_FLOAT value requires 5 bytes, including a length byte. See the information in the note provided for the FLOAT data type for more details about floating-point numbers.

15.3.6.10 RAW

The RAW clause is used to indicate that the source data is binary data. The len for RAW fields is always in number of bytes. When a RAW field is loaded in a character column, the data that is written into the column is the hexadecimal representation of the bytes in the RAW field.

15.3.6.11 CHAR

The CHAR clause is used to indicate that a field is a character data type. The length (len) for CHAR fields specifies the largest number of bytes or characters in the field. The len is in bytes or characters, depending on the setting of the STRING SIZES ARE IN clause.

If no length is specified for a field of data type CHAR, then the size of the field is assumed to be 1, unless the field is delimited:

For a delimited CHAR field, if a length is specified, then that length is used as a maximum.

For a delimited CHAR field for which no length is specified, the default is 255 bytes.

For a delimited CHAR field that is greater than 255 bytes, you must specify a maximum length. Otherwise you will receive an error stating that the field in the data file exceeds maximum length.

The following example shows the use of the CHAR clause.

SQL> CREATE TABLE emp_load

      (employee_number      CHAR(5),  employee_dob         CHAR(20),

       employee_last_name   CHAR(20),  employee_first_name  CHAR(15),

       employee_middle_name CHAR(15),  employee_hire_date   DATE)

    ORGANIZATION EXTERNAL

      (TYPE ORACLE_LOADER

      DEFAULT DIRECTORY def_dir1

      ACCESS PARAMETERS

        (RECORDS DELIMITED BY NEWLINE

         FIELDS (employee_number      CHAR(2),

                 employee_dob        CHAR(20),

                 employee_last_name   CHAR(18),

                 employee_first_name  CHAR(11),

                 employee_middle_name CHAR(11),

                 employee_hire_date   CHAR(10) date_format DATE mask "mm/dd/yyyy"

                )

        )

      LOCATION ('info.dat')

     );

3.6.12 date_format_spec

The date_format_spec clause is used to indicate that a character string field contains date data, time data, or both, in a specific format. This information is used only when a character field is converted to a date or time data type and only when a character string field is mapped into a date column.


If a date mask is not specified, then the settings of NLS parameters for the database (not the session settings) for the appropriate globalization parameter for the data type are used. The NLS_DATABASE_PARAMETERS view shows these settings.

NLS_DATE_FORMAT for DATE data types

NLS_TIMESTAMP_FORMAT for TIMESTAMP data types

NLS_TIMESTAMP_TZ_FORMAT for TIMESTAMP WITH TIME ZONE data types

The database setting for the NLS_NUMERIC_CHARACTERS initialization parameter (that is, from the NLS_DATABASE_PARAMETERS view) governs the decimal separator for implicit conversion from character to numeric data types.

A group separator is not allowed in the default format.

CREATE TABLE emp_load

      (employee_number      CHAR(5),  employee_dob         CHAR(20),

       employee_last_name   CHAR(20),  employee_first_name  CHAR(15),

       employee_middle_name CHAR(15),  employee_hire_date   DATE,

       rec_creation_date    TIMESTAMP WITH TIME ZONE)

    ORGANIZATION EXTERNAL

     (TYPE ORACLE_LOADER

      DEFAULT DIRECTORY def_dir1

      ACCESS PARAMETERS

        (RECORDS DELIMITED BY NEWLINE

         FIELDS (employee_number      CHAR(2),

                 employee_dob         CHAR(20),

                 employee_last_name   CHAR(18),

                 employee_first_name  CHAR(11),

                 employee_middle_name CHAR(11),

                 employee_hire_date CHAR(22) date_format DATE mask "mm/dd/yyyy hh:mi:ss AM",

                 rec_creation_date    CHAR(35) date_format TIMESTAMP WITH TIME ZONE mask "DD-MON-RR HH.MI.SSXFF AM TZH:TZM"

               )

        )

      LOCATION ('infoc.dat')

     );

SQL> SELECT * FROM emp_load;

EMPLO EMPLOYEE_DOB  EMPLOYEE_LAST_NAME   EMPLOYEE_FIRST_ EMPLOYEE_MIDDLE EMPLOYEE_  REC_CREATION_DATE

----- -------------------- -------------------- --------------- ------------------------ ---------------------------------------------------------------------------

56    november, 15, 1980   baker               mary           alice   01-SEP-04   01-DEC-04 11.22.03.034567 AM -08:00

87    december, 20, 1970   roper               lisa            marie   01-JAN-02   01-DEC-02 02.03.00.678573 AM -08:00

2 rows selected.

The info.dat file looks like the following:

56november, 15, 1980  baker             mary       alice      09/01/2004 08:23:01 AM01-DEC-04 11.22.03.034567 AM -08:00

87december, 20, 1970  roper             lisa       marie      01/01/2002 02:44:55 PM01-DEC-02 02.03.00.678573 AM -08:00

3.6.13 VARCHAR and VARRAW

The VARCHAR data type has a binary count field followed by character data. The value in the binary count field is either the number of bytes in the field or the number of characters. See "STRING SIZES ARE IN" for information about how to specify whether the count is interpreted as a count of characters or count of bytes.

The VARRAW data type has a binary count field followed by binary data. The value in the binary count field is the number of bytes of binary data. The data in the VARRAW field is not affected by the DATA IS…ENDIAN clause.

The VARIABLE 2 clause in the ACCESS PARAMETERS clause specifies the size of the binary field that contains the length.

The optional length_of_length field in the specification is the number of bytes in the count field. Valid values for length_of_length for VARCHAR are 1, 2, 4, and 8. If length_of_length is not specified, then a value of 2 is used. The count field has the same endianness as specified by the DATA IS…ENDIAN clause.

The max_len field is used to indicate the largest size of any instance of the field in the data file. For VARRAW fields, max_len is number of bytes. For VARCHAR fields, max_len is either number of characters or number of bytes depending on the STRING SIZES ARE IN clause.

The following example shows various uses of VARCHAR and VARRAW. The content of the data file, info.dat, is shown following the example.

CREATE TABLE emp_load

             (first_name CHAR(15),  last_name CHAR(20),

              resume CHAR(2000),  picture RAW(2000))

  ORGANIZATION EXTERNAL

  (TYPE ORACLE_LOADER

   DEFAULT DIRECTORY ext_tab_dir

   ACCESS PARAMETERS

     (RECORDS

        VARIABLE 2

        DATA IS BIG ENDIAN

        CHARACTERSET US7ASCII

        FIELDS (first_name VARCHAR(2,12),

              last_name VARCHAR(2,20),

              resume VARCHAR(4,10000),

              picture VARRAW(4,100000)))

    LOCATION ('info.dat'));

Contents of info.dat Data File

The contents of the data file used in the example are as follows:.

0005Alvin0008Tolliver0000001DAlvin Tolliver's Resume etc. 0000001013f4690a30bc29d7e40023ab4599ffff

It is important to understand that, for the purposes of readable documentation, the binary values for the count bytes and the values for the raw data are shown in the data file in italics, with 2 characters per binary byte. The values in an actual data file would be in binary format, not ASCII. Therefore, if you attempt to use this example by cutting and pasting, then you will receive an error.

3.6.14 VARCHARC and VARRAWC

The VARCHARC data type has a character count field followed by character data. The value in the count field is either the number of bytes in the field or the number of characters. See "STRING SIZES ARE IN" for information about how to specify whether the count is interpreted as a count of characters or count of bytes. The optional length_of_length is either the number of bytes or the number of characters in the count field for VARCHARC, depending on whether lengths are being interpreted as characters or bytes.

The maximum value for length_of_lengths for VARCHARC is 10 if string sizes are in characters, and 20 if string sizes are in bytes. The default value for length_of_length is 5.

The VARRAWC data type has a character count field followed by binary data. The value in the count field is the number of bytes of binary data. The length_of_length is the number of bytes in the count field.

The max_len field is used to indicate the largest size of any instance of the field in the data file. For VARRAWC fields, max_len is number of bytes. For VARCHARC fields, max_len is either number of characters or number of bytes depending on the STRING SIZES ARE IN clause.

The following example shows various uses of VARCHARC and VARRAWC. The length of the picture field is 0, which means the field is set to NULL.

CREATE TABLE emp_load

             (first_name CHAR(15),

              last_name CHAR(20),

              resume CHAR(2000),

              picture RAW (2000))

  ORGANIZATION EXTERNAL

  (TYPE ORACLE_LOADER

    DEFAULT DIRECTORY ext_tab_dir

    ACCESS PARAMETERS

      (FIELDS (first_name VARCHARC(5,12),

               last_name VARCHARC(2,20),

               resume VARCHARC(4,10000),

               picture VARRAWC(4,100000)))

  LOCATION ('info.dat'));

00007William05Ricca0035Resume for William Ricca is missing0000

3.7 init_spec Clause

The init_spec clause is used to specify when a field should be set to NULL or when it should be set to a default value:


Only one NULLIF clause and only one DEFAULTIF clause can be specified for any field. These clauses behave as follows:

  1. If NULLIF condition_spec is specified and it evaluates to TRUE, then the field is set to NULL.
  2. If DEFAULTIF condition_spec is specified and it evaluates to TRUE, then the value of the field is set to a default value. The default value depends on the data type of the field, as follows:

For a character data type, the default value is an empty string.

For a numeric data type, the default value is a 0.

For a date data type, the default value is NULL.

  1. If a NULLIF clause and a DEFAULTIF clause are both specified for a field, then the NULLIF clause is evaluated first and the DEFAULTIF clause is evaluated only if the NULLIF clause evaluates to FALSE.

3.8 LLS Clause --用时查

If a field in a data file is a LOB location Specifier (LLS) field, then you can indicate this by using the LLS clause. An LLS field contains the file name, offset, and length of the LOB data in the data file. SQL*Loader uses this information to read data for the LOB column.

  1. column_transforms Clause


Note: The COLUMN TRANSFORMS clause does not work in conjunction with the PREPROCESSOR clause.

4.1 transform

4.1.1 column_name FROM

用于指定列名,与表列要一致

4.1.2 NULL

When the NULL transform is specified, every value of the field is set to NULL for every record.

4.1.3 CONSTANT

The CONSTANT transform uses the value of the string specified as the value of the column in the record.

4.1.4 CONCAT

The CONCAT transform concatenates constant strings and fields in the data file together to form one string. Only fields that are character data types and that are listed in the fields clause can be used as part of the concatenation. Other column transforms cannot be specified as part of the concatenation.

4.1.5 LOBFILE

The LOBFILE transform is used to identify a file whose contents are to be used as the value for a column in the external table. All LOBFILEs are identified by an optional directory object and a file name in the form directory object:filename. The following rules apply to use of the LOBFILE transform:

  1. Both the directory object and the file name can be either a constant string or the name of a field in the field clause.
  2. If a constant string is specified, then that string is used to find the LOBFILE for every row in the table.
  3. If a field name is specified, then the value of that field in the data file is used to find the LOBFILE.
  4. If a field name is specified for either the directory object or the file name and if the value of that field is NULL, then the column being loaded by the LOBFILE is also set to NULL.
  5. If the directory object is not specified, then the default directory specified for the external table is used.
  6. If a field name is specified for the directory object, then the FROM clause also needs to be specified.

Note that the entire file is used as the value of the LOB column. If the same file is referenced in multiple rows, then that file is reopened and reread in order to populate each column.

4.1.6 lobfile_attr_list


The FROM clause lists the names of all directory objects that will be used for LOBFILEs. It is used only when a field name is specified for the directory object of the name of the LOBFILE. The purpose of the FROM clause is to determine the type of access allowed to the named directory objects during initialization. If directory object in the value of field is not a directory object in this list, then the row will be rejected.

The CLOB attribute indicates that the data in the LOBFILE is character data (as opposed to RAW data). Character data may need to be translated into the character set used to store the LOB in the database.

The CHARACTERSET attribute contains the name of the character set for the data in the LOBFILEs.

The BLOB attribute indicates that the data in the LOBFILE is raw data.

If neither CLOB nor BLOB is specified, then CLOB is assumed. If no character set is specified for character LOBFILEs, then the character set of the data file is assumed.

15.4.1.7 STARTOF source_field (length)

The STARTOF keyword allows you to create an external table in which a column can be a substring of the data in the source field.

The length is the length of the substring, beginning with the first byte. It is assumed that length refers to a byte count and that the external table column(s) being transformed use byte length and not character length semantics. (Character length semantics might give unexpected results.)

Only complete character encodings are moved; characters are never split. So if a substring ends in the middle of a multibyte character, then the resulting string will be shortened. For example, if a length of 10 is specified, but the 10th byte is the first byte of a multibyte character, then only the first 9 bytes are returned.

The following example shows how you could use the STARTOF keyword if you only wanted the first 4 bytes of the department name (dname) field:

CREATE TABLE dept (deptno  NUMBER(2),

                      dname   VARCHAR2(14),

                      loc     VARCHAR2(13)

                           )

    ORGANIZATION EXTERNAL

    (

      DEFAULT DIRECTORY def_dir1

      ACCESS PARAMETERS

      (

       RECORDS DELIMITED BY NEWLINE

       FIELDS TERMINATED BY ','

       (

         deptno           CHAR(2),

         dname_source     CHAR(14),

         loc              CHAR(13)

       )

       column transforms

       (

          dname FROM STARTOF dname_source (4)

       )

     )

     LOCATION ('dept.dat')

   );

If you now perform a SELECT operation from the dept table, only the first four bytes of the dname field are returned:

SQL> SELECT * FROM dept;

    DEPTNO DNAME          LOC

---------- -------------- -------------

        10 ACCO           NEW YORK

        20 RESE            DALLAS

        30 SALE            CHICAGO

        40 OPER           BOSTON

  1. Parallel Loading Considerations for the ORACLE_LOADER Access Driver

The ORACLE_LOADER access driver attempts to divide large data files into chunks that can be processed separately. The following file, record, and data characteristics make it impossible for a file to be processed in parallel:

  1. Sequential data sources (such as a tape drive or pipe)
  2. Data in any multibyte character set whose character boundaries cannot be determined starting at an arbitrary byte in the middle of a string

This restriction does not apply to any data file with a fixed number of bytes per record.

  1. Records with the VAR format

Specifying a PARALLEL clause is of value only when large amounts of data are involved.

  1. Performance Hints When Using the ORACLE_LOADER Access Driver

When you monitor performance, the most important measurement is the elapsed time for a load. Other important measurements are CPU usage, memory usage, and I/O rates.

You can alter performance by increasing or decreasing the degree of parallelism. The degree of parallelism indicates the number of access drivers that can be started to process the data files. The degree of parallelism enables you to choose on a scale between slower load with little resource usage and faster load with all resources utilized. The access driver cannot automatically tune itself, because it cannot determine how many resources you want to dedicate to the access driver.

An additional consideration is that the access drivers use large I/O buffers for better performance (you can use the READSIZE clause in the access parameters to specify the size of the buffers). On databases with shared servers, all memory used by the access drivers comes out of the system global area (SGA). For this reason, you should be careful when using external tables on shared servers.

Performance can also sometimes be increased with use of date cache functionality. By using the date cache to specify the number of unique dates anticipated during the load, you can reduce the number of date conversions done when many duplicate date or timestamp values are present in the input data. The date cache functionality provided by external tables is identical to the date cache functionality provided by SQL*Loader.

In addition to changing the degree of parallelism and using the date cache to improve performance, consider the following information:

  1. Fixed-length records are processed faster than records terminated by a string.
  2. Fixed-length fields are processed faster than delimited fields.
  3. Single-byte character sets are the fastest to process.
  4. Fixed-width character sets are faster to process than varying-width character sets.
  5. Byte-length semantics for varying-width character sets are faster to process than character-length semantics.
  6. Single-character delimiters for record terminators and field delimiters are faster to process than multicharacter delimiters.
  7. Having the character set in the data file match the character set of the database is faster than a character set conversion.
  8. Having data types in the data file match the data types in the database is faster than data type conversion.
  9. Not writing rejected rows to a reject file is faster because of the reduced overhead.
  10. Condition clauses (including WHENNULLIF, and DEFAULTIF) slow down processing
  11. The access driver takes advantage of multithreading to streamline the work as much as possible.

  1. Restrictions When Using the ORACLE_LOADER Access Driver

This section lists restrictions to be aware of then you use the ORACLE_LOADER access driver.

  1. Exporting and importing of external tables with encrypted columns is not supported.
  2. Column processing: By default, the external tables feature fetches all columns defined for an external table. This guarantees a consistent result set for all queries. However, for performance reasons you can decide to process only the referenced columns of an external table, thus minimizing the amount of data conversion and data handling required to execute a query. In this case, a row that is rejected because a column in the row causes a data type conversion error will not get rejected in a different query if the query does not reference that column. You can change this column-processing behavior with the ALTER TABLE command.
  3. An external table cannot load data into a LONG column.
  4. SQL strings cannot be specified in access parameters for the ORACLE_LOADER access driver. As a workaround, you can use the DECODE clause in the SELECT clause of the statement that is reading the external table. Alternatively, you can create a view of the external table that uses the DECODE clause and select from that view rather than the external table.
  5. The use of the backslash character (\) within strings is not supported in external tables. See "Use of the Backslash Escape Character".
  6. When identifiers (for example, column or table names) are specified in the external table access parameters, certain values are considered to be reserved words by the access parameter parser. If a reserved word is used as an identifier, then it must be enclosed in double quotation marks.
  1. Reserved Words for the ORACLE_LOADER Access Driver

When identifiers (for example, column or table names) are specified in the external table access parameters, certain values are considered to be reserved words by the access parameter parser. If a reserved word is used as an identifier, then it must be enclosed in double quotation marks. The following are the reserved words for the ORACLE_LOADER access driver:

ALL,AND,ARE,ASTERISK,AT, ATSIGN, BADFILE, BADFILENAME, BACKSLASH, BENDIAN, BIG, BLANKS, BY, BYTES, BYTESTR, CHAR, CHARACTERS, CHARACTERSET, CHARSET, CHARSTR, CHECK, CLOB, COLLENGTH, COLON, COLUMN, COMMA, CONCAT, CONSTANT, COUNTED, DATA, DATE, DATE_CACHE, DATE_FORMAT, DATEMASK, DAY, DEBUG, DECIMAL, DEFAULTIF, DELIMITBY, DELIMITED, DISCARDFILE, DNFS_ENABLE, DNFS_DISABLE, DNFS_READBUFFERS, DOT, DOUBLE, DOUBLETYPE, DQSTRING, DQUOTE, DSCFILENAME, ENCLOSED, ENDIAN, ENDPOS, EOF, EQUAL, EXIT, EXTENDED_IO_PARAMETERS, EXTERNAL, EXTERNALKW, EXTPARM, FIELD, FIELDS, FILE, FILEDIR, FILENAME, FIXED, FLOAT, FLOATTYPE, FOR, FROM, HASH, HEXPREFIX, IN, INTEGER, INTERVAL, LANGUAGE, IS, LEFTCB, LEFTTXTDELIM, LEFTP, LENDIAN, LDRTRIM, LITTLE, LOAD, LOBFILE, LOBPC, LOBPCCONST, LOCAL, LOCALTZONE, LOGFILE, LOGFILENAME, LRTRIM, LTRIM, MAKE_REF, MASK, MINUSSIGN, MISSING, MISSINGFLD, MONTH, NEWLINE, NO, NOCHECK, NOT, NOBADFILE, NODISCARDFILE, NOLOGFILE, NOTEQUAL, NOTERMBY, NOTRIM, NULL, NULLIF, OID, OPTENCLOSE, OPTIONALLY, OPTIONS, OR, ORACLE_DATE, ORACLE_NUMBER, PLUSSIGN, POSITION, PROCESSING, QUOTE, RAW, READSIZE, RECNUM,RECORDS, REJECT, RIGHTCB, RIGHTTXTDELIM,RIGHTP, ROW, ROWS, RTRIM, SCALE, SECOND, SEMI, SETID, SIGN, SIZES, SKIP, STRING, TERMBY, TERMEOF, TERMINATED, TERMWS, TERRITORY, TIME, TIMESTAMP, TIMEZONE, TO, TRANSFORMS, UNDERSCORE, UINTEGER, UNSIGNED, VALUES, VARCHAR, VARCHARC, VARIABLE, VARRAW, VARRAWC, VLENELN, VMAXLEN, WHEN, WHITESPACE, WITH, YEAR, ZONED

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值