Effectively Moving SAS Data into Teradata

1. Environment Requirements

-Access to a Teradata environment on which to execute queries.
- A userid and password for the database in question.
- Privileges in the Teradata environment that will allow you to load data into the tables.
- SAS 9.2 (or 9.1.3) software installed (including the SAS/ACCESS Interface to Teradata).
- Your SAS environment properly configured so that it will attach to Teradata.
- Access to the Teradata Tools and Utilities (TTU).
- Access to the Teradata Analysis Pak.

2.Specify the Primary Index Using SAS

DBCREATE_TABLE_OPTS=’DBMS-SQL-clauses’
This option enables you to add DBMS-specific clauses to the end of SAS generated CREATE TABLE statements.

Example:

LIBNAME mytera TERADATA USER=tduser PASSWORD=tdpasswd SERVER=TDServ ;
data work.test;
a='aaaaa'; b=1; output; 
a='aaaaa'; b=2; output; 
a='aaaaa'; b=3; output; 
run;
data mytera.test (DBCREATE_TABLE_OPTS='PRIMARY INDEX (B)'); 
set work.test; 
run;
In this example a Teradata table will be created based on the SAS data set work.test. If we don‟t use the DBCREATE_TABLE_OPTS= option then “column a” will be chosen as the primary index. That will pose a problem because of the distribution.

3.SAS insert data into Teradata

EX:

PROC SQL;
CONNECT TO TERADATA(USER=myuserid PASSWORD=mypassword SERVER=terasrv);
EXECUTE (INSERT INTO TESTTAB VALUES (1,‟Christine‟,‟Lewis‟) BY TERADATA;
EXECUTE (INSERT INTO TESTTAB VALUES (2,‟Jack‟,‟Johnson‟) BY TERADATA;
EXECUTE (INSERT INTO TESTTAB VALUES (3,‟Joe‟,‟Jackson‟) BY TERADATA;
EXECUTE (INSERT INTO TESTTAB VALUES (6,‟Larry‟,‟Nomar‟) BY TERADATA;
EXECUTE (INSERT INTO TESTTAB VALUES (5,‟Robert‟,‟Bob‟) BY TERADATA;
QUIT;

LIBNAME mytera TERADATA USER=myuserid PASSWORD=mypassword SERVER=terasrv;
LIBNAME sasdata „C:\sasdata‟;
PROC SQL;
CREATE TABLE mytera.transaction_file AS
SELECT * FROM sasdata.transaction_file;
QUIT;

4.Batch INSERT Statements to Increase Performance

MULTISTMT=NO|YES
NO – sends inserts one row at a time. This is the default.
YES – sends as many inserts as will fit into a 64K buffer. If multi-statement inserts are not possible then single statement inserts will be used.
By default, this commits database changes after the last INSERT statement

5.Commit Points to Avoid Locking Problems

DBCOMMIT=n
n – specifies an integer greater than or equal to 0
If you set this to 0 it commits at the end of the job. This can cause problems with rowhash locking.
You will need to experiment with this value to determine the best value for your situation. More on this below.

6.INSERT Statement Performance Considerations

-There is a point of diminishing returns in setting DBCOMMIT= (bigger is not necessarily better).
- MULTISTMT=YES increases performance

7.Moving Data Within a Teradata Server

Fast Path:

DBIDIRECTEXEC= option
EX:
OPTIONS SASTRACE=‟,,,d‟ SASTRACELOC=SASLOG NOSTSUFFIX;
LIBNAME mytera TERADATA USER=myuserid PASSWORD=mypassword
SERVER=terasrv DATABASE=PRODDB;
OPTIONS DBIDIRECTEXEC;
PROC SQL;
CREATE TABLE mytera.new_transaction_file AS
SELECT * from mytera.transaction_file;
QUIT;
SASTRACE= option shows us the SQL statement that created the table and inserted data into the new table was executed by Teradata.

8.Inserting Rows into an Existing Teradata Table

EX:

OPTIONS SASTRACE=‟,,,d‟ SASTRACELOC=SASLOG NOSTSUFFIX;
PROC SQL;
CONNECT TO TERADATA (USER=username PASSWORD=password SERVER=myserver);
EXECUTE (INSERT INTO TESTDB.NEW_TABLE
SELECT * FROM TESTDB.OLD_TABLE) BY TERADATA;
QUIT;

9.Loading SAS Data Into Teradata (FastLoad)

-The target table must be empty.
-The target table must have no secondary indexes, join indexes, or hash indexes defined on it.
-The target table must have no triggers defined on it.
-The target table must have no standard referential integrity or batch referential integrity defined on it..
-Duplicate rows cannot be loaded.


 

9.1 BULKLOAD=

BULKLOAD=YES|NO
YES – Teradata FastLoad protocol is used when appending or inserting data into the Teradata table.
NO – The FastLoad protocol is not used when appending or inserting data into the Teradata table.
FASTLOAD= is an alias for BULKLOAD=

EX:

LIBNAME mytera1 TERADATA USER=myuserid PASSWORD=mypassword SERVER=terasrv DATABASE=TESTDB BULKLOAD=YES;
PROC APPEND BASE=MYTERA1.CUSTOMERS
DATA=WORK.NEW_CUSTOMERS;
RUN;
9.2 BL_LOG=

BL_LOG= a meaningful table name
-To specify this option you must specify BULKLOAD=YES.
-The Teradata bulk-load facility will create error tables based on this name. Use meaningful names.
-For example, if you specify BL_LOG=CUSTOMER_ERR then errors are logged in CUSTOMER_ERR1 and CUSTOMER_ERR2.
-You can place the utility tables in a separate database by prepending the database name and a period to the value specified in the BL_LOG= option. Let‟s assume that we need to put the utility tables in the TEMPUTIL Teradata database and we want the table names to all start with INITIAL_LD_. Here is what our BL_LOG= statement would look like. BL_LOG=TEMPUTIL.INITIAL_LD

9.3FBUFSIZE=

BUFSIZE= Integer from 1 to 66750
The default value appears to be 64000. I discovered the maximum value through experimentation. We can find no documentation which clearly states the default value.
Setting FBUFSIZE= to larger values appears to be better. Finding an acceptable value for FBUFSIZE= may take experimentation. I was able to get the code to run when this option is set to 1, but it ran very slowly. The best results appear when you have this option set towards the higher end of the range.

9.4 DBCOMMIT=

DBCOMMIT=n

DBCOMMIT= will cause a Teradata checkpoint every n rows. If you use BULKLOAD=YES but do not specify a value for DBCOMMIT= then no checkpoints are used. This is the same as setting DBCOMMIT=0. The commit occurs at the end of the job.
Using a value of 0 means COMMIT after the table is loaded. This works well on smaller loads.

9.5 Loading Teradata Tables Using SAS  Example

PROC APPEND BASE=mytera2.cm_financial_account_new (FASTLOAD=YES FBUFSIZE=66750 BL_LOG=TEST_LOAD) DATA=work.cm_financial_account; 
RUN;

9.6 Loading Problems (FastLoad)

-FastLoad Into a Non-Empty Table

-FastLoad a Table With a Join Index Defined on it

-FastLoad a Table With a Hash Index Defined on it

-FastLoad a Table Having RI Defined on it

-FastLoad a Table With FBUFSIZE= Set Too High

10Loading SAS Data into Teradata (MultiLoad)

-The target table must have no unique secondary, join, or hash indexes defined on it.
-The target table must have no triggers defined on it.
-The target table must have no standard referential integrity or batch referential integrity defined on it (Soft Referential Integrity is allowed).
-The MultiLoad input file must have data to qualify all columns defined in the Primary Index of the target table.


There are differences between FastLoad and MultiLoad.



More details:

https://support.sas.com/resources/papers/EffectivelyMovingSASDataintoTeradata.pdf

Some ref:

https://support.sas.com/resources/papers/proceedings11/103-2011.pdf




  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值