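1. Preparation
One simple job and one simple transformation (trans).
trans: reads the names of all files under the download directory and writes them to a file. [Tested successfully in the GUI]
The target file was generated successfully:
job: creates a file. [Executed successfully in GUI mode]
Execution result:
Delete the result files produced by the GUI test runs so that they do not interfere with the observations that follow.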
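2. Running jobs and transformations from the command line on Linux
Pan is the PDI command-line tool for executing transformations.
Kitchen is the PDI command-line tool for executing jobs.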
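Since the two tools differ only in what they run, a small helper can pick the right launcher from the file extension. This is a minimal sketch, assuming transformations are saved as .ktr and jobs as .kjb; the function name pdi_tool_for is made up for illustration, and nothing is actually executed.

```shell
#!/bin/sh
# Pick the PDI launcher from the file extension:
# .ktr (transformation) -> pan.sh, .kjb (job) -> kitchen.sh.
# pdi_tool_for is a hypothetical helper name, not part of PDI.
pdi_tool_for() {
    case "$1" in
        *.ktr) echo "pan.sh" ;;
        *.kjb) echo "kitchen.sh" ;;
        *)     echo "unknown file type: $1" >&2; return 1 ;;
    esac
}

# Only prints the launcher name; no PDI tool is started here.
pdi_tool_for /home/hadoop/workplace/kettle/trans/test_cml.ktr   # prints pan.sh
pdi_tool_for /home/kettle/transition/move.kjb                   # prints kitchen.sh
```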
a. Pan command-line options and syntax
Syntax:
pan.sh -option=value arg1 arg2
Command-line options:
| Switch | Purpose |
| --- | --- |
| rep | Enterprise or database repository name, if you are using one |
| user | Repository username |
| pass | Repository password |
| trans | The name of the transformation (as it appears in the repository) to launch |
| dir | The repository directory that contains the transformation, including the leading slash |
| file | If you are calling a local KTR file, this is the filename, including the path if it is not in the local directory |
| level | The logging level (Basic, Detailed, Debug, Rowlevel, Error, Nothing) |
| logfile | A local filename to write log output to |
| listdir | Lists the directories in the specified repository |
| listtrans | Lists the transformations in the specified repository directory |
| listrep | Lists the available repositories |
| exprep | Exports all repository objects to one XML file |
| norep | Prevents Pan from logging into a repository. If you have set the KETTLE_REPOSITORY, KETTLE_USER, and KETTLE_PASSWORD environment variables, then this option will enable you to prevent Pan from logging into the specified repository, assuming you would like to execute a local KTR file instead. |
| safemode | Runs in safe mode, which enables extra checking |
| version | Shows the version, revision, and build date |
| param | Set a named parameter in a name=value format. For example: -param:FOO=bar |
| listparam | List information about the defined named parameters in the specified transformation. |
| maxloglines | The maximum number of log lines that are kept internally by PDI. Set to 0 to keep all rows (default) |
| maxlogtimeout | The maximum age (in minutes) of a log line while being kept internally by PDI. Set to 0 to keep all rows indefinitely (default) |
Example:
sh pan.sh -rep=initech_pdi_repo -user=pgibbons -pass=lumburghsux -trans=TPS_reports_2011
Example of running a local transformation:
./pan.sh -file=/home/hadoop/workplace/kettle/trans/test_cml.ktr -norep
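The -norep switch in the local example matters mainly when the KETTLE_REPOSITORY, KETTLE_USER, and KETTLE_PASSWORD environment variables are set, because Pan would otherwise try to log into that repository. The sketch below adds the switch only in that case; norep_flag is a made-up helper name, and the command is printed rather than executed. Passing -norep unconditionally, as the example above does, is the simpler choice.

```shell
#!/bin/sh
# Emit -norep only when a repository is preconfigured through the environment.
# norep_flag is a hypothetical helper name, not part of PDI.
norep_flag() {
    if [ -n "${KETTLE_REPOSITORY:-}" ]; then
        echo "-norep"
    fi
}

# Dry run: print the Pan command for the local transformation instead of running it.
# $(norep_flag) is left unquoted on purpose so that an empty result adds no argument.
echo ./pan.sh -file=/home/hadoop/workplace/kettle/trans/test_cml.ktr $(norep_flag)
```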
b. Kitchen command-line options and syntax
The syntax is the same as Pan's; the options differ slightly.
| Switch | Purpose |
| --- | --- |
| rep | Enterprise or database repository name, if you are using one |
| user | Repository username |
| pass | Repository password |
| job | The name of the job (as it appears in the repository) to launch |
| dir | The repository directory that contains the job, including the leading slash |
| file | If you are calling a local KJB file, this is the filename, including the path if it is not in the local directory |
| level | The logging level (Basic, Detailed, Debug, Rowlevel, Error, Nothing) |
| logfile | A local filename to write log output to |
| listdir | Lists the sub-directories within the specified repository directory |
| listjob | Lists the jobs in the specified repository directory |
| listrep | Lists the available repositories |
| export | Exports all linked resources of the specified job. The argument is the name of a ZIP file. |
| norep | Prevents Kitchen from logging into a repository. If you have set the KETTLE_REPOSITORY, KETTLE_USER, and KETTLE_PASSWORD environment variables, then this option will enable you to prevent Kitchen from logging into the specified repository, assuming you would like to execute a local KJB file instead. |
| version | Shows the version, revision, and build date |
| param | Set a named parameter in a name=value format. For example: -param:FOO=bar |
| listparam | List information about the defined named parameters in the specified job. |
| maxloglines | The maximum number of log lines that are kept internally by PDI. Set to 0 to keep all rows (default) |
| maxlogtimeout | The maximum age (in minutes) of a log line while being kept internally by PDI. Set to 0 to keep all rows indefinitely (default) |
Command line for running a local job:
/home/kettle/data-integration/kitchen.sh -file=/home/kettle/transition/move.kjb -log=log.log
General form:
$KITCHEN_PATH -file=$JOB_PATH -log=$LOG_PATH
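When the job is launched from a script, it helps to check whether it actually succeeded. The following is a rough sketch rather than a definitive wrapper: run_job is a made-up function name, it uses the -logfile switch from the table above, and it assumes kitchen.sh follows the usual shell convention of exiting with 0 on success and a non-zero status on failure.

```shell
#!/bin/sh
# Launcher path taken from the example above; adjust to your installation.
KITCHEN="${KITCHEN:-/home/kettle/data-integration/kitchen.sh}"

# Run a local job and report success or failure based on the exit status.
# Assumption: exit status 0 means success, anything else means failure.
# run_job is a hypothetical helper name, not part of PDI.
run_job() {
    job_file=$1
    log_file=$2
    if "$KITCHEN" -file="$job_file" -norep -logfile="$log_file"; then
        echo "job succeeded: $job_file"
    else
        status=$?
        echo "job failed with exit status $status: $job_file" >&2
        return "$status"
    fi
}

# Only attempt the run when the launcher is actually present on this machine.
if [ -x "$KITCHEN" ]; then
    run_job /home/kettle/transition/move.kjb log.log
fi
```

The same pattern should carry over to pan.sh by swapping the launcher path and passing a .ktr file.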
Result of running Pan:
Result of running Kitchen:
3. Options I use most often
Because my current work only involves running local job and transformation files, the options I use most often are:
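| Option | Description |
| --- | --- |
| -file | Path to the job or transformation file |
| -norep | Indicates that the file is not in a repository |
| -param | Sets a named parameter |
| -logfile | Name of the log output file |
| -level | Log level (Basic, Detailed, Debug, Rowlevel, Error, Nothing) |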
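Put together, these options form a command line like the one built below. This is only a sketch: pdi_cmd is a made-up helper, TARGET_DIR is a made-up named parameter that would have to be defined in the job itself, and the command is printed rather than executed. Values containing spaces would need extra quoting before the printed line could be run.

```shell
#!/bin/sh
# Build a command line from the options listed above.
# Usage: pdi_cmd TOOL FILE LOGFILE LEVEL [NAME=value ...]
# pdi_cmd is a hypothetical helper name, not part of PDI.
pdi_cmd() {
    tool=$1
    file=$2
    logfile=$3
    level=$4
    shift 4
    cmd="$tool -file=$file -norep -logfile=$logfile -level=$level"
    # Each remaining argument becomes one -param:NAME=value switch.
    for p in "$@"; do
        cmd="$cmd -param:$p"
    done
    echo "$cmd"
}

# Dry run with the job from section 2; TARGET_DIR is an illustrative parameter name.
pdi_cmd /home/kettle/data-integration/kitchen.sh \
    /home/kettle/transition/move.kjb log.log Basic TARGET_DIR=/tmp/out
```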