How to Create, Use and Maintain DataStage 8 Parameter Sets

Reposted 2008-09-19 23:30:00

This is a three-part DataStage tutorial on the new version 8 Parameter Set functionality that shows how it works and adds some practical advice on how to use it.

DataStage 8.0.1 came with a great new function called Parameter Sets that lets you group your DataStage and QualityStage job parameters and store default values in files.

There are some very good things about Parameter Sets:

  • You can group a set of related parameters together and add them in one click to any Sequence or Server or Parallel DataStage job and maintain them from a single parameter definition.
  • Parameter Sets make Sequence jobs and Shared Containers easier and faster to build.  When you add a Server or Parallel job to a Sequence Job you have far fewer properties to set if both jobs are synchronising parameters via Parameter Sets.  When you add a Shared Container to a job you only need to link in the Parameter Set name and not each individual parameter.
  • You take the storage of parameter values completely out of the hands of individual sequence, server or parallel jobs and put them into centralised files or objects.
  • Parameter Help Text is given a new life.  This field that is either blank or filled with rubbish becomes more important with centralised Parameter Sets as you can turn your Parameter Set into a type of technical glossary of terms that not only lists a set of parameters but offers a good definition of each.

I like the second benefit.  An average project could have 10-20 job parameters per job once you have file, database, processing, date and source system job parameters.  When you add a job onto a Sequence job canvas you've got to pass through every bloody parameter value via manual clicks - it can take 20 clicks to add and configure a job.  That is time consuming when you just want to throw together a Sequence job for some testing.  With the arrival of Parameter Sets you set one value per Parameter Set rather than one value per parameter - far fewer clicks.

There are a few drawbacks to Parameter Sets that we will expand on later:

  • There still isn't a good interface for support staff to maintain parameter values.  Parameter Set values can be modified directly in a text file but not if the value is encrypted.  The only GUI or browser interfaces for maintaining parameter values are the Designer and Administrator tools - neither of which is easy to use for a part time support person.
  • There is no obvious place to save Parameter Sets in the repository.  You have to choose or create a home for them and this could lead to a bit of disorganisation.  Parameter Set files can only be saved under a DataStage Project directory when it would be better to be able to browse and choose a location that is easier for support staff to find.
  • The ParameterSet object doesn't allow comments or description or any type of help text.

Creating Parameter Sets

A Parameter Set is a way to group a set of parameters into a single re-usable definition that you can add to your jobs.  Parameter sets let you save the values for a set in a file or in an object in the DataStage repository.

To create a Parameter Set use the "New" menu or toolbar option and find Parameter Sets under the "Other" or "Recently Used" folders.  Choose a short Parameter Set name because you will need to use it throughout your job as a parameter prefix: #parameter_set.parameter_name#.  Your parameter set name should be short and sweet and your parameter names can be longer and descriptive.

In the Add Parameter Set form, type in the parameters that belong in a single set just like normal job parameters.  Note that you have two fields for operators and/or developers: one for the Prompt you see when you run the job and one for the Help Text shown when you click "Property Help" from the Job Run screen.


I suggest these Prompt and Help Text fields be used as technical instructions on how to use the parameter.

On the "Values" tab you can specify one or more files to save values to and by default it copies across the values from the previous tab.  This is an optional tab, you can put values into a file or keep the values in the Parameter Set repository object:

[Screenshot: DataStage Add Parameter Set]

You can add more than one file.  The columns you see are the parameter values copied across from the previous tab.  When you specify multiple files you are creating multiple scenarios to be selected at run time.  A job can be run with a different set of file values depending on the parameter file name passed into the job or selected from a drop down list by the operator.

You could use this feature if you are in a dev or test environment and you had multiple source databases to choose from, say a small database for a quick test or a full database for a performance test.  One file could be called DB_SMALL_DEV and the other DB_LARGE_DEV with the DB connection values in each.

I'm not sure what use it could have in production where you want to be quite certain about what parameters a job should use and don't want an operator trying to choose the right file.

When you save your Parameter Set you choose a location for it in your DataStage repository.  So I don't lose them I create a "ParameterSets" folder under the Job folder to save them close to the DataStage jobs.  In DataStage 8.0.1 you can save Parameter Sets anywhere: in Jobs or Table Definitions or even Stage Types.  There are lots of stupid places to save them so create a folder for them that makes sense and put them all there.

Under the covers DataStage saves the optional Parameter Set values in the location $PROJECTHOME/ParameterSets/ParameterSetName/ValueFileName.  The Values file is a plain text file where encrypted values are converted to printable mashed text.


You can see that this file would be easy to modify directly or manage from a simple GUI application - except for the encrypted password value.
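To make the idea of a simple maintenance tool concrete, here is a minimal sketch of reading one of these value files.  It rests on an assumption the text above does not confirm: that the file holds one Name=Value pair per line.  The set name PS_DB, the file name DB_SMALL_DEV and the parameter names are made up for the demo, and an encrypted value would simply come back as the opaque text stored in the file.

```python
import os
import tempfile


def values_file_path(project_home, set_name, file_name):
    """Build the documented location: PROJECTHOME/ParameterSets/<set>/<file>."""
    return os.path.join(project_home, "ParameterSets", set_name, file_name)


def read_values(path):
    """Parse a values file, ASSUMING one Name=Value pair per line."""
    values = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.rstrip("\n")
            if not line.strip() or "=" not in line:
                continue  # skip blank or unrecognised lines
            name, _, value = line.partition("=")
            values[name.strip()] = value  # value kept as-is, even if encrypted
    return values


# Demo against a throwaway directory standing in for the project home.
with tempfile.TemporaryDirectory() as project_home:
    path = values_file_path(project_home, "PS_DB", "DB_SMALL_DEV")
    os.makedirs(os.path.dirname(path))
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("SERVER=devhost\nNAME=sales_small\n")
    demo_values = read_values(path)

print(demo_values)
```

Check the layout of a real value file on your own server before relying on a parser like this.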

When you run the job that uses a Parameter Set you get three choices for choosing job parameters values:

1. Parameter Set with File Values - you run the job with parameter values from the Parameter Set file by choosing the Parameter Set file name from the "Value" field next to the Parameter Set name.  At this point if you have multiple files to choose from (eg. multiple source databases or source instances) you can choose the right file.

2. Parameter Set with pre-defined values - you can go with the values stored on the Parameter Set object in the repository by choosing "pre-defined" from the Value field next to the Parameter Set name.

3. User override - the person running the job can override the value of any parameter by typing in a value next to the individual parameter names.  This works with either file or pre-defined usage shown above.
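The three choices above can be pictured as a small precedence rule.  The sketch below is only an illustration of that rule, not how DataStage implements it; the function name, the dictionaries and the sample values are all hypothetical.

```python
def effective_values(predefined, value_files, chosen_file=None, overrides=None):
    """Pick the values a run would use.

    chosen_file=None models choosing "pre-defined" (values on the object);
    otherwise the named value file supplies the values.  Anything typed in
    by the person running the job wins over either source.
    """
    if chosen_file is None:
        resolved = dict(predefined)
    else:
        resolved = dict(value_files[chosen_file])
    resolved.update(overrides or {})
    return resolved


# Hypothetical set with two value files, as in the dev/test scenario.
predefined = {"SERVER": "devhost", "NAME": "sales"}
value_files = {
    "DB_SMALL_DEV": {"SERVER": "devhost", "NAME": "sales_small"},
    "DB_LARGE_DEV": {"SERVER": "perfhost", "NAME": "sales_full"},
}

print(effective_values(predefined, value_files))
print(effective_values(predefined, value_files, "DB_LARGE_DEV"))
print(effective_values(predefined, value_files, "DB_SMALL_DEV", {"NAME": "scratch"}))
```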

One of the biggest improvements brought by Parameter Sets is the simplification of Sequence Jobs.  In DataStage 6 you had to add a Server/Parallel job and then set every single parameter value, which could be quite time consuming if you had 20 or more parameters per job.  You could then copy and paste the stage and switch job names and keep those parameter values.  In DataStage 7.5.x they took away this copy and paste feature and each time you changed the job name on a job stage the parameter values got wiped out.  So for every job you added to a Sequence job you had to painstakingly set all the job parameter values - there were no shortcuts, no auto mapping or auto setting.

Parameter Sets make Sequence Jobs easy again by only requiring you to set a default behaviour for the Parameter Set to either "User-Defined" (take it from the Parameter Set object) or File (take it from the Parameter Set file):

[Screenshot: DataStage Sequence Job, Job Activity Stage]

If you have parameters that you want to dynamically generate - such as processing dates, last key used, process id - you would set this up as a normal job parameter in the parallel job so it can be retrieved and set in the Sequence job and passed in as a parameter override. 

Using Parameters from Parameter Sets

When you use a Parameter Set parameter in a job you refer to it using the Parameter Set name as a prefix: #parameter_set.parameter_name#.  When a stage property window has an "Insert Parameter" button with a popup list of parameters you will see the list with both the parameter set name and parameter name in alphabetic order so it's easy to scroll to the right parameter set.
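As an illustration of the #parameter_set.parameter_name# notation, here is a small sketch that expands such references in a piece of text.  It only mimics the naming convention; DataStage does the real substitution itself, and the set name PS_DB, the parameter names and the values are invented for the example.

```python
import re

# Matches #SetName.ParamName# as well as a plain #ParamName# reference.
REFERENCE = re.compile(r"#([A-Za-z_$][\w$]*(?:\.[A-Za-z_$][\w$]*)?)#")


def expand(text, values):
    """Replace each #name# reference with its value; unknown names stay as they are."""
    def substitute(match):
        name = match.group(1)
        return str(values.get(name, match.group(0)))
    return REFERENCE.sub(substitute, text)


values = {
    "PS_DB.SERVER": "devhost",      # parameter from a (hypothetical) set PS_DB
    "PS_DB.NAME": "sales",
    "ProcessDate": "2008-09-19",    # ordinary job parameter, no set prefix
}

print(expand("#PS_DB.SERVER#/#PS_DB.NAME# as of #ProcessDate#", values))
```

The prefix is what keeps two sets with a parameter of the same name, say two database sets that each have a SERVER, from colliding inside one job.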

When you call a job from the command line you can specify whether to set Parameter Set values from the object or from the file.  To override values you need to refer to the full parameter name, with no spaces around the equals sign and with the project and job names at the end: dsjob -run -param ParameterSetName=ParameterFileName -param ParameterSetName.ParameterName=OverrideValue ProjectName JobName
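If you launch jobs from a script, the command line above can be assembled rather than typed by hand.  This is a minimal sketch that only builds the argument list and does not run anything; the project, job, set and parameter names are hypothetical.  It adds the set-level -param only when a value file is named, because the text above does not show what to pass to select the values held on the object.

```python
import shlex


def dsjob_run_args(project, job, set_name, values_file=None, overrides=None):
    """Build a dsjob -run argument list for one Parameter Set."""
    args = ["dsjob", "-run"]
    if values_file is not None:
        # Select a value file for the whole set: SetName=FileName
        args += ["-param", f"{set_name}={values_file}"]
    for name, value in (overrides or {}).items():
        # Override one parameter using its full name: SetName.ParamName=Value
        args += ["-param", f"{set_name}.{name}={value}"]
    args += [project, job]
    return args


args = dsjob_run_args(
    "dev_project", "load_customers", "PS_DB",
    values_file="DB_SMALL_DEV",
    overrides={"SERVER": "devhost2"},
)
print(shlex.join(args))
```

Passing the list to subprocess.run avoids shell quoting problems.  Keep passwords out of the overrides, since a password supplied this way can end up in the job log in plain text.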

If you try to override an encrypted parameter value from outside of a DataStage product you will lose the encryption - the override will work but the value will show up in the DataStage log in plain text. 

Maintaining Parameter Sets

The major drawback of Parameter Sets is that a DataStage Support person who might have only a passing knowledge of the tools has no easy way to change encrypted values in the Parameter Set file.  The primary tool of use for a DataStage Support Dude is the Director - and unfortunately IBM have forgotten to put any Parameter Set maintenance tools into the Director.  You can set Job Parameter Defaults in Director but not if those Parameters are in a Parameter Set.  Parameter Sets are kind of invisible to the Director tool. 

This leaves just two ways to change a Parameter Set default - modify the Parameter Set object or modify the underlying file.

If you modify the Parameter Set file directly you open it up to manual mistakes and you cannot change encrypted values.  Putting passwords into a file without any encryption is a big security no-no.  Not only will they be exposed in this file but they may turn up in DataStage Director log messages.  So even if you did keep this file secure, which is difficult to do since DataStage needs read access to it, you can still expose the passwords in log messages that almost anyone can get access to.

If you modify the Parameter Set values from the DataStage Designer you can use DataStage encryption to protect passwords and use the grid for data entry.  It's a technically safe way to do it but it means giving the Designer tool to your production support team which is overkill.  They need an operations tool not a complex designer tool.

What we need in DataStage 8.1 is a way to change Parameter default values and encrypted values via the Director tool and/or via the Information Server console.  So a member of a support team who doesn't really know much about DataStage can follow a set of instructions to log into a DataStage support tool to change a database password on a regular interval.

Parameter Set Ideas

I covered Job Parameter Ideas in a previous post about some of the uses of normal job parameters.  Now it's time to beef this up and organise parameters into parameter sets.

Here are some job parameters that should not be put in parameter sets because you want to set them from a Sequence job or hard code them to a default value in a parallel job rather than pull them from a central parameter set file:

  • Process Id
  • Process Date / Business Date / Transaction Date
  • Last Surrogate Key Used
  • Source System ID
  • File Name

These are the Parameter Set groupings.  Every parameter set has a prefix of PS_ to make it easier to find in repository searches.  What you can do with Parameter Sets is overload a job with parameters, adding parameters the job doesn't necessarily need in order to cut back the number of Parameter Sets that need to be found.  The overloading doesn't hurt as you don't need to set every single parameter value in the Sequence job any more:

  • PS_DB_????_CONNECT
    • DB_????_SERVER
    • DB_????_NAME
    • DB_????_USERID
    • DB_????_PASSWORD
    • When you have a database parameter set you have some type of database identifier as a prefix (eg 4 characters) that you use in every parameter set name and every parameter in that set.  That makes the parameter names unique and easier to use, especially when a job connects to several databases.
    • Put all your directories into one Parameter Set to make them easier to find.
    • All constants used in a slowly changing dimension transformer stage.
    • The values that get put into blank fields in a customer record, wherever it gets created.
    • DB_????_LOOKUP_STAFF
    • Let's say you have lookups that are used in a lot of different jobs but the tables have names that are hard to remember.  A Parameter Set can hold the name of every lookup table with a default value and hint text that tells you what the table is called and how to use it.  A Parameter Set then becomes a type of technical glossary.
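The database naming convention in the list above, where one identifier appears in the set name and in every parameter name, can be generated rather than typed.  This is a minimal sketch of that convention only; the identifier CRM1 is a made-up example.

```python
DB_PARAMETERS = ("SERVER", "NAME", "USERID", "PASSWORD")


def db_connect_set(db_id):
    """Return the set name and parameter names for one database identifier."""
    db_id = db_id.upper()
    set_name = f"PS_DB_{db_id}_CONNECT"
    parameters = [f"DB_{db_id}_{suffix}" for suffix in DB_PARAMETERS]
    return set_name, parameters


def job_references(db_id):
    """Return the #set.parameter# strings a job would use for that database."""
    set_name, parameters = db_connect_set(db_id)
    return [f"#{set_name}.{parameter}#" for parameter in parameters]


set_name, parameters = db_connect_set("crm1")
print(set_name)
print(parameters)
print(job_references("crm1")[0])
```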

In my next Parameter Set Series post I'll look at Environment Variables in Parameter Sets with a lot of new ideas for grouping parameters.

