How to write SQL Server 2008 Data Mining Plug-in Algorithms

In Microsoft SQL Server Analysis Services (both 2005 and 2008), you can write plug-in data mining algorithms. Unfortunately, there are very few resources, articles, or code samples that teach you how to do it. Not many people seem to use the feature, and there is very little discussion of it on the Web. "Creating Plug-in Algorithms for SQL Server 2005 Data Mining" is probably the only tutorial I could find.

 

This article is simply my notes on "SQL Server Data Mining Plug-In Viewers Tutorial.doc", which is included in the sample download.

Download sample code

Download the sample for "Creating Plug-in Algorithms for SQL Server 2005 Data Mining" from http://sqlserverdatamining.com/ssdm/Default.aspx?tabid=94&Id=163

Extract the files. "SQL Server Data Mining Plug-In Viewers Tutorial.doc" gives a detailed, step-by-step description of how to build the demo project from an empty ATL project.

You can also start directly from the CompletedDemo project, which implements a pair-wise linear regression model.

 

I tested the 2005 sample code, and it works with SQL Server 2008 as well.

Use the sample

If you use the CompletedDemo project, skip section 6, "Building a Shell Plug-In Algorithm". You still need to configure the model in Analysis Services, though.

1. Start Analysis Services: Control Panel -> Administrative Tools -> Services -> MSSQLServerOLAPService -> Start

2. Edit the file C:\Program Files\Microsoft SQL Server\MSAS10.MSSQLSERVER\OLAP\Config\msmdsrv.ini and add a <MyCompany_Pairwise_Linear_Regression> section under <Algorithms>.

 

edit msmdsrv.ini

<ConfigurationSettings>
    ...
    <DataMining>
        ...
        <Algorithms>
            ...
            <MyCompany_Pairwise_Linear_Regression>
                <ProgID>PlugIn.FACTORY.1</ProgID>
                <Enabled>1</Enabled>
            </MyCompany_Pairwise_Linear_Regression>

3. Restart MSSQLServerOLAPService.

Note: every time you recompile the project, you need to stop Analysis Services first, then recompile, then start it again.


Now you can test the plug-in algorithm.

1. Open Business Intelligence Development Studio and create an Analysis Services project.

2. Create a new data source that connects to Sample.mdb, which is included in the sample you downloaded.

3. Create a new data source view.

4. Create a mining structure. If the demo model appears in the model list, everything is working.

You can then use the model like any of the built-in SQL Server models. For details of the subsequent steps, see "SQL Server Data Mining Plug-In Viewers Tutorial.doc".

 

Customize the sample 

Step 1:  Customize IDMAlgorithmMetadata

ALGORITHM.IDMAlgorithmMetadata.cpp contains about 30 functions, including:

- functions for model metadata

- functions for model parameters

 

The notes below focus on Static Model Parameter Handling:

1. Declaration: Static model parameters are declared in the ALGORITHM.cpp file, between BEGIN_PARAMETER_DECLARATION(ALGORITHM) and END_PARAMETER_DECLARATION(ALGORITHM). Here you declare the type, Res ID, and default value of each parameter.

 

Declare static parameters

BEGIN_PARAMETER_DECLARATION(ALGORITHM)
    DECLARE_PARAMETER(L"MINIMUM_DEPENDENCY_SCORE",        // Name
                      IDS_MINIMUM_DEPENDENCY_SCORE_DESCR, // Res ID
                      DBTYPE_R4,                          // Type, as a DBTYPEENUM
                      false,                              // Required flag
                      true,                               // Exposed flag
                      0,                                  // General flags
                      L"3",                               // Default value, as a string
                      L"(-inf,inf)")                      // Enumeration, as a string
    DECLARE_PARAMETER(L"DISPLAY_CORRELATION",             // Name
                      IDS_DISPLAY_CORRELATION_DESCR,      // Res ID
                      DBTYPE_BOOL,                        // Type, as a DBTYPEENUM
                      false,                              // Required flag
                      true,                               // Exposed flag
                      0,                                  // General flags
                      L"FALSE",                           // Default value, as a string
                      L"TRUE or FALSE")                   // Enumeration, as a string
END_PARAMETER_DECLARATION(ALGORITHM)

You may need to define the Res ID in Resource.h, for example: #define IDS_MINIMUM_DEPENDENCY_SCORE_DESCR 107.

2. Definition: The static parameter functions are already implemented in ALGORITHM.IDMAlgorithmMetadata.cpp, so you do not need to write them yourself:

  1. GetNumParameters
  2. GetParameterName
  3. GetParameterType
  4. GetParameterIsRequired
  5. GetParameterIsExposed
  6. GetParameterFlags
  7. GetParameterDescription
  8. GetParameterDefaultValue
  9. GetParameterValueEnumeration
  10. ParseParameterValue

The only exception is ParseParameterValue. If you have parameters with non-numeric types, you need to modify ParseParameterValue to perform the appropriate parsing, even if you are using the static parameter handling.
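As a rough illustration of the parsing a non-numeric parameter such as DISPLAY_CORRELATION might need, the sketch below converts a parameter string into a bool. ParseBoolParameter is a hypothetical helper, not part of the sample or of the Analysis Services interfaces. The real ParseParameterValue has its own signature and output type, which these notes do not show, so treat this only as core logic you might call from it.

```cpp
#include <cstddef>
#include <cwctype>
#include <string>

// Hypothetical helper (not part of the sample): parse a boolean parameter
// value such as L"TRUE" or L" false ". Returns true on success and stores
// the parsed value in `out`; returns false if the text is not a boolean.
bool ParseBoolParameter(const std::wstring& text, bool& out) {
    // Trim leading and trailing whitespace.
    std::size_t begin = 0;
    std::size_t end = text.size();
    while (begin < end && std::iswspace(text[begin])) ++begin;
    while (end > begin && std::iswspace(text[end - 1])) --end;

    // Upper-case the trimmed text so the comparison is case-insensitive.
    std::wstring upper;
    for (std::size_t i = begin; i < end; ++i) {
        upper.push_back(static_cast<wchar_t>(std::towupper(text[i])));
    }

    if (upper == L"TRUE")  { out = true;  return true; }
    if (upper == L"FALSE") { out = false; return true; }
    return false;  // Anything else is rejected; `out` is left untouched.
}
```

Rejecting unknown text instead of silently defaulting to FALSE makes a mistyped parameter value show up as an error rather than as a quietly different model.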

 

3. Usage: You load the static parameters in the main model training function, ALGORITHM::InsertCases, in ALGORITHM.IDMAlgorithm.cpp. You can read the static model parameters by calling _dmhparamhandler.GetParameterValue.
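To make the idea of "declared defaults plus user-supplied values" concrete, here is a minimal, self-contained sketch. ParamTable and its methods are hypothetical stand-ins. The signature of the sample's _dmhparamhandler.GetParameterValue is not shown in these notes, so this only mirrors the general behavior one would expect from such a handler.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a parameter handler: it keeps the declared
// defaults and any values the user supplied when creating the model.
class ParamTable {
public:
    // Register a parameter with its default value (as a string, in the
    // same spirit as the declaration block).
    void Declare(const std::wstring& name, const std::wstring& default_value) {
        defaults_[name] = default_value;
    }

    // Record a user-supplied value; undeclared names are rejected.
    void Set(const std::wstring& name, const std::wstring& value) {
        if (defaults_.count(name) == 0) {
            throw std::invalid_argument("unknown parameter");
        }
        overrides_[name] = value;
    }

    // Return the user-supplied value if present, otherwise the default.
    std::wstring Get(const std::wstring& name) const {
        auto it = overrides_.find(name);
        if (it != overrides_.end()) return it->second;
        return defaults_.at(name);  // throws std::out_of_range if undeclared
    }

    // Convenience accessor for numeric parameters such as
    // MINIMUM_DEPENDENCY_SCORE.
    double GetDouble(const std::wstring& name) const {
        return std::stod(Get(name));
    }

private:
    std::map<std::wstring, std::wstring> defaults_;
    std::map<std::wstring, std::wstring> overrides_;
};
```

In a training function you would read each parameter once at the start, for example GetDouble(L"MINIMUM_DEPENDENCY_SCORE"), and pass the resulting values to the rest of the algorithm.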

 

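4. Res ID conversion: In LoadStringFromID in DmhLocalization.cpp, you need to define the string that corresponds to each Res ID. Otherwise, when Analysis Services calls GetDisplayName or another Get function to look up a Res ID, it only gets "Undefined Localized String".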
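The mapping described here can be sketched as a simple switch. This is an assumption-laden illustration: LookupLocalizedString is a hypothetical function, the real LoadStringFromID signature is not shown in these notes, the ID value 108 is made up, and the description strings are placeholder text. The sample defines its IDs with #define in Resource.h; constexpr is used here only to keep the sketch self-contained.

```cpp
#include <string>

// Resource IDs. 107 matches the Resource.h example earlier in this article;
// 108 is an assumed value chosen only for this sketch.
constexpr int IDS_MINIMUM_DEPENDENCY_SCORE_DESCR = 107;
constexpr int IDS_DISPLAY_CORRELATION_DESCR = 108;

// Hypothetical lookup in the spirit of LoadStringFromID: map each Res ID to
// its display string, and fall back to a placeholder for unknown IDs.
std::wstring LookupLocalizedString(int res_id) {
    switch (res_id) {
        case IDS_MINIMUM_DEPENDENCY_SCORE_DESCR:
            // Placeholder description text, not taken from the sample.
            return L"Minimum score required to keep a dependency.";
        case IDS_DISPLAY_CORRELATION_DESCR:
            // Placeholder description text, not taken from the sample.
            return L"Whether to display the correlation.";
        default:
            // This is the symptom of a Res ID that was never mapped.
            return L"Undefined Localized String";
    }
}
```

If a parameter description shows up as "Undefined Localized String" in the tools, a missing case like the ones above is the first thing to check.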

 

Step 2: Customize IDMAlgorithmFactory

Factory.cpp mainly defines the CreateAlgorithm function. You generally do not need to change the original definition.

 

Step 3: Customize IDMAlgorithm

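Initialize: Use the definition in the sample; no changes are needed.

InsertCases: This function mainly does three things: (1) gets the model parameters, (2) loads the statistics calculated in LRSSTATREADER, and (3) allocates _lrpmodel. If you define more parameters, you may need to change (1).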

Predict: The code takes the input case, turns it into a dense representation (where the ith element of a vector corresponds to the value for attribute i), and then calls the _lrpmodel.ExtractPosterior() function.
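The "dense representation" step can be sketched as follows. ToDense and SparseEntry are hypothetical names, and using NaN to mark a missing attribute is an assumption made for this sketch; the sample's own case representation and the arguments of _lrpmodel.ExtractPosterior() are not shown in these notes.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// One known value in a sparse case: (attribute index, value).
using SparseEntry = std::pair<std::size_t, double>;

// Hypothetical helper: expand a sparse case into a dense vector of length
// `num_attributes`, where element i holds the value of attribute i.
// Attributes absent from the case are marked as missing with NaN.
std::vector<double> ToDense(const std::vector<SparseEntry>& sparse_case,
                            std::size_t num_attributes) {
    std::vector<double> dense(num_attributes, std::nan(""));
    for (const SparseEntry& entry : sparse_case) {
        assert(entry.first < num_attributes);  // index must be in range
        dense[entry.first] = entry.second;
    }
    return dense;
}
```

With a dense vector, the prediction code can index attribute i directly instead of searching the input case for it.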

I will not list the other functions; I do not fully understand them myself.

 

Note: LRSSTATREADER (lrsstatreader.cpp and lrsstatreader.h) implements a collection of statistics that are used in the model.
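To show what kind of statistics a pair-wise linear regression needs, here is a minimal sketch that accumulates running sums for one (input, target) attribute pair and fits a least-squares line from them. PairStats, Line, and FitLine are hypothetical names; the sample's LRSSTATREADER may well collect different or additional statistics.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical accumulator for one (input x, target y) attribute pair. It
// keeps only running sums, so the cases can be streamed through once.
struct PairStats {
    std::size_t n = 0;
    double sum_x = 0.0, sum_y = 0.0, sum_xx = 0.0, sum_xy = 0.0;

    void Add(double x, double y) {
        ++n;
        sum_x += x;
        sum_y += y;
        sum_xx += x * x;
        sum_xy += x * y;
    }
};

// Least-squares line y = intercept + slope * x.
struct Line {
    double intercept;
    double slope;
};

// Fit the line from the accumulated sums. Requires at least two cases with
// differing x values; otherwise the slope is undefined.
Line FitLine(const PairStats& s) {
    assert(s.n >= 2);
    const double n = static_cast<double>(s.n);
    const double sxx = s.sum_xx - s.sum_x * s.sum_x / n;  // sum of (x - mean_x)^2
    const double sxy = s.sum_xy - s.sum_x * s.sum_y / n;  // sum of (x - mean_x)(y - mean_y)
    assert(sxx > 0.0);
    const double slope = sxy / sxx;
    const double intercept = (s.sum_y - slope * s.sum_x) / n;
    return Line{intercept, slope};
}
```

Because only five numbers are stored per attribute pair, the memory cost does not grow with the number of training cases.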

 

Step 4: Customize IDMPersist

You can leave this unchanged for now.

Step 5: Customize IDMAlgorithmNavigation

You can leave this unchanged for now.

 

 

Reposted from: https://www.cnblogs.com/xx22/archive/2010/12/24/1915836.html
