Data Mining with MATLAB: Spider, a MATLAB Toolbox for Machine Learning and Data Mining

First of all, this toolbox requires a JRE and Weka.

=====================================================================================================================================

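I. Spider home page: http://www.kyb.mpg.de/bs/people/spider/ (it can also be found by searching Google for "spider matlab"). For an introduction to the toolbox, see the material on that site.

II. The working setup is MATLAB + Spider + Weka, because some Spider algorithms, such as j48, call into Weka.

Installation notes:

1 MATLAB 7 (R14)

Version 6.5 does not support Java well enough: functions such as javaclasspath had not been developed yet, so it fails with errors like these: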

Undefined function or variable 'javaclasspath'.

Undefined function or variable 'javaaddclasspath'.
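A quick way to tell whether a given MATLAB installation is affected is to probe for the Java class-path functions with exist. This is only a sketch: javaclasspath comes from the text above, javaaddpath is the standard MATLAB function for extending the dynamic Java class path, and the variable names are illustrative.

```matlab
% Probe for the Java class-path functions that the Weka support relies on.
% exist returns 0 when a name is not a variable, file, or built-in on the path.
names = {'javaclasspath', 'javaaddpath'};
hasFn = false(size(names));
for k = 1:numel(names)
    hasFn(k) = exist(names{k}) > 0;
    fprintf('%-14s available: %d\n', names{k}, hasFn(k));
end
if ~all(hasFn)
    disp('Java class-path functions are missing; MATLAB 7 (R14) or later is needed.');
end
```

If either function is reported as unavailable, the errors quoted above are to be expected.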

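2 JRE 1.4.2

MATLAB 7 ships with JRE 1.4.2; MATLAB 6 ships with 1.3. You can see the bundled JRE under D:\MATLAB7\sys\java\jre\win32. If MATLAB 7 is installed, simply use the bundled 1.4.2. In particular, do not use 1.6: it is too new and MATLAB 7 does not support it yet. Run version -java in MATLAB to check the JVM version.

If you want to use 1.5, copy the jre1.5.0_10 folder from C:\Program Files\Java\jre1.5.0_10 into D:\MATLAB7\sys\java\jre\win32, then add the environment variable MATLAB_JAVA with the value D:\MATLAB7\sys\java\jre\win32\jre1.5.0_10. If this step goes wrong, MATLAB reports an error on restart saying that some file cannot be found.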
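The JVM check and the MATLAB_JAVA override can both be inspected from inside MATLAB. The sketch below uses only the built-in version and getenv functions; the variable names and messages are illustrative, and it reports the state without changing anything.

```matlab
% Report the JVM that this MATLAB session is actually running on.
jvmInfo = version('-java');
fprintf('JVM in use: %s\n', jvmInfo);

% MATLAB_JAVA is only set when the bundled JRE has been overridden.
javaOverride = getenv('MATLAB_JAVA');
if isempty(javaOverride)
    disp('MATLAB_JAVA is not set; the bundled JRE is used.');
elseif exist(javaOverride, 'dir') ~= 7
    fprintf('MATLAB_JAVA points to a missing folder: %s\n', javaOverride);
else
    fprintf('MATLAB_JAVA overrides the JRE with: %s\n', javaOverride);
end
```

A missing-folder message here usually corresponds to the startup error described above.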

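3 Weka 3.4.10

Use a fairly old Weka version. Newer ones do not work, probably because newer Weka releases need a newer JVM.

The combination I use is MATLAB 7 (R14) + JRE 1.4.2 (bundled with MATLAB 7, no configuration needed) + Weka 3.4.10.

III. Usage

1 Download Spider. There are two archives, core and extra; extract both into the same folder named spider, then place that folder under $matlabroot\toolbox.

2 Download Weka 3.4.10, find weka.jar, and place it under $matlabroot\java\jar.

3 Start MATLAB, open $matlabroot\toolbox\spider\use_spider.m, and run it.

If it prints some information about Spider followed by "WEKA support enabled!", the setup succeeded.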
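The three steps can also be checked and run from a script. This is a minimal sketch that assumes the folder layout described above; the variable names are illustrative, and it only reports missing files rather than trying to repair the installation.

```matlab
% Paths follow the layout described in the steps; adjust them if yours differs.
spiderDir   = fullfile(matlabroot, 'toolbox', 'spider');
wekaJar     = fullfile(matlabroot, 'java', 'jar', 'weka.jar');
setupScript = fullfile(spiderDir, 'use_spider.m');

hasWeka   = exist(wekaJar, 'file') == 2;
hasSpider = exist(setupScript, 'file') == 2;

if ~hasWeka
    disp(['weka.jar not found at ' wekaJar]);
end
if hasSpider
    run(setupScript);   % prints the Spider banner and the Weka status
else
    disp(['use_spider.m not found at ' setupScript]);
end
```

Running use_spider.m through run keeps the check and the startup in one place, so the same script can be reused each time MATLAB is started.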

You can then run help spider to see an overview; the objects it lists are reproduced in the appendix. After that you are ready to train.

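IV. A simple example

X=rand(50)-0.5; Y=sign(sum(X,2));

dtrain=data(X,Y);

% Build the training set; load() can also be used to read the data from a file

[rtrain,model]=train(svm,dtrain);

% Train with train(); the first output holds the training results, the second is the trained model

Xt=rand(50)-0.5; Yt=sign(sum(Xt,2)); dtest=data(Xt,Yt);

% Build a validation set dtest in the same way

rtest=test(model,dtest);

% Apply the trained model to the validation set dtest; returns the test results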
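To see what the toy data in this example looks like before it is wrapped in a Spider data object, the following sketch uses only base MATLAB. The label rule is the one from the example; the counting variables are illustrative additions.

```matlab
% rand(50) yields a 50-by-50 matrix: 50 samples (rows) with 50 features each,
% shifted so that every entry lies in (-0.5, 0.5).
X = rand(50) - 0.5;

% The label is the sign of each row sum, so the two classes are linearly separable.
Y = sign(sum(X, 2));

nPos = sum(Y == 1);
nNeg = sum(Y == -1);
fprintf('X is %d-by-%d, Y is %d-by-%d\n', size(X, 1), size(X, 2), size(Y, 1), size(Y, 2));
fprintf('%d positive and %d negative samples\n', nPos, nNeg);
```

Because the labels come from a linear rule, a linear classifier should fit this data well, which makes it a convenient smoke test for a fresh installation.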

V. Appendix: Spider information

Latest Spider: Version 1.71 (24/7/2006)

Basic library objects.

data         - Storing input data and output results

data_global  - Implementation of data object that limits memory overhead

algorithm    - Generic algorithm object

group        - Groups sets of objects together (algorithms or data)

loss         - Evaluates loss functions

get_mean     - Takes mean loss over groups of algs

chain        - Builds chains of objects: output of one to input of another

param        - To train and test different hyperparameters of an object

cv           - Cross validation using objects given data

kernel       - Evaluates and caches kernel functions

distance     - Evaluates and caches distance functions

Statistical Tests objects.

wilcoxon     - Wilcoxon test of statistical significance of results

corrt_test   - Corrected resampled t-test - for dependent trials

Dataset objects.

spiral       - Spiral dataset generator.

toy          - Generator of dataset with only a few relevant features

toy2d        - Simple 2d Gaussian problem generator

toyreg       - Linear Regression with o outputs and n inputs

Pre-Processing objects

normalize    - Simple normalization of data

map          - General user specified mapping function of data

Density Estimation objects.

parzen       - Parzen's windows kernel density estimator

indep        - Density estimator which assumes feature independence

bayes        - Classifier based on density estimation for each class

gauss        - Normal distribution density estimator

Pattern Recognition objects.

svm          - Support Vector Machine (svm)

c45          - C4.5 for binary or multi-class

knn          - k-nearest neighbours

platt        - Conditional Probability estimation for margin classifiers

mksvm        - Multi-Kernel LP-SVM

anorm        - Minimize the a-norm in alpha space using kernels

lgcz         - Local and Global Consistent Learner

bagging      - Bagging Classifier

adaboost     - ADABoost method

hmm          - Hidden Markov Model

loom         - Leave One Out Machine

l1           - Minimize l1 norm of w for a linear separator

kde          - Kernel Dependency Estimation: general input/output machine

dualperceptron       - Kernel Perceptron

ord_reg_perceptron   - Ordinal Regression Perceptron (Shen et al.)

splitting_perceptron - Splitting Perceptron (Shen et al.)

budget_perceptron    - Sparse, online Perceptron (Crammer et al.)

randomforest - Random Forest Decision Trees         WEKA-Required

j48          - J48 Decision Trees for binary        WEKA-Required

Multi-Class and Multi-label objects.

one_vs_rest  - Voting method of one against the rest (also for multi-label)

one_vs_one   - Voting method of one against one

mc_svm       - Multi-class Support Vector Machine by J.Weston

c45          - C4.5 for binary or multi-class

knn          - k-nearest neighbours

Feature Selection objects.

feat_sel     - Generic object for feature selection + classifier

r2w2_sel     - SVM Bound-based feature selection

rfe          - Recursive Feature Elimination (also for the non-linear case)

l0           - Dual zero-norm minimization (Weston, Elisseeff)

fsv          - Primal zero-norm based feature selection (Mangasarian)

fisher       - Fisher criterion feature selection

mars         - selection algorithm of Friedman (greedy selection)

clustub      - Multi-class feature selection using spectral clustering

mutinf       - Mutual Information for feature selection.

Regression objects.

svr          - Support Vector Regression

gproc        - Gaussian Process Regression

relvm_r      - Relevance vector machine

multi_rr     - (possibly multi-dimensional) ridge regression

mrs          - Multivariate Regression via Stiefel Constraints

knn          - k-nearest neighbours

multi_reg    - meta method for independent multiple output regression

kmp          - kernel matching pursuit

kpls         - kernel partial least squares

lms          - least mean squared regression [now obsolete due to multi_rr]

rbfnet       - Radial Basis Function Network (with moving centers)

reptree      - Reduced Error Pruning Tree       WEKA-Required

Model Selection objects.

gridsel      - select parameters from a grid of values

r2w2_sel     - Selecting SVM parameters by generalization bound

bayessel     - Bayesian parameter selection

Unsupervised objects.

one_class_svm - One class SVM

kmeans       - K means clustering

kvq          - Kernel Vector Quantization

kpca         - Kernel Principal Components Analysis

ppca         - Probabilistic Principal Component Analysis

nmf          - Non-negative Matrix factorization

spectral     - Spectral clustering

mrank        - Manifold ranking

ppca         - Probabilistic PCA

Reduced Set and Pre-Image objects.

pmg_mds      - Calculate Pre-Images based on multi-dimensional scaling

pmg_rr       - Calculate Pre-Images based on learning and ridge regression

rsc_burges   - Bottom Up Reduced Set; calculates reduced set based on gradient descent

rsc_fp       - Bottom Up Reduced Set; calculates reduced set for rbf with fixed-point iteration schemes

rsc_mds      - Top Down Reduced Set; calculates reduced set with multi-dimensional scaling

rsc_learn    - Top Down Reduced Set; calculates reduced set with ridge regression

rss_l1       - Reduced Set Selection via L1 penalization

rss_l0       - Reduced Set Selection via L0 penalization

rss_mp       - Reduced Set Selection via matching pursuit
