Parameter Selection in LIBSVM

Latest recommended article published 2018-11-13 20:48:50

zsj0577

Tags: svm, libsvm, parameter selection

http://www.cnblogs.com/zhangchaoyang/articles/2189606.html

Parameter overview:

Usage: svm-train [options] training_set_file [model_file]
options:
-s svm_type : set type of SVM (default 0)
0 -- C-SVC (multi-class classification)
1 -- nu-SVC (multi-class classification)
2 -- one-class SVM
3 -- epsilon-SVR (regression)
4 -- nu-SVR (regression)
-t kernel_type : set type of kernel function (default 2)
0 -- linear: u'*v
1 -- polynomial: (gamma*u'*v + coef0)^degree
2 -- radial basis function: exp(-gamma*|u-v|^2)
3 -- sigmoid: tanh(gamma*u'*v + coef0)
4 -- precomputed kernel (kernel values in training_set_file)
-d degree : set degree in kernel function (default 3)
-g gamma : set gamma in kernel function (default 1/num_features); num_features is the number of features, i.e. the dimensionality of each input vector
-r coef0 : set coef0 in kernel function (default 0)
-c cost : set the parameter C of C-SVC, epsilon-SVR, and nu-SVR (default 1)
-n nu : set the parameter nu of nu-SVC, one-class SVM, and nu-SVR (default 0.5)
-p epsilon : set the epsilon in loss function of epsilon-SVR (default 0.1)
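-m cachesize : set cache memory size in MB (default 100); how much memory the kernel cache may use
-e epsilon : set tolerance of termination criterion (default 0.001); the stopping tolerance
-h shrinking : whether to use the shrinking heuristics, 0 or 1 (default 1); shrinking tries to speed up training by temporarily removing bounded variables from the optimization problem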
-b probability_estimates : whether to train a SVC or SVR model for probability estimates, 0 or 1 (default 0)
-wi weight : set the parameter C of class i to weight*C, for C-SVC (default 1); use this to give each class its own C when the classes are imbalanced
-v n: n-fold cross validation mode; n is the number of folds the data is split into
-q : quiet mode (no outputs)
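
The four kernel formulas listed under -t, together with the default gamma of 1/num_features, can be written out as a minimal pure-Python sketch. The function names and the sample vectors u and v are illustrative assumptions and are not part of LIBSVM itself.

```python
import math


def dot(u, v):
    """Inner product u'*v of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear(u, v):
    # -t 0: u'*v
    return dot(u, v)


def polynomial(u, v, gamma, coef0=0.0, degree=3):
    # -t 1: (gamma*u'*v + coef0)^degree, with the -r and -d defaults
    return (gamma * dot(u, v) + coef0) ** degree


def rbf(u, v, gamma):
    # -t 2: exp(-gamma*|u-v|^2)
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


def sigmoid(u, v, gamma, coef0=0.0):
    # -t 3: tanh(gamma*u'*v + coef0)
    return math.tanh(gamma * dot(u, v) + coef0)


# Hypothetical two-feature vectors, so the default gamma is 1/2.
u = [1.0, 2.0]
v = [3.0, 4.0]
default_gamma = 1.0 / len(u)

print(linear(u, v))
print(polynomial(u, v, default_gamma))
print(rbf(u, v, default_gamma))
print(sigmoid(u, v, default_gamma))
```

Because the default gamma depends on the number of features, the same -t 2 setting gives a narrower or wider kernel on data sets of different dimensionality unless -g is set explicitly.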

Training output (example)

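optimization finished, #iter = 34
nu = 0.200000
obj = -2.873495, rho = 0.872785
nSV = 20, nBSV = 2

obj is the optimal objective value of the dual SVM problem. rho is the bias term in the decision function sgn(w'x - rho), so the constant term is b = -rho. nSV is the number of support vectors, and nBSV is the number of bounded support vectors (i.e., alpha_i = C).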
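
When comparing several runs it can help to pull these numbers out of the log automatically. The following is a minimal sketch, assuming the raw output has the "name = value" shape shown in the example with no annotations mixed in; parse_training_output is a hypothetical helper, not a LIBSVM function, and it handles only a single optimization block (multi-class training prints one block per pair of classes).

```python
import re

# Matches "name = number", with an optional leading "#" as in "#iter = 34".
FIELD = re.compile(r"#?(\w+)\s*=\s*(-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)")


def parse_training_output(text):
    """Return a dict such as {'iter': 34, 'nu': 0.2, ...} from svm-train output."""
    stats = {}
    for name, value in FIELD.findall(text):
        is_float = "." in value or "e" in value.lower()
        stats[name] = float(value) if is_float else int(value)
    return stats


sample = """optimization finished, #iter = 34
nu = 0.200000
obj = -2.873495, rho = 0.872785
nSV = 20, nBSV = 2
"""

stats = parse_training_output(sample)
print(stats)
```

A large ratio of nBSV to nSV in the parsed result means that many support vectors sit at the upper bound C.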

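If training takes too long, you may need to:

1. Specify a larger cache size (-m).

2. Use a looser stopping tolerance (-e).

When using a large -e, you may want to check which of -h 0 (no shrinking) and -h 1 (shrinking) is faster.

3. If the steps above do not help, reduce the training set. Use subset.py in the tools directory to draw a random subset of the training set.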

Usage: subset.py [options] dataset number [output1] [output2]
This script selects a subset of the given data set.
options:
-s method : method of selection (default 0)
0 -- stratified selection (classification only)
1 -- random selection
output1 : the subset (optional)
output2 : the rest of data (optional)
If output1 is omitted, the subset will be printed on the screen.
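
To make the difference between the two selection methods concrete, here is a minimal sketch of stratified versus random selection over lines in LIBSVM format (label first, then index:value pairs). It illustrates the idea only and is not the code of subset.py; the function names, the quota rounding rule, and the sample data are assumptions.

```python
import random
from collections import defaultdict


def random_indices(labels, number, rng):
    """Method 1: pick `number` positions uniformly at random."""
    return sorted(rng.sample(range(len(labels)), number))


def stratified_indices(labels, number, rng):
    """Method 0: keep each class's share of the subset close to its share of the data."""
    by_label = defaultdict(list)
    for i, label in enumerate(labels):
        by_label[label].append(i)
    total = len(labels)
    # Proportional quota per class, rounded down.
    quotas = {lab: number * len(idx) // total for lab, idx in by_label.items()}
    # Give the remaining slots to the classes with the largest remainders.
    leftover = number - sum(quotas.values())
    ranked = sorted(by_label, key=lambda lab: (number * len(by_label[lab])) % total,
                    reverse=True)
    for lab in ranked[:leftover]:
        quotas[lab] += 1
    chosen = []
    for lab, idx in by_label.items():
        chosen.extend(rng.sample(idx, quotas[lab]))
    return sorted(chosen)


def split(lines, number, method=0, rng=None):
    """Return (subset, rest), mirroring the roles of output1 and output2."""
    if rng is None:
        rng = random.Random()
    labels = [line.split(None, 1)[0] for line in lines]
    pick = stratified_indices if method == 0 else random_indices
    chosen = set(pick(labels, number, rng))
    subset = [line for i, line in enumerate(lines) if i in chosen]
    rest = [line for i, line in enumerate(lines) if i not in chosen]
    return subset, rest


# Hypothetical data: four positive and two negative examples.
data = ["+1 1:0.5", "+1 1:0.7", "+1 1:0.9", "+1 1:0.2", "-1 1:-0.5", "-1 1:-0.1"]
subset, rest = split(data, 3, method=0, rng=random.Random(0))
print(subset)
print(rest)
```

With stratified selection the 2:1 class ratio of the sample data carries over to the subset, whereas random selection can by chance return a subset that contains only one class.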

Shrinking helps when the number of iterations is high. With a large -e the number of iterations drops, so it is better to turn shrinking off.
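
One way to check which shrinking setting is faster is simply to time both. The sketch below builds the two svm-train command lines using only the -q, -m, -e and -h options from the usage text; the file name train.scale, the tolerance 0.1, and both helper functions are illustrative assumptions. The timing call is left commented out because it requires a LIBSVM installation with svm-train on the PATH.

```python
import subprocess
import time


def svm_train_command(training_file, shrinking, tolerance, cache_mb=100,
                      binary="svm-train"):
    """Build an svm-train argument list; options must precede the training file."""
    return [binary, "-q",
            "-m", str(cache_mb),
            "-e", str(tolerance),
            "-h", str(int(shrinking)),
            training_file]


def time_command(cmd):
    """Run a command and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


# One command per shrinking setting, with a loose tolerance.
commands = {h: svm_train_command("train.scale", shrinking=h, tolerance=0.1)
            for h in (0, 1)}
for h, cmd in commands.items():
    print(h, " ".join(cmd))

# On a machine with LIBSVM installed:
# timings = {h: time_command(cmd) for h, cmd in commands.items()}
```

Both runs write to the same default model file, so the second run overwrites the first; that is fine when only the timings matter.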

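Specifying a very large -m on Linux can produce a "segmentation fault", most likely because memory has run out. On a 32-bit machine the maximum addressable memory is 4 GB. Linux splits this 3:1 between user space and kernel space, so user space gets at most 3 GB, and at most 2 GB of that can be allocated dynamically. If you pass a -m value close to 2 GB, memory will be exhausted.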
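
The arithmetic behind these limits can be checked with a short sketch. It assumes that the MB given to -m is counted as 2^20 bytes and takes the 2 GB dynamic-allocation figure directly from the text; cache_share is a hypothetical helper.

```python
GIB = 2 ** 30
MIB = 2 ** 20

addressable = 2 ** 32                    # 32-bit addresses: 4 GiB in total
user_space = addressable * 3 // 4        # 3:1 split, user side
kernel_space = addressable - user_space  # 3:1 split, kernel side
dynamic_limit = 2 * GIB                  # figure quoted in the text


def cache_share(cache_mb, limit_bytes=dynamic_limit):
    """Fraction of the dynamic-allocation limit taken by a -m value given in MB."""
    return cache_mb * MIB / limit_bytes


print(user_space // GIB, kernel_space // GIB)
for m in (100, 1000, 2000):
    print(m, round(cache_share(m), 3))
```

The default -m 100 uses about 5% of the limit, while -m 2000 leaves almost nothing for the training data and the rest of the process.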
