After training an SVM and saving the weights, how do I use the linear SVM weights for feature selection?

I have built a linear SVM model for two classes (1 and 0), using the following code:

class1.svm.model

and I have extracted the weights for the training set using the following code:

#extract the weights and constant from the SVM model:

w

b

I get weights for each feature, like in the following example:

X2 0.001710949

X3 -0.002717934

X4 -0.001118897

X5 0.009280056

X993 -0.000256577

X1118 0

X1452 0.004280963

X2673 0.002971335

X4013 -0.004369505
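(The extraction code above is truncated in the post. For a linear SVM, the primal weight vector is typically recovered from the trained model's dual coefficients and support vectors as w = Σᵢ coefᵢ · SVᵢ, with the intercept b = −rho. A minimal sketch in Python with made-up toy numbers — none of these values come from the question:)

```python
# Hypothetical dual coefficients (alpha_i * y_i), support vectors, and rho
# from a trained linear SVM; the numbers here are illustrative only.
coefs = [0.5, -0.3, -0.2]
SV = [[1.0, 2.0], [0.5, -1.0], [2.0, 0.0]]
rho = 0.1

# Primal weight vector: w_j = sum_i coefs[i] * SV[i][j]
w = [sum(c * sv[j] for c, sv in zip(coefs, SV)) for j in range(len(SV[0]))]
b = -rho

print(w, b)  # one weight per feature, plus the constant term
```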

Now, how do I perform feature selection based on the weights extracted for each feature? How should I build a weight matrix?

I have read papers, but the concept is still not clear to me. Please help!

Solution

I've dashed this answer off rather quickly, so I expect there will be quite a few points that others can expand on, but as something to get you started...

There are a number of ways of doing this, but the first thing to tackle is converting the linear weights into a measure of how important each feature is to the classification. This is a relatively simple three-step process:

1. Normalise the input data such that each feature has mean = 0 and standard deviation = 1.

2. Train your model.

3. Take the absolute value of the weights. That is, if a weight is -0.57, use 0.57.

Optionally you can generate a more robust measure of feature importance by repeating the above several times on different sets of training data which you have created by randomly re-sampling your original training data.
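The three steps (and the optional bootstrap resampling) can be sketched as follows. Note that `fit` below is a placeholder for training the SVM itself; function names are mine, not from any library:

```python
import random
import statistics

def standardize(X):
    """Step 1: scale each feature to mean 0 and standard deviation 1."""
    cols = list(zip(*X))
    means = [statistics.mean(c) for c in cols]
    sds = [statistics.pstdev(c) or 1.0 for c in cols]  # guard constant columns
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in X]

def importance(weights):
    """Step 3: the importance of each feature is its absolute weight."""
    return [abs(w) for w in weights]

def bootstrap_importance(X, y, fit, n_rounds=20, seed=0):
    """Optional: average importances over models trained on bootstrap
    resamples of the data. `fit` is a stand-in for training the SVM;
    it must return one weight per feature."""
    rng = random.Random(seed)
    n = len(y)
    totals = [0.0] * len(X[0])
    for _ in range(n_rounds):
        idx = [rng.randrange(n) for _ in range(n)]           # resample rows
        w = fit([X[i] for i in idx], [y[i] for i in idx])
        totals = [t + v for t, v in zip(totals, importance(w))]
    return [t / n_rounds for t in totals]
```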

Now that you have a way to determine how important each feature is to the classification, you can use this in a number of different ways to select which features to include in your final model. I will give an example of Recursive Feature Elimination, since it is one of my favourites, but you may want to look into iterative feature selection, or noise perturbation.

So, to perform recursive feature elimination:

1. Start by training a model on the entire set of features, and calculate its feature importances.

2. Discard the feature with the smallest importance value, and re-train the model on the remaining features.

3. Repeat step 2 until you have a small enough set of features[1].

[1] where a small enough set of features is determined by the point at which accuracy begins to suffer when you apply your model to a validation set. On which note: when doing this sort of feature selection, make sure you have not only separate training and test sets, but also a validation set for choosing how many features to keep.
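The elimination loop itself can be sketched as below. For simplicity this version stops at a fixed `n_keep` rather than monitoring validation accuracy, and `toy_linear_weights` is a stand-in for the SVM fit (per-feature covariance with the label), not a real SVM:

```python
def toy_linear_weights(X, y):
    """Stand-in for the SVM fit: per-feature covariance with the label.
    A real implementation would train a linear SVM on (X, y) and return w."""
    n = len(y)
    ybar = sum(y) / n
    weights = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        xbar = sum(col) / n
        weights.append(sum((a - xbar) * (b - ybar) for a, b in zip(col, y)) / n)
    return weights

def rfe(X, y, n_keep, fit=toy_linear_weights):
    """Recursive feature elimination: repeatedly drop the feature with the
    smallest absolute weight, then retrain on the remaining features."""
    remaining = list(range(len(X[0])))              # indices of surviving features
    while len(remaining) > n_keep:
        Xsub = [[row[j] for j in remaining] for row in X]
        w = fit(Xsub, y)
        importances = [abs(v) for v in w]
        drop = importances.index(min(importances))  # least important feature
        remaining.pop(drop)                         # discard, then loop retrains
    return remaining
```

For example, on a toy data set where one column tracks the label, another is weak noise, and a third is constant, the loop discards the constant and noisy columns first and keeps the informative one.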
