- If you are sure the categorical attribute is actually ordinal, just treat it as a numerical attribute.
- If not, use a coding trick to turn it into numerical attributes. Following the suggestion of the author of libsvm, you can simply use 1-of-K coding. For instance, suppose a 1-dimensional categorical attribute takes values from {A,B,C}. Turn it into a 3-dimensional vector such that A=(1,0,0), B=(0,1,0), C=(0,0,1). Of course, this adds a significant number of dimensions to your problem, but I think that is not a serious problem for a modern SVM solver (whether you adopt a linear or a kernel type).
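A minimal sketch of the 1-of-K coding described above, in plain Python. The function name `one_of_k` and the list `CATEGORIES` are hypothetical names chosen for illustration; they do not come from libsvm or from the original answer.

```python
def one_of_k(value, categories):
    """Return the 1-of-K vector for `value` over the ordered list `categories`."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    # Exactly one position is 1: the one matching the value's category.
    return [1 if value == c else 0 for c in categories]


# Assumed category set {A, B, C}, taken from the example in the answer.
CATEGORIES = ["A", "B", "C"]

# Encode a small column of categorical values.
encoded = [one_of_k(v, CATEGORIES) for v in ["A", "C", "B"]]
```

Each categorical attribute with K distinct values becomes K numerical columns, so the encoded vectors for all attributes would be concatenated before being passed to the SVM solver.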
Original link: http://stats.stackexchange.com/questions/52915/how-to-deal-with-an-svm-with-categorical-attributes
Q:
I have a space of 35 dimensions (attributes). My analytic problem is a simple classification one.
Out of the 35 dimensions, more than 25 are categorical, and each of those attributes takes more than 50 distinct values.
In that scenario, introducing dummy variables will not work for me either.
How can I run an SVM on a space which has a lot of categorical attributes?
A: