Hi, I am using Weka for machine learning, and my ARFF file has the following format:
`@relation datastest
@attribute fwoh {what, when, where, how, who, why}
@attribute parameter {color, performance}
@attribute object { power, cost}
@attribute model {x,y,z}
@attribute question String`
I tried J48, PART, DecisionTable, ZeroR, and SMO. When building the classifier, every one of them throws the exception below:
weka.core.UnsupportedAttributeTypeException: weka.classifiers.rules.ZeroR: Cannot handle string class!
at weka.core.Capabilities.test(Capabilities.java:1164)
at weka.core.Capabilities.test(Capabilities.java:1303)
at weka.core.Capabilities.test(Capabilities.java:1208)
at weka.core.Capabilities.testWithFail(Capabilities.java:1506)
at weka.classifiers.rules.ZeroR.buildClassifier(ZeroR.java:122)
at wekaproject.TextCategorizationTest.main(TextCategorizationTest.java:66)
I build the classifier as follows:
final Instances data = new Instances(readDataFile("questions.txt"));
final Classifier classifier = new SMO();
classifier.buildClassifier(data );
Can anyone tell me which classifier I should use? Should I also use StringToWordVector? I tried StringToWordVector, but it did not help. If it is needed, can anyone tell me how to use it?
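For reference, here is a minimal sketch of how StringToWordVector is typically applied before training. It assumes the "Cannot handle string class" message means the class index currently points at the string attribute `question`, and that a nominal attribute such as `fwoh` is the intended class; that choice is a guess, not something the post confirms. The data rows, the class name `StringAttributeSketch`, and the helpers `load` and `vectorize` are made up for illustration, and the ARFF is read from an in-memory string rather than through `readDataFile`.

```java
import java.io.StringReader;

import weka.classifiers.Classifier;
import weka.classifiers.functions.SMO;
import weka.core.Instances;
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.StringToWordVector;

public class StringAttributeSketch {

    // Header taken from the question; the six data rows are invented
    // purely so that the sketch has something to train on.
    static final String ARFF = String.join("\n",
        "@relation datastest",
        "@attribute fwoh {what, when, where, how, who, why}",
        "@attribute parameter {color, performance}",
        "@attribute object {power, cost}",
        "@attribute model {x,y,z}",
        "@attribute question string",
        "@data",
        "what,color,power,x,'what is the color of model x'",
        "when,performance,cost,y,'when does model y get cheaper'",
        "where,color,cost,z,'where can i see the color of model z'",
        "how,performance,cost,y,'how much does model y cost'",
        "who,performance,power,x,'who measured the power of model x'",
        "why,performance,power,z,'why is model z slow'");

    // Load the data and pick a NOMINAL attribute as the class.
    // Assumption: fwoh is the label to predict.
    static Instances load() throws Exception {
        Instances data = new Instances(new StringReader(ARFF));
        data.setClassIndex(data.attribute("fwoh").index());
        return data;
    }

    // Replace the string attribute with word attributes.
    // setInputFormat must be called before Filter.useFilter.
    static Instances vectorize(Instances data) throws Exception {
        StringToWordVector filter = new StringToWordVector();
        filter.setInputFormat(data);
        return Filter.useFilter(data, filter);
    }

    public static void main(String[] args) throws Exception {
        Instances vectorized = vectorize(load());

        // The class is nominal and no string attributes remain, so the
        // capability check in buildClassifier should no longer reject the data.
        Classifier classifier = new SMO();
        classifier.buildClassifier(vectorized);

        System.out.println("Class attribute: " + vectorized.classAttribute().name());
        System.out.println("Attributes after filtering: " + vectorized.numAttributes());
    }
}
```

If the same conversion has to be applied to test data later, wrapping the filter and the classifier in `weka.classifiers.meta.FilteredClassifier` is usually more convenient than filtering by hand, because the dictionary built from the training data is then reused automatically.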
Update:
This is the input ARFF file:
@relation 'text_files_in_C:\\Desktop\\test'
@attribute id {a,b,c}
@attribute ids {g,h,i}
@attribute idss {k,l,m}
@attribute contents string
@data
a,g,k,'x'
b,h,l,'y'
c,i,m,'z'
This is the output ARFF file after filtering:
@relation 'text_files_in_C:\\Desktop\\test-weka.filters.unsupervised.attribute.StringToWordVector-D.,:\\\'\\\"()?!-R4-W1000000-C-