“The learning method used for our experiments is a support vector machine (SVM) learner as implemented in the SVM-LIGHT package (version 3.5) [4]. SVMs attempt to learn a hyperplane in |T|-dimensional space that separates the positive training examples from the negative ones with the maximum possible margin, i.e. such that the minimal distance between the hyperplane and a training example is maximum; results in computational learning theory indicate that this tends to minimize the generalization error, i.e. the error of the resulting classifier on yet unseen examples. We have simply opted for the default parameter setting of SVM-LIGHT; in particular, this means that a linear kernel has been used. In an extended version of this paper [2] we also discuss analogous experiments we have carried out with two other learners (a Rocchio method and a k-NN algorithm), and with three different reduction factors (.00, .50, .90).”
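As a concrete illustration of the setup described in the quoted passage, the following is a minimal sketch of training a linear-kernel SVM text classifier. It uses scikit-learn's LinearSVC as a stand-in for SVM-LIGHT, and the documents, labels, and category are toy placeholders, not the paper's actual tooling or corpus:

# A minimal sketch of the quoted setup. scikit-learn's LinearSVC is an
# assumed stand-in for SVM-light v3.5; the documents and labels below
# are invented placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

# Toy training documents and binary category labels (hypothetical
# category "wheat": 1 = positive example, 0 = negative example).
train_docs = [
    "wheat prices rose sharply in chicago trading",
    "corn and wheat futures fell on export news",
    "the central bank raised interest rates again",
    "stocks rallied after the rate announcement",
]
train_labels = [1, 1, 0, 0]

# Map each document into |T|-dimensional term space (one axis per term).
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(train_docs)

# With a linear kernel, the learner fits a separating hyperplane
# directly in term space, maximizing the margin to the training
# examples (default regularization, echoing the paper's use of
# default SVM-light parameters).
clf = LinearSVC(C=1.0)
clf.fit(X_train, train_labels)

# Classify a yet unseen document.
X_test = vectorizer.transform(["wheat harvest estimates were revised"])
print(clf.predict(X_test))  # expected: [1]

The generalization claim in the passage is what motivates the max-margin objective: among all hyperplanes that separate the training data, the one farthest from the nearest examples tends to err least on unseen documents.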
References:
[2] F. Debole and F. Sebastiani. Supervised term weighting for automated text categorization. Technical Report 2002-TR-08, Istituto di Scienza e Tecnologie dell’Informazione, Consiglio Nazionale delle Ricerche, Pisa, IT, 2002. Submitted for publication.
[4] T. Joachims. Making large-scale SVM learning practical. In B. Schölkopf, C. J. Burges, and A. J. Smola, editors, Advances in Kernel Methods - Support Vector Learning, chapter 11, pages 169-184. The MIT Press, Cambridge, US, 1999.