“The learning method used for our experiments is a support vector machine (SVM) learner as implemented in the SVM-LIGHT package (version 3.5) [4]. SVMs attempt to learn a hyperplane in |T|-dimensional space that separates the positive training examples from the negative ones with the maximum possible margin, i.e. such that the minimal distance between the hyperplane and a training example is maximum; results in computational learning theory indicate that this tends to minimize the generalization error, i.e. the error of the resulting classifier on yet unseen examples. We have simply opted for the default parameter setting of SVM-LIGHT; in particular, this means that a linear kernel has been used. In an extended version of this paper [2] we also discuss analogous experiments we have carried out with two other learners (a Rocchio method and a k-NN algorithm), and with three different reduction factors (.00, .50, .90).”
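As a concrete illustration of the setup described in the quoted passage, the following is a minimal sketch of training a linear-kernel SVM text classifier. It uses scikit-learn's LinearSVC as a stand-in for SVM-LIGHT, and the documents, labels, and category are toy placeholders, not the paper's actual tooling or corpus:

# A minimal sketch of the quoted setup. scikit-learn's LinearSVC is an
# assumed stand-in for SVM-light v3.5; the documents and labels below
# are invented placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

# Toy training documents and binary category labels (hypothetical
# category "wheat": 1 = positive example, 0 = negative example).
train_docs = [
    "wheat prices rose sharply in chicago trading",
    "corn and wheat futures fell on export news",
    "the central bank raised interest rates again",
    "stocks rallied after the rate announcement",
]
train_labels = [1, 1, 0, 0]

# Map each document into |T|-dimensional term space (one axis per term).
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(train_docs)

# With a linear kernel, the learner fits a separating hyperplane
# directly in term space, maximizing the margin to the training
# examples (default regularization, echoing the paper's use of
# default SVM-light parameters).
clf = LinearSVC(C=1.0)
clf.fit(X_train, train_labels)

# Classify a yet unseen document.
X_test = vectorizer.transform(["wheat harvest estimates were revised"])
print(clf.predict(X_test))  # expected: [1]

The generalization claim in the passage is what motivates the max-margin objective: among all hyperplanes that separate the training data, the one farthest from the nearest examples tends to err least on unseen documents.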
References:
[2] F. Debole and F. Sebastiani. Supervised term weighting for automated text categorization. Technical Report 2002-TR-08, Istituto di Scienza e Tecnologie dell’Informazione, Consiglio Nazionale delle Ricerche, Pisa, IT, 2002. Submitted for publication.
[4] T. Joachims. Making large-scale SVM learning practical. In B. Schölkopf, C. J. Burges, and A. J. Smola, editors, Advances in Kernel Methods - Support Vector Learning, chapter 11, pages 169-184. The MIT Press, Cambridge, US, 1999.