Supervised Term Weighting for Automated Text Categorization: 5. Conclusion

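This post covers supervised term weighting (STW), a method aimed at IR tasks that involve supervised learning, such as text categorization and text filtering. STW computes a term's weight from how differently the term is distributed across the positive and negative training examples. It proposes replacing idf with a category-based term evaluation function, so the scores already computed for term selection can be reused for weighting at no extra cost.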

“We have proposed supervised term weighting (STW), a term weighting methodology specifically designed for IR applications involving supervised learning, such as text categorization and text filtering. Supervised term indexing leverages on the training data by weighting a term according to how different its distribution is in the positive and negative training examples. We have also proposed that this should take the form of replacing idf by the category-based term evaluation function that has previously been used in the term selection phase; as such, STW is also efficient, since it reuses for weighting purposes the scores already computed for term selection purposes.”
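In other words, the authors propose supervised term weighting (STW), a term weighting method designed specifically for IR applications that involve supervised learning, such as text categorization and text filtering. Supervised term indexing exploits the training data by weighting a term according to how much its distribution differs between the positive and the negative training examples. Concretely, idf is replaced by the category-based term evaluation function that was previously used only in the term selection phase. This also makes STW efficient, because the scores already computed during term selection are reused in the term weighting phase.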
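The idea of reusing term selection scores as the global weighting factor can be sketched in a few lines. This is only an illustration under stated assumptions: the quoted conclusion does not name a specific term evaluation function, so chi-square is used here as one common choice, and the toy corpus, the logarithmic tf factor, the cosine normalization, and the helper names `chi_square`, `select_terms`, and `stw_weights` are all hypothetical.

```python
import math
from collections import Counter

def chi_square(term, docs, labels):
    """Chi-square of a term w.r.t. a binary category, from document counts.
    (Assumed choice of category-based term evaluation function.)"""
    n = len(docs)
    a = sum(1 for d, y in zip(docs, labels) if term in d and y)          # positive, has term
    b = sum(1 for d, y in zip(docs, labels) if term in d and not y)      # negative, has term
    c = sum(1 for d, y in zip(docs, labels) if term not in d and y)      # positive, no term
    d_ = n - a - b - c                                                   # negative, no term
    denom = (a + b) * (c + d_) * (a + c) * (b + d_)
    return n * (a * d_ - b * c) ** 2 / denom if denom else 0.0

def select_terms(scores, k):
    """Term selection: keep the k best-scoring terms (ties broken alphabetically)."""
    return set(sorted(scores, key=lambda t: (-scores[t], t))[:k])

def stw_weights(doc, scores, selected):
    """Term weighting: tf times the *same* score used for selection, instead of idf."""
    tf = Counter(t for t in doc if t in selected)
    raw = {t: (1 + math.log(f)) * scores[t] for t, f in tf.items()}
    norm = math.sqrt(sum(w * w for w in raw.values()))  # cosine normalization
    return {t: w / norm for t, w in raw.items()} if norm else raw

# Hypothetical toy corpus: two positive and two negative training documents.
docs = [
    ["goal", "match", "team"],
    ["goal", "team", "win"],
    ["stock", "market", "team"],
    ["stock", "price", "market"],
]
labels = [True, True, False, False]

# Scores are computed once, then used for both selection and weighting.
vocab = {t for d in docs for t in d}
scores = {t: chi_square(t, docs, labels) for t in vocab}
selected = select_terms(scores, k=4)
weights = stw_weights(docs[0], scores, selected)
print(selected)
print(weights)
```

In this sketch "team" appears in three of the four documents across both classes, so it scores low and is dropped at selection, while "goal" occurs only in positive documents and receives the largest weight. Note that chi-square is symmetric: terms that occur only in negative documents ("stock", "market") score just as high as "goal".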
