A Study of the Behavior of Several Methods for Balancing Machine Learning Training Data
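Our experimental results, obtained with a discrimination-based inductive scheme, suggest that the problem is not caused by class imbalance alone but is also related to the degree of data overlap among the classes.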
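We conclude that over-sampling methods can aid the induction of classifiers that are more accurate than those induced from under-sampled data sets.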
It is worth noting that random over-sampling, a very simple over-sampling method, is highly competitive with more complex over-sampling methods.
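To make the idea of random over-sampling concrete, here is a minimal sketch using only the Python standard library. It is not the procedure evaluated in the paper; the function name `random_oversample`, the fixed seed, and the toy data are assumptions made for illustration. Each minority class is padded with randomly chosen duplicates of its own examples until it matches the size of the largest class.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen examples of each smaller class until
    every class has as many examples as the largest one.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    counts = Counter(labels)
    target = max(counts.values())

    out_x, out_y = list(samples), list(labels)
    for cls, n in counts.items():
        # All original examples belonging to this class.
        pool = [x for x, y in zip(samples, labels) if y == cls]
        # Draw with replacement to fill the gap up to the target size.
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_x.extend(extra)
        out_y.extend([cls] * len(extra))
    return out_x, out_y


# Toy imbalanced data: five examples of class 0, one of class 1.
X = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.9]]
y = [0, 0, 0, 0, 0, 1]

X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))
```

Because the added examples are exact copies, this approach introduces no new information about the minority class; it only changes the class proportions that the learner sees.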
Toward intelligent training of supervised image classifications: directing training data acquisition for SVM classification
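The potential for intelligent training is particularly evident for SVM-based classification, because the process rests on the notion that only the training samples lying on the class boundaries are necessary for discrimination.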
This paper aims to evaluate the potential to target training data collection at regions likely to contain useful training samples, at the expense of those that will contribute insignificantly to classification by an SVM. The approach is based on knowledge of the variables that influence the spectral response of the classes.
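The claim that only boundary samples matter can be illustrated with a deliberately simplified sketch. It assumes a one-dimensional, linearly separable, hard-margin setting, in which the maximum-margin decision threshold is the midpoint between the two closest samples of opposite classes. The function `max_margin_threshold` and the numbers are hypothetical and are not taken from the paper, which deals with multispectral image data.

```python
def max_margin_threshold(neg, pos):
    """Maximum-margin threshold for 1-D data where every value in `neg`
    lies below every value in `pos` (hard-margin, separable toy case)."""
    lo, hi = max(neg), min(pos)  # the two boundary samples
    if lo >= hi:
        raise ValueError("classes are not separable in this toy setting")
    return (lo + hi) / 2.0


# Full training set for two classes along a single feature axis.
neg = [0.1, 0.4, 0.9, 1.6]
pos = [2.4, 3.0, 3.7, 4.2]

full = max_margin_threshold(neg, pos)

# Keep only the sample of each class that lies closest to the other class.
boundary_only = max_margin_threshold([1.6], [2.4])

print(full, boundary_only)
```

In this toy case the threshold is unchanged when the interior samples are discarded, which is the intuition behind directing acquisition effort toward samples near class boundaries. In higher dimensions and with soft margins, identifying those samples in advance is harder, and that is the problem the paper addresses.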