MAHAKIL: Diversity Based Oversampling Approach to Alleviate the Class Imbalance Issue in Software Defect Prediction
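Preface
In defect prediction and other classification tasks, highly imbalanced data usually makes the task harder. Synthetic oversampling methods are commonly used to address this: they create new minority-class (defective) modules to balance the class distribution. Although these methods have been successful, most of them lead to over-generalization.
I. Basic Information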
Bennin, K. E., Keung, J., Phannachitta, P., Monden, A., & Mensah, S. (2018). MAHAKIL: Diversity based oversampling approach to alleviate the class imbalance issue in software defect prediction. IEEE Transactions on Software Engineering, 1-1.
II. Article Content
1. Main Problem
The main problem is that common prediction algorithms assume that the
classes in any dataset are equally balanced. Thus, models trained o