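Machine Learning Project on Imbalanced Data (Handling Imbalanced Classes)
These are my condensed notes from reading the original article above; I hope they are useful to others.
The article looks like it is only about imbalanced classification, but its steps apply to any machine learning project. The difference is that a few stages need some special tricks when the classes are imbalanced.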
Imbalanced data Examples
- fraud detection
- cancer detection
- manufacturing defects
- online ads conversion
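To make the idea of imbalance concrete, here is a minimal sketch on made-up labels (the 2-in-100 fraud rate and the helper name `class_distribution` are assumptions for illustration, not taken from the original article). It shows the class proportions and why plain accuracy can mislead in problems like the ones listed above: a baseline that always predicts the majority class scores high accuracy while catching none of the rare cases.

```python
from collections import Counter


def class_distribution(labels):
    """Return {label: fraction of samples} for a list of labels."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


# Hypothetical toy labels: 1 = fraud, 0 = legitimate (invented for illustration).
labels = [1] * 2 + [0] * 98

distribution = class_distribution(labels)
majority_label = max(distribution, key=distribution.get)

# Baseline that ignores the inputs and always predicts the majority class.
predictions = [majority_label] * len(labels)

# Fraction of all samples predicted correctly.
accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)

# Fraction of true minority (fraud) samples that the baseline catches.
minority_recall = sum(
    p == 1 and y == 1 for p, y in zip(predictions, labels)
) / labels.count(1)

print("class distribution:", distribution)
print("baseline accuracy:", accuracy)
print("minority recall:", minority_recall)
```

On this toy data the baseline looks accurate yet has zero recall on the minority class, which is why imbalanced problems usually call for metrics other than accuracy.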
Problem Statement and Hypothesis Generation
- Binary classification problem
Generating hypotheses. This step should be done before looking at the data, so that we think broadly and are not constrained by what is available. In this step, we’ll create a laundry li