concept
lift curve describes a performance coefficient (the lift) over the cumulative proportion of a population.
A lift curve is a way of visualizing the performance of a classification model.
in machine lea
A cumulative gains chart shows the total number of events captured by a model over a given number of samples.
A lift curve shows the ratio of a model to a random guess (‘model cumulative sum’ / ‘random guess’ from above).
The Lift curve in Machine Learning, just like all other evaluation metrics is not an unique or perfect solution, however, like a ROC curve, it provides a quick way to get an estimate of how our algorithm is doing and a good tool to compare different models.
To compare two classification models with lift curve, you can use maximum lift value as a metric. Also, the longer the flat zone at the beginning of the curve is the more reliable the model is.
计算公式
算法步骤
- 计算 Lift 的分母Average Rate: 真实值中1的个数除以数据集总数;
- 按预测值(一般为一个概率值)将 <真实值,预测值> 的这个DataFrame排序;
- 遍历2排序好的 DataFrame 依次计算到目前 item 为止的 Lift(公式如上个部分的计算公式所示)
- 做图
代码
- 裸写版本:
https://gist.github.com/jaimezorno/00036a53661e441c600e7ca29c995d29#file-plot_lift_curve-py - scikit-plot包的版本:
https://github.com/reiinakano/scikit-plot/blob/26007fbf9f05e915bd0f6acb86850b01b00944cf/scikitplot/metrics.py
reference
Quora: What is Lift curve?
Understanding Lift curve: A brief introduction to lift curve usage in marketing and machine learning
Cumulative Gains and Lift Charts
绘制curve lift的python包:scikit-plot
the-lift-curve-in-machine-learning
练习地址