Python: The SMOTE Algorithm, a Method for Generating New Samples When Classes Are Imbalanced


djph26741



Tags: artificial intelligence, python, R language

Original link: http://www.cnblogs.com/bonelee/p/8535045.html


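This article introduces SMOTE (Synthetic Minority Over-sampling Technique), an algorithm for handling class imbalance. SMOTE balances a dataset by generating synthetic samples of the minority class. The article explains how the algorithm works, gives a simple Python example, compares it with other ways of handling imbalanced data, and shows how to combine it with Tomek links for combined resampling.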

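The core idea in the summary above, generating synthetic minority samples, can be sketched in a few lines of NumPy. This is a minimal illustration rather than the library's implementation: it assumes the standard SMOTE recipe (pick a minority sample, pick one of its k nearest minority neighbors, and interpolate at a random point between them). The function name `smote_sketch` and its parameters are hypothetical, and the sketch assumes at least two minority samples.

```python
import numpy as np


def smote_sketch(X_min, n_new, k=5, seed=0):
    """Create n_new synthetic points from the minority samples X_min.

    Hypothetical helper for illustration only; requires len(X_min) >= 2.
    """
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    k = min(k, n - 1)  # cannot have more neighbors than other samples

    # Pairwise Euclidean distances between minority samples.
    diff = X_min[:, None, :] - X_min[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor

    # Indices of the k nearest minority neighbors of every sample.
    neighbors = np.argsort(dist, axis=1)[:, :k]

    # For each synthetic point: random base sample, random neighbor of it.
    base = rng.integers(0, n, size=n_new)
    nn = neighbors[base, rng.integers(0, k, size=n_new)]

    # Interpolate at a random position on the segment base -> neighbor.
    lam = rng.random((n_new, 1))
    return X_min[base] + lam * (X_min[nn] - X_min[base])


X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
X_new = smote_sketch(X_min, n_new=10, k=2)
print(X_new.shape)
```

Because every synthetic point lies on a segment between two existing minority samples, the new points stay inside the region already covered by the minority class instead of being exact duplicates.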

Python: The SMOTE Algorithm

Just use an existing Python library:

imbalanced-learn

imbalanced-learn is a python package offering a number of re-sampling techniques commonly used in datasets showing strong between-class imbalance. It is compatible with scikit-learn and is part of scikit-learn-contrib projects.

---------------------

http://contrib.scikit-learn.org/imbalanced-learn/stable/auto_examples/over-sampling/plot_smote.html#sphx-glr-auto-examples-over-sampling-plot-smote-py

http://contrib.scikit-learn.org/imbalanced-learn/stable/over_sampling.html#from-random-over-sampling-to-smote-and-adasyn (introductory guide)

SMOTE

An illustration of the SMOTE method and its variant.

[Figure: sphx_glr_plot_smote_001.png]

 
# Authors: Fernando Nogueira
#          Christos Aridas
#          Guillaume Lemaitre <g.lemaitre58@gmail.com>
# License: MIT

import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA

from imblearn.over_sampling import SMOTE

print(__doc__)


def plot_resampling(ax, X, y, title):
    c0 = ax.scatter(X[y == 0, 0], X[y == 0, 1], label="Class #0", alpha=0.5)
    c1 = ax.scatter(X[y == 1, 0], X[y == 1, 1], label="Class #1", alpha=0.5)
    ax.set_title(title)
    ax.spines['top'].set_visible(False)
    ax.spines['right'<