1. SMOTE
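SMOTE is an oversampling method, but instead of simply duplicating minority samples it uses KNN during oversampling: each synthetic sample is interpolated between a minority sample and one of its k nearest neighbors. The algorithm flow is shown in the figure below.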
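As a rough illustration of the interpolation step described above, here is a minimal sketch. It is not the implementation from the reference link: the function name `smote_sketch` and its parameters `n_new` and `seed` are hypothetical, and it assumes a brute-force NumPy distance computation in place of a KNN library, so it only suits small arrays.

```python
import numpy as np


def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic rows from a minority-class array (illustrative only)."""
    rng = np.random.RandomState(seed)
    minority = np.asarray(minority, dtype=float)
    n = minority.shape[0]
    if n < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, n - 1)  # a sample cannot have more neighbors than other samples

    # Pairwise Euclidean distances, brute force (assumption: small data).
    diff = minority[:, None, :] - minority[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbor
    neighbors = np.argsort(dist, axis=1)[:, :k]  # k nearest neighbor indices per row

    synthetic = np.empty((n_new, minority.shape[1]))
    for i in range(n_new):
        base = rng.randint(n)                 # pick a minority sample
        nn = neighbors[base, rng.randint(k)]  # pick one of its k nearest neighbors
        gap = rng.rand()                      # interpolation factor in [0, 1)
        # The new point lies on the segment between the sample and its neighbor.
        synthetic[i] = minority[base] + gap * (minority[nn] - minority[base])
    return synthetic
```

For example, `smote_sketch(X_minority, n_new=20, k=5)` would return a `(20, n_features)` array whose rows all lie on line segments between existing minority samples.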
The code implementation is as follows:
Reference: https://blog.csdn.net/panda_zjd/article/details/79200493
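#!/usr/bin/env python2
# -*- coding: utf-8 -*-
from sklearn.neighbors import NearestNeighbors
import numpy as np
import warnings
from sklearn.datasets import load_iris
warnings.filterwarnings("ignore")

class Smote(object):
    """data is the minority class."""
    # N is the sampling rate, as a percentage of the original samples.
    # If N > 100, the integer part of N/100 takes the whole original data directly,
    # and the remainder N % 100 is picked at random from the original data;
    # together these form the initial sample set.
    def __init__(self, data, N=100, k=5):
        self.data = data
        self.N = N
        # k + 1: presumably because a neighbor query for a point in data
        # also returns the point itself.
        self.k = k + 1
        self.n_attrs = data.shape[1]
    def oversample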