近红外光谱建模之样本集划分python实现（KS算法）

最新推荐文章于 2024-05-25 17:21:03 发布

潘旭阳

最新推荐文章于 2024-05-25 17:21:03 发布

阅读量9.9k

点赞数 7

分类专栏：近红外光谱文章标签：样本集划分近红外光谱

本文链接：https://blog.csdn.net/joseph__lagrange/article/details/95042656

版权

近红外光谱专栏收录该内容

10 篇文章 169 订阅

订阅专栏

一 Kennard-Stone算法原理（KS算法）

KS算法原理：把所有的样本都看作训练集候选样本,依次从中挑选样本进训练集。首先选择欧氏距离最远的两个样本进入训练集，其后通过计算剩下的每一个样品到训练集内每一个已知样品的欧式距离，找到距已选样本最远以及最近的两个样本，并将这两个样本选入训练集，重复上述步骤直到样本数量达到要求。

欧式距离计算公式：

Xp，Xq表示两个不同的样本，N代表样本的光谱波点数量

二 python代码实现

def ks(x, y, test_size=0.2):
    """

    :param x: shape (n_samples, n_features)
    :param y: shape (n_sample, )
    :param test_size: the ratio of test_size (float)
    :return: spec_train: (n_samples, n_features)
             spec_test: (n_samples, n_features)
             target_train: (n_sample, )
             target_test: (n_sample, )
    """
    M = x.shape[0]             
    N = round((1-test_size) * M)
    samples = np.arange(M)     

    D = np.zeros((M, M))       

    for i in range((M-1)):
        xa = x[i, :]
        for j in range((i+1), M):
            xb = x[j, :]
            D[i, j] = np.linalg.norm(xa-xb) 

    maxD = np.max(D, axis=0)             
    index_row = np.argmax(D, axis=0)    
    index_column = np.argmax(maxD)      

    m = np.zeros(N)
    m[0] = np.array(index_row[index_column])
    m[1] = np.array(index_column)
    m = m.astype(int)                   
    dminmax = np.zeros(N)
    dminmax[1] = D[m[0], m[1]]  

    for i in range(2, N):  
        pool = np.delete(samples, m[:i]) 
        dmin = np.zeros((M-i))        
        for j in range((M-i)):        
            indexa = pool[j]         
            d = np.zeros(i)           
            for k in range(i):         
                indexb = m[k]         
                if indexa < indexb:   
                    d[k] = D[indexa, indexb]
                else:
                    d[k] = D[indexb, indexa]
            dmin[j] = np.min(d)    
        dminmax[i] = np.max(dmin)   
        index = np.argmax(dmin)     
        m[i] = pool[index]          

    m_complement = np.delete(np.arange(x.shape[0]), m)    

    spec_train = x[m, :]
    target_train = y[m]
    spec_test = x[m_complement, :]
    target_test = y[m_complement]
    return spec_train, spec_test, target_train, target_test

潘旭阳

关注

7
点赞
踩
105

收藏

觉得还不错? 一键收藏
12
评论
近红外光谱建模之样本集划分python实现（KS算法）

一 Kennard-Stone算法原理（KS算法）KS算法原理：把所有的样本都看作训练集候选样本,依次从中挑选样本进训练集。首先选择欧氏距离最远的两个样本进入训练集，其后通过计算剩下的每一个样品到训练集内每一个已知样品的欧式距离，找到距已选样本最远以及最近的两个样本，并将这两个样本选入训练集，重复上述步骤直到样本数量达到要求。欧式距离计算公式：Xp，Xq表示两个不同的样本，N代表样...
复制链接

扫一扫