[WIP] [Representation Learning] A Study of the Sparse Representation Classification (SRC) Method

Based on Robust Face Recognition via Sparse Representation.
Note: the code in this post is suspected to be broken; refer to it with caution.

main content

2.3 Classification Based on Sparse Representation

An excerpt of the example from the original paper:

Example 1 ($l^1$-minimization versus $l^2$-minimization). To illustrate how Algorithm 1 works, we randomly select half of the 2,414 images in the Extended Yale B database as the training set and the rest for testing. In this example, we subsample the images from the original 192 $\times$ 168 to size 12 $\times$ 10. The pixel values of the downsampled image are used as 120-D features—stacked as columns of the matrix $A$ in the algorithm. Hence, matrix $A$ has size 120 $\times$ 1,207, and the system $y = Ax$ is underdetermined.
[Fig. 3: sparse coefficients recovered by $l^1$-minimization (a) and the residuals with respect to the 38 subjects (b)]
Fig. 3a illustrates the sparse coefficients recovered by Algorithm 1 for a test image from the first subject. The figure also shows the features and the original images that correspond to the two largest coefficients. The two largest coefficients are both associated with training samples from subject 1. Fig. 3b shows the residuals with respect to the 38 projected coefficients $\delta_i\hat{x}_1$, $i = 1, 2, \ldots, 38$. With 12 $\times$ 10 downsampled images as features, Algorithm 1 achieves an overall recognition rate of 92.1 percent across the Extended Yale B database. (See Section 4 for details and performance with other features such as Eigenfaces and Fisherfaces, as well as comparison with other methods.) Whereas the more conventional minimum $l^2$-norm solution to the underdetermined system $y = Ax$ is typically quite dense, minimizing the $l^1$-norm favors sparse solutions and provably recovers the sparsest solution when this solution is sufficiently sparse. To illustrate this contrast, Fig. 4a shows the coefficients of the same test image given by the conventional $l^2$-minimization (4), and Fig. 4b shows the corresponding residuals with respect to the 38 subjects. The coefficients are much less sparse than those given by $l^1$-minimization (in Fig. 3), and the dominant coefficients are not associated with subject 1. As a result, the smallest residual in Fig. 4 does not correspond to the correct subject (subject 1).
[Fig. 4: coefficients given by $l^2$-minimization (a) and the corresponding residuals (b)]
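To get a feel for the contrast the paper describes, here is a minimal toy sketch (my own setup, not the paper's experiment) comparing the dense minimum-$l^2$-norm solution of an underdetermined system with a sparse solution; OMP from scikit-learn stands in for true $l^1$-minimization:

import numpy as np
from sklearn import linear_model

rng = np.random.default_rng(0)
A = rng.standard_normal((120, 1207))      # same shape as the matrix A in Example 1
x_true = np.zeros(1207)
x_true[rng.choice(1207, size=7, replace=False)] = rng.standard_normal(7)  # sparse ground truth
y = A @ x_true

x_l2 = np.linalg.pinv(A) @ y              # minimum l2-norm solution: typically dense
x_omp = linear_model.orthogonal_mp(A, y, n_nonzero_coefs=7)  # greedy sparse solver

print('nonzeros (l2): ', int(np.sum(np.abs(x_l2) > 1e-6)))   # close to 1207
print('nonzeros (OMP):', int(np.sum(np.abs(x_omp) > 1e-6)))  # at most 7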

Algorithm workflow (adapted from a course project)

The experiment uses the SRC algorithm to classify occluded face images with a block-voting scheme. It consists of:

  1. Inspecting the dataset to determine the image block size, using the unoccluded faces as training data and the occluded faces as test data.
  2. Building the dictionary with the SRC algorithm and classifying the test set by block-wise voting;
  3. Collecting the classification results and computing the accuracy.

Outline of the SRC method:

  1. Initialize the dictionary: A. Downsample every image in the training set (e.g., to 40 rows by 30 columns), reshape it into a single column (1200 rows by 1 column), and normalize that column. B. Process the training images one by one and arrange them into the dictionary
    $A = [A_1, A_2, A_3, \cdots, A_i, \cdots, A_k]$
    where $A_i$ is the feature set of one subject,
    $A_i = [a_{i1}, a_{i2}, \cdots, a_{in}]$
    and $a_{i1}$ is the column (1200 by 1) obtained by reshaping the 1st image of the $i$-th subject.
  2. Downsample and reshape the test data with the same parameters to obtain its feature vector, then compute its sparse representation $x$ with the OMP algorithm;
  3. Apply a one-hot-like mask to $x$, keeping only the coefficients that belong to one class at a time.
  4. Reconstruct the signal from the masked sparse representation using the dictionary, and compute the distance between the reconstruction and the original feature vector of the test image.
  5. Repeat steps 3 and 4 for every class; the class with the smallest distance is the classification result. (A small sketch of steps 2-5 follows this list.)
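As a concrete reference for steps 2-5, here is a minimal sketch (my own, assuming a dictionary A whose unit-norm columns are grouped class by class with train_item columns per class, and using OMP from scikit-learn as the sparse solver):

import numpy as np
from sklearn import linear_model

def src_classify(A, y, num_class, train_item, n_nonzero_coefs=10):
    """Classify one test vector y against the dictionary A (columns grouped by class)."""
    # step 2: sparse representation of y over the training dictionary
    x = linear_model.orthogonal_mp(A, y, n_nonzero_coefs=n_nonzero_coefs)
    residuals = []
    for c in range(num_class):
        sel = slice(c * train_item, (c + 1) * train_item)
        x_c = np.zeros_like(x)
        x_c[sel] = x[sel]                              # step 3: keep only class c's coefficients
        residuals.append(np.linalg.norm(y - A @ x_c))  # step 4: reconstruction error for class c
    return int(np.argmin(residuals))                   # step 5: the smallest residual wins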

Outline of the block-voting SRC method:
To improve classification accuracy, the image is first split into blocks and each block is classified by SRC separately. The label of the whole image is then decided by a vote over the per-block results (see the sketch below).
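Building on src_classify above, the voting step can be sketched as follows (A_blocks and y_blocks are hypothetical names: A_blocks[b] is the dictionary built from block b of every training image, and y_blocks[b] is block b of the test image):

from collections import Counter

def src_classify_blocks(A_blocks, y_blocks, num_class, train_item):
    """Majority vote over the per-block SRC decisions."""
    votes = [src_classify(A_blocks[b], y_blocks[b], num_class, train_item)
             for b in range(len(y_blocks))]
    return Counter(votes).most_common(1)[0][0]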

Reading the datasets

This covers AR, YaleB, and a self-made luggage-sorting dataset.

import os

import cv2
import numpy as np
from PIL import Image


class dataMaker():
    def __init__(self, file_dir, name='AR', use_num=None):
        if name == 'AR':
            if use_num is None:
                self.num_class = 100
            else:
                self.num_class = use_num
            self.train_item = 14
            self.test_item = 12
            self.AR_dataSet(file_dir)
        elif name == 'YaleB':
            if use_num is None:
                self.num_class = 39
            else:
                self.num_class = use_num
            self.train_item = 1
            self.test_item = 1
            self.YaleB_dataSet(file_dir)
        elif name == 'classed_pack':
            if use_num is None:
                self.num_class = 10
            else:
                self.num_class = use_num
            # each item has three images (three cameras): two for training, one for testing
            self.train_item = 2
            self.test_item = 1

            self.classed_pack_dataset(file_dir)
            
    
    def AR_dataSet(self, file_dir):
        ''' Collect the file paths under file_dir; the file names encode the train/test split '''
        train_file = []
        test_file = []
        for root, dirs, files in os.walk(file_dir):
            for file in files:
                f_name = file.split('-')
                img_id = int(f_name[2].split('.')[0])
                # images with id 1-7 and 14-20 go to the training set, the rest to the test set
                if img_id <= 7 or (14 <= img_id <= 20):
                    train_file.append(os.path.join(root, file))
                else:
                    test_file.append(os.path.join(root, file))

        train_data = []
        test_data = []
        print('prepare file name...', end=' ')
        for i in train_file:
            img = Image.open(i)
            train_data.append(np.array(img))
        print('read in train data...', end=' ')
        for i in test_file:
            img = Image.open(i)
            test_data.append(np.array(img))
        print('read in test data...', end='\n')

        self.train_data = train_data    # 14 images * 100 subjects
        self.test_data = test_data      # 12 images * 100 subjects

    def YaleB_dataSet(self, file_dir):
        '''
        Read the YaleB dataset, used as both training and test set.
        TODO: actually this YaleB folder has only one image per class, so it is of no use for SRC
        '''
        src_img_w = 192
        src_img_h = 168

        # dataset = np.zeros((38, 192, 168), float)
        dataset = np.zeros((src_img_w * src_img_h, 38), float)
        cnt_num = 0
        img_list = sorted(os.listdir(file_dir))
        os.chdir(file_dir)

        self.train_data = []
        self.test_data = []
        for img in img_list:
            if img.endswith(".pgm"):
                gray_img = cv2.imread(img, cv2.IMREAD_GRAYSCALE)
                # gray_img = cv2.resize(gray_img, (src_img_w, src_img_h), interpolation=cv2.INTER_AREA)
                # dataset[:, cnt_num] = gray_img.reshape(src_img_w * src_img_h, )
                cnt_num += 1
                self.train_data.append(gray_img)
                self.test_data.append(gray_img)
        print('...prepare train data finished')


    def classed_pack_dataset(self, file_dir):
        '''
        Luggage dataset from an innovation project: of the three camera views per item,
        two are used for training and the remaining one for testing.
        TODO: consider using transforms
        '''
        # TODO: make the test view configurable
        test_class = [3, ]

        self.train_data = []
        self.test_data = []

        # read the .jpg files of the requested number of luggage classes
        for i in range(1, self.num_class + 1):
            sub_folder = file_dir + str(i) + '/'
            # each item has three images
            for j in range(1, self.train_item + self.test_item + 1):
                img_dir = sub_folder + str(j) + '.jpg'
                # read directly as a grayscale image
                img = cv2.imread(img_dir, cv2.IMREAD_GRAYSCALE)
                if j in test_class:
                    self.test_data.append(img)
                else:
                    self.train_data.append(img)

        print('...prepare train data finished')

The SRC algorithm

Implementation

Once again, note that this code is flawed (orz).

import copy
import math

import cv2
import matplotlib.pyplot as plt
import numpy as np
from sklearn import linear_model


class SRC():
    def __init__(self, dataset, max_iter=100, tol=1e-6, n_nonzero_coefs=None):
        '''
        1. Initialize the dictionary
        '''
        # e.g. (120,165) -> (120,160) -> (30,40) blocks * 16 -> (1200,) columns * 16
        self.max_iter = max_iter
        self.tol = tol
        self.n_nonzero_coefs = n_nonzero_coefs

        self.train_data = dataset.train_data
        self.test_data = dataset.test_data

        self.num_class = dataset.num_class
        self.train_item = dataset.train_item
        self.test_item = dataset.test_item
    
    def makeDictionary(self, newShape, block):
        '''
        newShape is the image size after downsampling; block gives the number of
        blocks along the rows and columns of the downsampled image.

        A. Downsample each training image (e.g. to (40, 30)), reshape it into a
           single column, and normalize that column.
        B. Arrange the processed training images into the dictionary, where A_i is
           the feature set of one subject and a_i1 is the column obtained from the
           1st image of the i-th subject.
        C. Downsample and reshape the test data with the same parameters to get the
           feature vectors.
        '''
        self.divide = Divide(int(newShape[1] / block[1]), int(newShape[0] / block[0]))
        self.div_num = block[0] * block[1]
        num_row = int(newShape[0] * newShape[1] / self.div_num)

        # build the dictionary from the training set
        print('SRC->init: train_data', end='')
        self.dictionary = np.zeros((num_row, self.num_class * self.train_item * self.div_num))
        for ind, i in enumerate(self.train_data):
            # downsample to the requested image size
            img = cv2.resize(i, newShape, interpolation=cv2.INTER_CUBIC)
            # split into blocks and flatten each block into a column
            res = self.divide.encode(img)
            self.dictionary[:, self.div_num * ind:self.div_num * (ind + 1)] = res
            if ind % 20 == 0:
                print('.', end='')

        # process the test set the same way
        print('\nSRC->init: test_data', end='')
        self.test_img = np.zeros((num_row, self.num_class * self.test_item * self.div_num))
        for ind, i in enumerate(self.test_data):
            img = cv2.resize(i, newShape, interpolation=cv2.INTER_CUBIC)
            res = self.divide.encode(img)
            self.test_img[:, self.div_num * ind:self.div_num * (ind + 1)] = res
            if ind % 20 == 0:
                print('.', end='')
        print('')

        # normalize the columns of A to have unit l2-norm (axis=0 normalizes each column)
        print('dictionary', self.dictionary.shape, 'test_img', self.test_img.shape)

        self.dictionary = self.l2_normalize(self.dictionary, axis=0)
        self.test_img = self.l2_normalize(self.test_img, axis=0)
        print()

    def l2_normalize(self, x, axis=-1, order=2):
        l2 = np.linalg.norm(x, ord = order, axis=axis, keepdims=True)
        l2[l2==0] = 1
        return x/l2

    def dict_update(self, y, d, x, n_components):
        """
        K-SVD dictionary update step
        """
        for i in range(n_components):
            index = np.nonzero(x[i, :])[0]
            if len(index) == 0:
                continue
            # update the i-th atom
            d[:, i] = 0
            # residual matrix restricted to the signals that use atom i
            r = (y - np.dot(d, x))[:, index]
            # SVD of the residual gives the updated atom and coefficients
            u, s, v = np.linalg.svd(r, full_matrices=False)
            # the first left singular vector becomes the new atom
            d[:, i] = u[:, 0]
            # the first singular value times the first right singular vector
            # updates the corresponding sparse coefficients
            for j, k in enumerate(index):
                x[i, k] = s[0] * v[0, j]
        return d, x

    def OMP(self, y):
        '''
        2. Compute the sparse representation x of the test data with OMP
        '''
        # y holds the blocks of the test images of one class, self.test_item images in total
        for i in range(self.test_item):
            yy = y[:, i * self.div_num:(i + 1) * self.div_num]

            # number of atoms  ATTENTION: this part is written incorrectly
            n_comp = self.num_class * self.train_item * self.div_num
            dic = copy.deepcopy(self.dictionary)

            for j in range(self.max_iter):
                # sparse coding step
                x = linear_model.orthogonal_mp(dic, y)
                if len(x.shape) == 1:
                    x = x[:, np.newaxis]
                e = np.linalg.norm(y - np.dot(dic, x))
                print('dict_update->e:', e)
                if e < self.tol:
                    break
                dic, temp_x = self.dict_update(y, dic, x, n_comp)
                print('dict_update->dic.e:', np.linalg.norm(self.dictionary - dic))

            xx = linear_model.orthogonal_mp(dic, yy)
            # xx = linear_model.orthogonal_mp(self.dictionary, yy)
            if len(xx.shape) == 1:
                xx = xx[:, np.newaxis]

            print(xx.transpose())
            for b in range(self.div_num):
                # coefficients equal to 0 would give log = -inf, so clamp them first
                xx[xx == 0] = self.tol
                t_y = np.log(xx[:, b])
                t_x = list(range(self.train_item * self.num_class * self.div_num))

                # NOTE: this plotting uses a lot of memory and may never finish
                plt.subplot(1, 4, 1), plt.imshow(self.divide.decode(yy, 50, 60))
                plt.subplot(1, 4, 2), plt.bar(t_x, t_y)
                plt.subplot(1, 4, 3), plt.imshow(self.divide.decode(self.dictionary[:, 15 * self.div_num:(15 + 1) * self.div_num], 50, 60))
                plt.subplot(1, 4, 4), plt.imshow(self.divide.decode(np.dot(dic, x), 50, 60))
                plt.show()

            l = []
            print("[OMP]->i:{}: ".format(i))
            for j in range(self.num_class):
                # take the dictionary atoms and the sparse coefficients of class j and compute the residual
                dd = self.dictionary[:, j * self.train_item * self.div_num:(j + 1) * self.train_item * self.div_num]
                xxx = xx[j * self.train_item * self.div_num:(j + 1) * self.train_item * self.div_num, :]
                e = np.linalg.norm(yy - np.dot(dd, xxx))
                print("\tj:{}->e:{}".format(j, e), end='')
                if e == 0.0:
                    e = self.tol
                l.append(math.log(e))
            print()

            plt.bar(list(range(self.num_class)), l)
            plt.show()
            
    
    def run(self):
        # 2. compute the sparse representation of the test data with OMP
        for i in range(self.num_class):
            self.OMP(self.test_img[:, i * self.div_num * self.test_item:(i + 1) * self.div_num * self.test_item])
        print('size dic:{},test:{}'.format(self.dictionary.shape, self.test_img.shape))
        # 3. mask x with a one-hot-like selection per class
        # 4. reconstruct from the masked sparse code and measure the distance to the original feature vector
        # train: 100 classes * 14 images * 16 blocks * 1200   test: 100 * 12 * 16 * 1200

        # 5. compute the distance for every class with steps 3-4; the class with the smallest distance is the result

Analysis

Sparse representation

First, a small experiment: representing a random vector with a random dictionary.

import numpy as np
from sklearn import linear_model

def dict_update(y, d, x, n_components):
    """
    K-SVD dictionary update step
    """
    for i in range(n_components):
        index = np.nonzero(x[i, :])[0]
        if len(index) == 0:
            continue
        # update the i-th atom
        d[:, i] = 0
        # residual matrix restricted to the signals that use atom i
        r = (y - np.dot(d, x))[:, index]
        # SVD of the residual gives the updated atom and coefficients
        u, s, v = np.linalg.svd(r, full_matrices=False)
        # the first left singular vector becomes the new atom
        d[:, i] = u[:, 0]
        # the first singular value times the first right singular vector
        # updates the corresponding sparse coefficients
        for j, k in enumerate(index):
            x[i, k] = s[0] * v[0, j]
    return d, x

if __name__ == '__main__':
    # 3x1 = 3x10 * 10x1
    n_comp = 10

    y = np.random.rand(3, 1)+1
    print(y)
    dic = np.random.rand(3, n_comp)
    print(dic)

    xx = linear_model.orthogonal_mp(dic, y)

    max_iter = 10
    dictionary = dic
    tolerance = 1e-6
    for i in range(max_iter):
        # sparse coding
        x = linear_model.orthogonal_mp(dictionary, y)
        if len(x.shape) == 1:
            x = x[:, np.newaxis]
        e = np.linalg.norm(y - np.dot(dictionary, x))
        print('e:',e)
        if e < tolerance:
            break
        dictionary,_ = dict_update(y, dictionary, x, n_comp)

    sparsecode = linear_model.orthogonal_mp(dictionary, y)
    print(sparsecode)

The output is:

[[1.66100631]
 [1.13394355]
 [1.39439597]]
[[0.46661241 0.36734721 0.20163223 0.26871968 0.21598456 0.05768749 0.72285521 0.05145567 0.05782816 0.55934381]
 [0.06112155 0.145628   0.53625588 0.9450094  0.65518972 0.2937214 0.75746413 0.1241625  0.00434304 0.61652593]
 [0.53914146 0.35131512 0.97329322 0.26000172 0.61166527 0.99942739 0.7813396  0.84670649 0.7177109  0.62133007]]
e: 1.755322357758388
e: 0.5079530627255521
e: 2.220446049250313e-16
[0.         0.         0.         0.         0.         0.
 2.44726583 0.         0.         0.        ]

Hmm, this is not what I expected. Is the iteration supposed to happen inside the dictionary first? (My guess: the K-SVD step keeps updating the atoms towards y, so in the end a single updated atom can represent y exactly and the final code has only one nonzero entry. In SRC itself the dictionary is fixed, being the training samples themselves, so only the sparse-coding step is run and no dict_update is needed.)

Another

Splitting an image into blocks, one column per block

import numpy as np


class Divide:
    def __init__(self, b_w, b_h):
        '''
        b_w: block width
        b_h: block height
        '''
        self.block_width = b_w
        self.block_height = b_h

    def encode(self, mat):
        (W, H) = mat.shape
        # (192, 168)->(24,21)
        w_len = int(W / self.block_width)
        h_len = int(H / self.block_height)
        res = np.zeros((self.block_width * self.block_height, w_len * h_len))
        for i in range(h_len):
            for j in range(w_len):
                temp = mat[j * self.block_width:(j + 1) * self.block_width,
                           i * self.block_height:(i + 1) * self.block_height]
                temp = temp.reshape(self.block_width * self.block_height)
                res[:, i * w_len + j] = temp
        return res

    def decode(self, mat, W, H):
        '''
        mat.shape should be ( block_width*block_height, ~ = 24*21 )
        '''
        w_len = int(W / self.block_width)
        h_len = int(H / self.block_height)
        mat = mat.reshape(self.block_width * self.block_height, w_len * h_len)
        
        res = np.zeros((W, H))
        for i in range(h_len):
            for j in range(w_len):
                temp = mat[:, i * w_len + j]
                temp = temp.reshape(self.block_width, self.block_height)
                res[j * self.block_width:(j + 1) * self.block_width,
                    i * self.block_height:(i + 1) * self.block_height] = temp
        return res
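A quick round-trip check of Divide (the sizes below are just an example):

import numpy as np

div = Divide(b_w=30, b_h=25)          # blocks of 30 rows by 25 columns
img = np.random.rand(60, 50)          # a 60x50 test "image"
cols = div.encode(img)                # shape (750, 4): four blocks, each flattened into a column
restored = div.decode(cols, 60, 50)   # inverse of encode
assert np.allclose(img, restored)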

`__main__`

if __name__ == '__main__':
    # 1. inspect the dataset characteristics and decide the block size (120, 165);
    #    use the unoccluded faces for training and the occluded faces for testing.

    # dataset = dataMaker('D:\\MINE_FILE\\dataSet\\AR', 'AR', use_num=40)
    # dataset = dataMaker('D:\\MINE_FILE\\dataSet\\YaleB', 'YaleB')
    dataset = dataMaker('./image/classed_pack/', 'classed_pack')

    # 2. build the dictionary with the SRC algorithm and classify the test set by block voting;
    src_algorithm = SRC(dataset, max_iter=100, tol=1e-5)

    # AR
    src_algorithm.makeDictionary(newShape=(60, 50), block=(1, 1))
    # src_algorithm.makeDictionary(newShape=(120, 160), block=(5, 5))

    # YaleB
    # src_algorithm.makeDictionary(newShape=(120, 160), block=(5, 5))

    src_algorithm.run()
    # 3. collect the classification results and compute the accuracy.