光谱重建In Defense of Shallow Learned Spectral Reconstruction from RGB Images， svd和k-svd

tony365

已于 2024-04-25 11:29:46 修改

阅读量796

点赞数 15

分类专栏：光谱重建文章标签： pytorch 计算机视觉

于 2024-04-25 11:17:09 首次发布

本文链接：https://blog.csdn.net/tywwwww/article/details/138183537

版权

光谱重建专栏收录该内容

2 篇文章 1 订阅

订阅专栏

文章目录

1.光谱重建论文：In Defense of Shallow Learned Spectral Reconstruction from RGB Images
2.奇异值分解SVD
3.字典学习k-svd

1.光谱重建论文：In Defense of Shallow Learned Spectral Reconstruction from RGB Images

1.1Anchored Neighborhood Regression for Fast Example-Based Super-Resolution

A+算法介绍

1.1.1首先介绍下Neighbor embedding approaches（NE）

首先有一个数据库，数据库包含低分辨率图像的像素或特征，和对应高分辨率图像的像素或者提取的特征
输入低分辨率的patch，提取特征，找到数据库中的K近邻，并根据距离计算weight
将weight应用到对应的高分辨率的patch, 加权得到高分辨率的patch
主要就是K近邻插值

1.1.2.然后介绍下 Sparse coding approaches（SC）

一般情况下使用K-SVD方法建立低分辨率字典和对应的高分辨率字典
输入低分辨率图像，然后OMP方法计算低分辨率字典的weight
weight应用到低分辨率字典对应的高分辨率字典，得到高分辨率图像

和NE方法是类似的，只不过从直接由样本表示变为了用字典表示。

1.1.3.提出的方法

公式3表示输入的低分辨率patch yF, 用附近的邻域 patch NL 加权表示
加权系数有公式4得到
高分辨率patch 则有公式5得到。

同理如果不是直接用example-based patch, 而是用字典，就是公式6和7

那么实际使用的时候是怎样的流程呢？
训练的时候，

首先K-SVD计算低分辨率字典和高分辨率字典
然后每个字典原子的K个邻域构建Dh, Dl, 并最终计算为PG（根据公式7）

在测试的时候，

输入低分辨率patch 找到最相近的字典原子，文章中只找一个原子dj
根据dj查找对应的PG
PG与低分辨率patch 相乘得到高分辨率patch(根据公式6)

在这里插入图片描述

4.patch 的特征选择有很多种

选取patch的特征：
减去均值，除以标准差
一阶和二阶导数
一阶和二阶导数，并应用PCA降维

normalized HR patches 通过 HR - bicubic LR 得到

1.2 论文Sparse recovery of hyperspectral signal from natural rgb images.

本文是对论文B. Arad and O. Ben-Shahar. Sparse recovery of hyperspectral signal from natural rgb images. In European Conference on Computer Vision, pages 19–34. Springer, 2016.的改进
Arad的论文的主要方法如下图：
在这里插入图片描述

1.3.本论文的方法

主要是结合论文 Sparse recovery of hyperspectral signal from natural rgb images
和论文 Anchored Neighborhood Regression for Fast Example-Based Super-Resolution

训练的时候：

k-svd得到 hsr dict
hsr data 和 hsr dict 通过 lsr projection(比如cie1964 color matching function) 转换为 lsr data和 lsr dict
normalize , 为每个dict atom 找到 c 个nearest neighbors，并计算论文中的P

测试的时候
每个rgb pixel找到最近的dict atom, 和对应的 P相乘

在这里插入图片描述

2.奇异值分解SVD

https://www.cnblogs.com/endlesscoding/p/10033527.html

概念

在这里插入图片描述

示例

在这里插入图片描述

code


"""
熟悉和了解svd的使用：https://www.cnblogs.com/endlesscoding/p/10033527.html
"""
import cv2
import numpy as np
from matplotlib import pyplot as plt


def svd_sample():
    mat = np.array([[1,5,7,6,1],
                    [2,1,10,4,4],
                    [3,6,7,5,2]])
    U,Sigma,VT = np.linalg.svd(mat)
    print('U: ',U)
    print('VT: ', VT)
    print('Sigma: ', Sigma) # 特征值可以看作反应矩阵信息内容的大小

    print("U是列单位正交矩阵，U@UT:", np.round(U@U.T,2))
    print("VT是行单位正交矩阵，V@VT:", np.round(VT.T @ VT,2))

def svd_sample_img(file, mode=1):
    """
    对图像进行svd分解 和 重建
    mode=0, 读取灰度图
    mode=1, 读取彩色图
    :return:
    """
    img = cv2.imread(file, mode)
    if mode == 1:
        img = img[..., ::-1]
    s = img.shape
    img2 = img.reshape(s[0], -1)

    # 奇异值分解
    U, sigma, VT = np.linalg.svd(img2)
    print(U.shape, VT.shape, sigma.shape)
    # 取前60个奇异值
    sval_nums = 60
    img_restruct1 = (U[:, 0:sval_nums]).dot(np.diag(sigma[0:sval_nums])).dot(VT[0:sval_nums, :])
    img_restruct1 = img_restruct1.reshape(*img.shape)
    img_restruct1 = np.clip(img_restruct1, 0, 255).astype(np.uint8)
    # 取前120个奇异值
    sval_nums = 120
    img_restruct2 = (U[:, 0:sval_nums]).dot(np.diag(sigma[0:sval_nums])).dot(VT[0:sval_nums, :])
    img_restruct2 = img_restruct2.reshape(*img.shape)
    img_restruct2 = np.clip(img_restruct2, 0, 255).astype(np.uint8)
    # 将图片显示出来看一下，对比下效果
    fig, ax = plt.subplots(1, 3, figsize=(24, 32))
    if mode:
        ax[0].imshow(img)
        ax[0].set(title="src")
        ax[1].imshow(img_restruct1)
        ax[1].set(title="nums of sigma = 60")
        ax[2].imshow(img_restruct2)
        ax[2].set(title="nums of sigma = 120")
        plt.show()
    else:
        ax[0].imshow(img, cmap='gray')
        ax[0].set(title="src")
        ax[1].imshow(img_restruct1, cmap='gray')
        ax[1].set(title="nums of sigma = 60")
        ax[2].imshow(img_restruct2, cmap='gray')
        ax[2].set(title="nums of sigma = 120")
        plt.show()
def svd_and_pca():
    print("两者具有比较大的相关性：https://zhuanlan.zhihu.com/p/58064462")

if __name__ == "__main__":
    # svd_sample()
    file = r'C:\20240410_000098_0100.png'
   
    svd_sample_img(file, 1)

运行结果：
在这里插入图片描述

3.字典学习k-svd

概念

如下图，样本数目为N, 字典word数目为K，每个样本的特征数目和每个字典word的特征维度都是n。要求K大于n,即字典的单词数要足够多。

然后X是稀疏矩阵，每一列表示 K个word的 weight。稀疏就是说这个k个数，只有比较少的数不为0.

总之就是，用比较少的几个字典word 加权就可以很好的表示样本。
假如只用一个word来表示样本，有点类似于kmeans了，因为kmeans就是无监督算法，利用一个中心来表示每个样本。这里k-svd就是利用几个字典word 加权表示每个样本。
在这里插入图片描述

稀疏表示模型

在这里插入图片描述

以上三个稀疏模型。

代码

算法流程：
在这里插入图片描述

上图算法流程对应的字典更新代码：

    def _update_dict(self, y, d, x):
        """
        使用KSVD更新字典的过程
        """
        for i in range(self.n_components):
            index = np.nonzero(x[i, :])[0]
            if len(index) == 0:
                continue
            d[:, i] = 0
            r = (y - np.dot(d, x))[:, index]
            u, s, v = np.linalg.svd(r, full_matrices=False)
            d[:, i] = u[:, 0].T
            x[i, index] = s[0] * v[0, :]
        return d, x

问题RuntimeWarning: Orthogonal matching pursuit ended prematurely due to linear dependence in the dictionary. The requested precision might not have been met. out = _cholesky_omp

1.了解问题背景 OMP算法(Orthogonal Matching Pursuit)是一种迭代稀疏表示算法，用于解决大量线性方程组的求解问题。它的核心思想是，在给定的一组字典D中，找到最少的原子使得它们的线性组合能够近似表示某一个向量。

2.理解报错原因出现RuntimeWarning提示是由于字典中出现了线性相关性，导致OMP算法提前结束。也就是说，在字典D中存在某些原子可以用其他原子的线性组合表示出来，这时候就出现了线性相关性。

3.解决方法我们可以通过以下两种方法来解决该问题：
(1) 对字典D进行处理，去除其中的线性相关原子。
方法很简单，在OMP算法之前，在字典D中添加一步去除线性相关原子的步骤，可以保证字典D中不存在线性相关原子，例如：

import numpy as np from sklearn 
import linear_model 
def remove_linear_dependence(D): 
    U, s, V = np.linalg.svd(D) 
    tol = s.max()*max(D.shape)*np.finfo(s.dtype).eps 
    independent_indices = np.where(s>tol)[0] 
    D_independent = D[:, independent_indices] 
    return D_independent 

# 原始字典 
D_raw = np.array([[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1]]) print(D_raw) 
# 去除线性相关原子后的字典 
D_independent = remove_linear_dependence(D_raw) 
print(D_independent)

(2) 调整OMP算法的参数n_nonzero_coefs，降低精度要求。
在sklearn中，n_nonzero_coefs是OMP算法中的一个参数，表示最终稀疏表示中非零系数的最大数量。我们可以通过降低n_nonzero_coefs的取值，降低精度要求，以防止线性相关性出现。例如：

import numpy as np 
from sklearn import linear_model 
# 数据和字典 
y = np.array([1, 2, 3, 4]) 
D = np.array([[1, 2], [3, 4], [5, 6], [7, 8]]) 
# 调整n_nonzero_coefs参数为1 
omp = linear_model.OrthogonalMatchingPursuit(n_nonzero_coefs=1) 
omp.fit(D, y) 
# 输出系数和拟合结果 
coef = omp.coef_ 
print(coef) 
print(np.dot(D, coef))

4.深入理解问题原因
线性相关性是数学中的基本概念之一，指的是存在一组向量可以用其他向量的线性组合来表示，从而导致某些向量不是独立的。
线性相关性的存在会影响OMP算法的稀疏表示结果。在字典D中存在线性相关原子时，OMP算法有可能会选择其中的某个原子，而忽略其他几个原子。
这样会导致稀疏表示的精度降低，甚至无法满足精度要求。
因此，我们需要去除字典D中的线性相关原子，或者降低n_nonzero_coefs参数的取值，来解决该问题。
需要注意的是，字典D的选择和构建对OMP算法的稀疏表示结果有非常大的影响。合理的字典可以提高算法的精度和可靠性。

示例k-svd处理图像patch, 填充像素或者去噪

spectral_svd01.py

 #coding:UTF-8
import numpy as np
from sklearn import linear_model
import cv2


class KSVD(object):
    def __init__(self, n_components=256, max_iter=30, tol=1e-6,
                 n_nonzero_coefs=None):
        """
        稀疏模型Y = DX，Y为样本矩阵，使用KSVD动态更新字典矩阵D和稀疏矩阵X
        :param n_components: 字典所含原子个数（字典的列数）
        :param max_iter: 最大迭代次数
        :param tol: 稀疏表示结果的容差
        :param n_nonzero_coefs: 稀疏度
        """
        self.dictionary = None
        self.sparsecode = None
        self.max_iter = max_iter
        self.tol = tol
        self.n_components = n_components
        self.n_nonzero_coefs = n_nonzero_coefs

    def _initialize(self, y):
        """
        用随机二阶单位范数初始化字典矩阵
        """
        # 字典的形状
        shape=[y.shape[0],self.n_components] # 64, n_components个word
        #对每一列归一化为L2-norm
        self.dictionary = np.random.random(shape)
        for i in range(shape[1]):
            self.dictionary[:, i]=self.dictionary[:, i]/np.linalg.norm(self.dictionary[:, i])

    def _update_dict(self, y, d, x):
        """
        使用KSVD更新字典的过程
        """
        for i in range(self.n_components):
            index = np.nonzero(x[i, :])[0]
            if len(index) == 0:
                continue
            d[:, i] = 0
            r = (y - np.dot(d, x))[:, index]
            u, s, v = np.linalg.svd(r, full_matrices=False)
            d[:, i] = u[:, 0].T
            x[i, index] = s[0] * v[0, :]
        return d, x


    def fit(self, img):
        """
        KSVD迭代过程
        """
        # img的长宽一样，且是8的倍数
        shape = img.shape
        ps = 8
        grid = shape[0] // ps
        #将图像按8*8的块转化列向量，合起来成为64*1024的矩阵
        #img保存原始图像转化的矩阵，y用于保存img减去列均值后的矩阵
        y=np.zeros((ps*ps, grid*grid))
        img_reshape=np.zeros((ps*ps, grid*grid))

        patch_num=grid**2
        for patch_index in range(patch_num):
            #按先行后列，grid*grid个8*8的小块并装换为列向量
            r=(patch_index//grid)*8
            c=(patch_index%grid)*8
            patch=img[r:r+8, c:c+8].flat
            normalize=np.linalg.norm(patch)
            mean=np.sum(patch)/64
            #print mean
            img_reshape[:, patch_index]=patch
            #y[:, patch_index]=(patch/mean)
            y[:, patch_index]=(patch-mean*np.ones(64))/normalize # 减去均值，除以二范数
        # y是 64 x 1600: 1600个样本，每个样本64dim
        #字典初始化
        self._initialize(y)
        # 迭代求解dict
        for i in range(self.max_iter):
            #linear_model.orthogonal_mp 用法详见：
            #http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.orthogonal_mp.html
            x = linear_model.orthogonal_mp(self.dictionary, y, n_nonzero_coefs=self.n_nonzero_coefs)#OMP
            e = np.linalg.norm(y- np.dot(self.dictionary, x))
            print( '第%s次迭代，误差为：%s' %(i, e) )
            if e < self.tol:
                break
            self._update_dict(y, self.dictionary, x)

        # 重建训练图片
        src_rec=np.zeros(img.shape)
        for patch_index in range(patch_num):
            x = linear_model.orthogonal_mp(self.dictionary, y[:, patch_index], n_nonzero_coefs=self.n_nonzero_coefs)
            #x = linear_model.orthogonal_mp(self.dictionary, img_reshape[:, patch_index], n_nonzero_coefs=self.n_nonzero_coefs)
            nomalize=np.linalg.norm(img_reshape[:, patch_index])
            mean=np.sum(img_reshape[:, patch_index])/64
            #patch=np.dot(self.dictionary, x)+mean*np.ones(64)
            patch=np.dot(self.dictionary, x)*nomalize+mean*np.ones(64)  # normalize和mean也可以在矩阵中做，不必要for循环
            r=(patch_index//grid)*8
            c=(patch_index%grid)*8
            src_rec[r:r+8, c:c+8]=patch.reshape((8,8))

        return self.dictionary, src_rec

    def missing_pixel_reconstruct(self, img):
        shape = img.shape
        ps = 8
        grid = shape[0] // ps
        img_patchs=img_to_patch(img)
        patch_num=img_patchs.shape[1]
        #patch_dim=img_patchs.shape[0]
        for i in range(patch_num):
            img_col=img_patchs[:, i]

            ##################### pixel miss
            # index = np.nonzero(img_col)[0]
            # #对每列去掉丢失的像素值后求平均、二阶范数，将其归一化
            # l2norm=np.linalg.norm(img_col[index])
            # mean=np.sum(img_col)/index.shape[0]
            # img_col_norm=(img_col-mean)/l2norm

            # x = linear_model.orthogonal_mp(self.dictionary[index, :], img_col_norm[index].T, n_nonzero_coefs=self.n_nonzero_coefs)
            # img_patchs[:, i]=(self.dictionary.dot(x)*l2norm)+mean

            #########################pixel noise
            img_col_norm = (img_col - img_col.mean()) / np.linalg.norm(img_col)
            x = linear_model.orthogonal_mp(self.dictionary, img_col_norm.T, n_nonzero_coefs=self.n_nonzero_coefs)
            img_patchs[:, i]=(self.dictionary.dot(x)*np.linalg.norm(img_col))+img_col.mean()

        return patch_to_img(img_patchs)


def pixel_miss(ori, per=0.3):
    img = ori.copy()
    shape = img.shape
    #rand=np.random.random(shape)
    n=int(per*shape[0]*shape[1])
    for i in range(n):
        rand_r=int(np.random.random()*shape[0])
        rand_c=int(np.random.random()*shape[1])
        img[rand_r, rand_c]=0
    return img

#高斯噪点
def Gauss_noise(ori,sigma=20):
    #sigma=np.sqrt(sigma)
    img=ori.copy().astype(np.float64)
    shape=img.shape
    img=img+(np.matlib.randn(shape))*sigma
    return img.astype(np.uint8)

#计算PSNR值
def psnr(A, B):
    if (A==B).all(): return 0
    return 10*np.log10(255*255.0/(((A.astype(np.float32)-B)**2).mean()))

#将8*8块为列向量的矩阵还原为原矩阵
def patch_to_img(patchs):
    patch_num=patchs.shape[1]
    size=np.sqrt(patch_num).astype(np.int32)
    patch_size=np.sqrt(patchs.shape[0]).astype(np.int32)
    img=np.zeros((patch_size*size, patch_size*size))
    for i in range(patch_num):
        r=(i//size)*8
        c=(i%size)*8
        img[r:r+8, c:c+8]=patchs[:, i].reshape((8, 8))
    return img

#将图像分割为8*8块作为列向量
def img_to_patch(img):
    shape = img.shape
    ps = 8
    grid = shape[0] // ps
    patchs=np.zeros((8*8, grid*grid))
    blocks_r=img.shape[0]//8
    blocks_c=img.shape[1]//8
    patch_num=blocks_r*blocks_c
    for i in range(patchs.shape[1]):
        #按先行后列，将图片分解成32*32个8*8的小块并装换为列向量
        r=(i//blocks_r)*8
        c=(i%blocks_c)*8
        patch=img[r:r+8, c:c+8].flat
        patchs[:, i]=patch
    return patchs

"""
一个图像去噪的示例
"""
if __name__ == "__main__":
    file1 = r'C:\calib_dataset_pair_v1000_6946\20240410_000098_0100.png'
    file2 = r'C:\calib_dataset_pair_v1000_6946\20240410_000098_0100_mean_gt1.png'
    # file = r'C:\calib_dataset_pair_v1000_6946\20240410_000034_0100.png'
    # file = r'C:\calib_dataset_pair_v1000_6946\20240410_000034_0100_mean_gt1.png'

    #读入原图
    ori = cv2.imread(file1, 0).astype(np.float32)
    #像素丢失后的图
    #img=pixel_miss(ori)
    img = ori.copy()
    #训练字典所用的图
    train = cv2.imread(file2,0).astype(np.float32)

    #展示原图、破坏图、训练用图
    cv2.namedWindow("Original")
    cv2.imshow("Original",ori.astype(np.uint8))

    cv2.namedWindow("Destory")
    cv2.imshow("Destory",img.astype(np.uint8))
    print('破坏像素点后图像PSNR值：', psnr(ori, train))

    cv2.namedWindow("Train")
    cv2.imshow("Train", train.astype(np.uint8))

    #最大迭代次数设为80
    ksvd = KSVD(max_iter=80)
    dictionary, src_rec = ksvd.fit(train)

    #按块展示字典
    cv2.namedWindow("Dictionary")
    dictionary=dictionary-np.amin(dictionary)
    dictionary=dictionary/np.amax(dictionary)
    cv2.imshow("Dictionary", patch_to_img(dictionary))#.astype(np.uint8))

    #用训练得到的字典还原图像
    cv2.namedWindow("K-SVD_Rec")
    img=ksvd.missing_pixel_reconstruct(img)
    cv2.imshow("K-SVD_Rec", img.clip(0,255).astype(np.uint8))
    print('利用训练获得的字典重构图像后PSNR值：', psnr(train, img))

    #用训练得到的字典还原训练用图
    cv2.namedWindow("K-SVD")
    cv2.imshow("K-SVD",src_rec.clip(0,255).astype(np.uint8))
    print('用字典重构训练集合信号PSNR：', psnr(train,src_rec))

    cv2.waitKey(0)

K-SVD独立代码

import cv2
import numpy as np
from sklearn import linear_model
from sklearn.preprocessing import normalize
import scipy.misc
from matplotlib import pyplot as plt
import random

"""
代码参考：https://blog.csdn.net/ouyangfushu/article/details/88978386
note:
y ：样本    m x n,  每个样本m个维度，n个样本
D : 字典    m x K,  每个字典词n个维度，K个字典词
X : 稀疏稀疏 K x n, 表示n个样本的 稀疏，每列就是每个样本对应的K个字典词的系数
"""
class KSVD(object):
    def __init__(self, k, max_iter=30, tol=1e-6,
                 n_nonzero_coefs=None):
        """
        稀疏模型Y = DX，Y为样本矩阵，使用KSVD动态更新字典矩阵D和稀疏矩阵X
        :param n_components: 字典所含原子个数（字典的列数）
        :param max_iter: 最大迭代次数
        :param tol: 稀疏表示结果的容差
        :param n_nonzero_coefs: 稀疏度
        """
        self.dictionary = None
        self.sparsecodeX = None
        self.max_iter = max_iter
        self.sigma = tol
        self.k_components = k
        self.n_nonzero_coefs = n_nonzero_coefs

    def _initialize(self, y):

        # u, s, v = np.linalg.svd(y)
        # self.dictionary = u[:, :self.k_components]
        """
        随机选取样本集y中n_components个样本,并做L2归一化
        #  """
        ids = np.arange(y.shape[1])  # 获得列索引数组
        select_ids = random.sample(ids, self.k_components)  # 随机选取k_components个样本的id,k-svd之K
        mid_dic = y[:, np.array(select_ids)]  # 数组切片提取出k个样本
        self.dictionary = normalize(mid_dic, axis=0, norm='l2')  # 每一列做L2归一化
        print('dictionary shape:', self.dictionary.shape)

    def _update_dict(self, y, d, x):
        """
        使用KSVD更新字典的过程
        """
        for i in range(self.k_components):
            index = np.nonzero(x[i, :])[0]  # 非零项索引数组
            if len(index) == 0:
                continue

            d[:, i] = 0
            r = (y - np.dot(d, x))[:, index]  # 获取非零项对用id的列
            u, s, v = np.linalg.svd(r, full_matrices=False)  # SVD分解
            d[:, i] = u[:, 0]
            x[i, index] = s[0] * v[0, :]
        return d, x

    def fit(self, y):
        """
        KSVD迭代过程
        """
        self._initialize(y)
        for i in range(self.max_iter):
            x = linear_model.orthogonal_mp(self.dictionary, y, n_nonzero_coefs=self.n_nonzero_coefs)
            e = np.linalg.norm(y - np.dot(self.dictionary, x))
            print( '第%s次迭代，误差为：%s' %(i, e) )
            if e < self.sigma:
                break
            self._update_dict(y, self.dictionary, x)

        self.sparsecodeX = linear_model.orthogonal_mp(self.dictionary, y, n_nonzero_coefs=self.n_nonzero_coefs)
        return self.dictionary, self.sparsecodeX

    def predict(self, yy):
        x = linear_model.orthogonal_mp(self.dictionary, yy, n_nonzero_coefs=self.n_nonzero_coefs)
        return np.dot(self.dictionary, x)