Category-Specific Object Image Denoising: Paper Reading Notes
Paper information:
Abstract
Present a novel image denoising algorithm that uses an external, category-specific image database.
I. Introduction
Goal of the paper:
Remove noise from images (or image regions) that depict a single object of a known class.
Applications:
Face image enhancement
Document image recovery
Digital heritage
Cell image analysis
Image aesthetics assessment
State-of-the-art techniques:
1. The use of internal image datasets for denoising
Exploit repetitive local patterns that frequently occur in natural images by selecting and grouping similar patches for collaborative denoising.
Representative work:
1) Non-local means
2) Collaborative filtering with block matching: BM3D
2. The use of external image datasets for denoising
(Omitted.)
There has been effort in utilising class-specific priors for image deblurring [25], [26], but these approaches are not directly applicable to denoising.
Contributions of the paper:
1) A strategy for finding similar external patches to a given noisy patch within the same object part, which we term "support patches" hereafter.
2) A formulation of the object category-specific patch denoising problem in a transform domain.
3) A Gaussian model of the membership likelihood to a support patch group for a noisy patch.
4) A low-rank constraint to enforce the similarity between the noisy patch and its support patches.
II. DENOISING PROBLEM FORMULATION
Noisy image model:
x: the true pixel value
y: the noisy value at the same pixel
n: additive Gaussian noise with standard deviation σn, so that y = x + n
Denoising goal:
We consider the problem of recovering the latent (true) image, given the noisy image of an object and a dataset of noise-free images in the same object category.
Notation:
The matrices X and Y represent the pixel values of the true and observed images.
The set of matrices {Zk : k = 1, ..., K} denotes the external dataset.
A. Support Patch Search
Steps:
1) Collect all the overlapping patches of the noisy image Y, and denote the intensity vector of the patch centered at the i-th pixel by yi, i = 1, ..., M. Likewise, xi denotes the patch intensity vector for the corresponding location in the latent image X.
2) Select a preset number (L) of external images that are structurally most similar to the noisy image, based on the structural similarity (SSIM) index.
3) From the l-th candidate image (l = 1, ..., L), obtain a pool Pi,l of patches that are similar to a given noisy patch yi. The local search window takes into account the difference in resolution and aspect ratio between the input and the candidate image.
4) Within each patch pool Pi,l, retain only the patches whose Euclidean distance from the input patch yi is below a threshold τ. Denote the resulting set of refined patches by Si,l.
5) Aggregate the refined patch pools Si,l across the candidate images and, within the resulting collection, perform a k-NN search for the patches most similar to yi. This yields a set of support patches {zi,j : j = 1, ..., Ti} resembling the noisy patch yi.
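Steps 4 and 5 above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `support_patches`, the representation of patches as plain lists of floats, and the toy data are all assumptions made for the example, and the SSIM-based image selection and search-window logic of steps 2 and 3 are taken as already done.

```python
import math

def support_patches(noisy_patch, pools, tau, k):
    """Return up to k external patches closest to noisy_patch.

    pools: one patch pool per candidate image (the P_{i,l} of step 3),
           each a list of patch intensity vectors.
    tau:   Euclidean distance threshold of step 4.
    k:     number of nearest neighbours kept in step 5.
    """
    refined = []
    for pool in pools:
        # Step 4: keep only patches closer than tau (the refined set S_{i,l}).
        for patch in pool:
            distance = math.dist(noisy_patch, patch)
            if distance < tau:
                refined.append((distance, patch))
    # Step 5: aggregate across candidate images, then keep the k nearest.
    refined.sort(key=lambda pair: pair[0])
    return [patch for _, patch in refined[:k]]

# Toy data: a 4-pixel noisy patch and two candidate-image pools (hypothetical).
y = [10.0, 12.0, 11.0, 9.0]
pools = [
    [[10.0, 12.0, 11.0, 10.0], [50.0, 50.0, 50.0, 50.0]],
    [[11.0, 12.0, 11.0, 9.0], [10.0, 14.0, 11.0, 9.0]],
]
print(support_patches(y, pools, tau=5.0, k=2))
```

A brute-force scan like this is quadratic in the number of patches, which is consistent with patch search dominating the run time reported later in the notes.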
Flowchart:
B. Transform Domain Formulation
We opt to represent local patches in a transform domain rather than in the patch intensity domain.
Reason: matching patches in the original space of patch intensity vectors is susceptible to a bias in the overall patch intensity, such as local illumination.
Procedure:
1) Subtract the mean patch intensity from the patch intensity vector before performing the domain transform.
2) Represent images and patches in a discrete transform domain; the DCT is used.
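The two operations above can be illustrated with a small pure-Python sketch. It assumes a square patch and an orthonormal 2-D DCT-II; the notes do not say which DCT variant or normalisation the paper uses, so that choice, like the helper names `dct_matrix` and `patch_to_dct`, is an assumption of this example.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    rows = []
    for u in range(n):
        scale = math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * x + 1) * u / (2 * n))
                     for x in range(n)])
    return rows

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def patch_to_dct(patch):
    """Subtract the patch mean, then apply a 2-D DCT to a square patch."""
    n = len(patch)
    mean = sum(sum(row) for row in patch) / (n * n)
    centred = [[v - mean for v in row] for row in patch]
    d = dct_matrix(n)
    coeffs = matmul(matmul(d, centred), transpose(d))
    return mean, coeffs

# The same structure under two illumination levels (toy values).
patch = [[52.0, 55.0], [61.0, 66.0]]
brighter = [[v + 40.0 for v in row] for row in patch]
_, c1 = patch_to_dct(patch)
_, c2 = patch_to_dct(brighter)
print(c1)
print(c2)
```

Because the mean is removed first, the two patches produce the same coefficients up to rounding, which is the illumination invariance the notes give as the reason for this step.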
C. Data Fidelity
Assuming the independence of individual pixel values, the conditional likelihood of the noisy image given the original (noise-free) image is:
Expressing the data fidelity in terms of the transform coefficients, we obtain:
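The paper's own equations are not reproduced in these notes. As a hedged reminder of the standard form such terms take, and not as a transcription of the paper, the following assumes additive i.i.d. Gaussian noise with standard deviation σn, an orthonormal transform written here as D, and ignores the mean-subtraction step. The symbols D and β_i are hypothetical names introduced for this sketch; α_i is the transform of x_i as used in the optimisation section.

```latex
% Assumed model: y = x + n, with n ~ N(0, \sigma_n^2) independently per pixel.
p(Y \mid X) \;\propto\; \exp\!\left( -\frac{\lVert Y - X \rVert_F^2}{2\sigma_n^2} \right)

% For an orthonormal transform D (hypothetical symbol), with
% \alpha_i = D x_i and \beta_i = D y_i (\beta_i is also hypothetical),
% distances are preserved, so the fidelity can be written on coefficients:
\lVert y_i - x_i \rVert_2^2 \;=\; \lVert D\,(y_i - x_i) \rVert_2^2 \;=\; \lVert \beta_i - \alpha_i \rVert_2^2
```

Any weights or normalisation constants in the paper's version may differ from this generic form.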
D. Support Patch Group Membership
Now we define an additional constraint that imposes the similarity between a noisy patch and those from an image dataset belonging to the same object category.
Consider the Ti support patches {zi,j : j = 1, ..., Ti}.
Let the transform coefficients of zi,j be {γi,j : j = 1, ..., Ti}.
The mean and covariance matrix of the Gaussian model are estimated from these transform coefficient vectors.
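A minimal NumPy sketch of such a Gaussian membership model is given below. The function names, the synthetic coefficient vectors, and in particular the `ridge` term added to the covariance diagonal are assumptions of this example: with only a handful of support patches the sample covariance is typically singular, and the notes do not say how the paper handles that.

```python
import numpy as np

def fit_support_gaussian(gammas, ridge=1e-3):
    """Estimate mean and covariance from support-patch coefficients.

    gammas: array of shape (T_i, d), one transform coefficient vector per row.
    ridge:  hypothetical regulariser added to the diagonal so the covariance
            stays invertible when T_i is small relative to d.
    """
    gammas = np.asarray(gammas, dtype=float)
    mean = gammas.mean(axis=0)
    centred = gammas - mean
    cov = centred.T @ centred / max(len(gammas) - 1, 1)
    cov += ridge * np.eye(gammas.shape[1])
    return mean, cov

def membership_cost(alpha, mean, cov):
    """Squared Mahalanobis distance of alpha from the group.

    This equals the negative Gaussian log-likelihood of alpha up to a
    factor of 1/2 and a constant that does not depend on alpha.
    """
    diff = np.asarray(alpha, dtype=float) - mean
    return float(diff @ np.linalg.solve(cov, diff))

# Synthetic coefficient vectors standing in for 16 support patches.
rng = np.random.default_rng(0)
gammas = rng.normal(loc=[1.0, -2.0, 0.5], scale=0.1, size=(16, 3))
mean, cov = fit_support_gaussian(gammas)
near = membership_cost([1.0, -2.0, 0.5], mean, cov)
far = membership_cost([3.0, 0.0, 2.0], mean, cov)
print(near, far)
```

A coefficient vector close to the support group gets a much lower cost than a distant one, which is how the term pulls the estimate towards the group.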
E. Low-Rank Constraint
We further formulate a low-rank constraint concerning a noisy patch and its support patches.
When similar patch vectors are stacked as columns of a matrix, the matrix should exhibit the low rank property and have sparse singular values.
However, the rank minimisation problem is NP-hard, and thus is intractable to solve directly.
Under suitable conditions, the low-rank approximating matrix can be recovered exactly by solving the nuclear norm minimisation (NNM) problem.
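One standard building block for NNM is singular value thresholding, which solves the nuclear-norm-regularised least-squares problem in closed form by soft-thresholding the singular values. The sketch below shows that operator on a synthetic patch group; whether the paper uses exactly this solver, or a weighted variant, is not stated in these notes, and the data and the value of `lam` are made up for the example.

```python
import numpy as np

def singular_value_threshold(a, lam):
    """Closed-form minimiser of 0.5 * ||M - a||_F^2 + lam * ||M||_*."""
    u, s, vt = np.linalg.svd(a, full_matrices=False)
    s_shrunk = np.maximum(s - lam, 0.0)  # soft-threshold the singular values
    return (u * s_shrunk) @ vt

rng = np.random.default_rng(0)
base = rng.normal(size=(64, 1))
# Columns: eight near-copies of one patch vector, mimicking a noisy patch
# stacked with its support patches.
group = base @ np.ones((1, 8)) + 0.05 * rng.normal(size=(64, 8))
low_rank = singular_value_threshold(group, lam=1.0)
print(np.linalg.matrix_rank(group), np.linalg.matrix_rank(low_rank))
```

The small singular values, which carry mostly the perturbation between columns, are set to zero, while the dominant one is only shrunk, so the output has lower rank than the input.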
III. OPTIMISATION
A. Patch Denoising
We can minimise the overall objective function in Equation 5 by minimising each of the terms Li independently.
For a patch xi, we then minimise Li with respect to the transform αi and the variable Mi.
Auxiliary variable:
We employ an iterative procedure to minimise the cost function in Equation 7. Each iteration involves an alternating optimization scheme with respect to either αi or Mi, while fixing the other.
B. Recovering Latent Image
We apply the iterative input regularisation technique:
C. Algorithm Implementation
IV. EXPERIMENTS
A. Datasets and Parameter Settings
Datasets:
CMU PIE face dataset [32],
Car dataset [33],
Cat dataset [34],
Gore face dataset [35],
Multiview dataset [36]
For each dataset, we randomly selected half of the images to form a category-specific dataset and between 10 and 15 images from the remaining half as ground-truth images for denoising.
We have disjoint image sets for the test and training.
Generate noisy images:
Corrupt the test images by additive white Gaussian noise with standard deviations (std) of σn = 30, 50, 70, 100.
Patch size: 8
Number of candidate images: L = 16
Search window size: 51 × 51
Number of nearest neighbors: k = 16
λ0 = 1, λ1 = 0.5, λ2 = 10 and ρ = 0.18
Evaluation:
PSNR
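The noise corruption and the PSNR metric used in this protocol can be sketched in plain Python as follows. The flat grey test image, the fixed seed, the peak value of 255 and the absence of clipping are assumptions of this example rather than details taken from the paper.

```python
import math
import random

def add_gaussian_noise(image, sigma, seed=0):
    """Add zero-mean white Gaussian noise with standard deviation sigma.

    No clipping to [0, 255] is applied; that is a choice of this sketch.
    """
    rng = random.Random(seed)
    return [[v + rng.gauss(0.0, sigma) for v in row] for row in image]

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    count = sum(len(row) for row in reference)
    squared_error = sum(
        (r - e) ** 2
        for ref_row, est_row in zip(reference, estimate)
        for r, e in zip(ref_row, est_row)
    )
    mse = squared_error / count
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)

# PSNR of the noisy input itself, i.e. the baseline a denoiser must beat.
clean = [[128.0] * 64 for _ in range(64)]
for sigma in (30, 50, 70, 100):
    noisy = add_gaussian_noise(clean, sigma)
    print(sigma, round(psnr(clean, noisy), 2))
```

For unclipped noise the input PSNR is close to 20 log10(255 / σn), which is roughly 18.6 dB at σn = 30, so reported gains should be read relative to that starting point.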
Compare our proposed method with:
BM3D [4],
WNNM [38],
NLM [3],
SAPCA [5],
TSID [9],
EPLL [18],
PCLR [19],
PGPD [20],
TID [23]
B. Influence of External Dataset Size
Our algorithm is robust to the dataset size: increasing the dataset size only slightly improves the denoising accuracy.
C. Influence of the Number of Support Patches
The average PSNR declines as the number of support patches increases.
D. Relative Importance of Priors
We assess the relative contribution of the Gaussian prior and the low-rank term on the Gore dataset for σ = 50.
The presence of both terms improves the PSNR compared to the scenario where one is absent.
E. Run-Time Comparisons
We observe that our method spends most of its time on patch search. The speed of our algorithm can be improved by applying fast patch search algorithms, e.g., KD tree [40] and PatchMatch [41], [42].
In addition, GPU implementations can be employed to parallelise the denoising of patches in independent threads.
F. Role of External Image Category
The PSNR for each noisy image category reaches its maximum when the dataset belongs to the same category.
G. Sensitivity to Pose Variations
H. Comparisons With Internal Denoising Methods
The proposed algorithm can restore high-frequency details with a closer resemblance to the ground truth than the existing internal denoising methods.
Specifically, the highly-textured pattern is clearly reproduced by our method.
I. Comparisons With External Denoising Methods
J. Robustness to Misalignment and Rotation
K. Extension to Color Images
Let Y denote the luminance channel, and U and V denote the chrominance channels.
We specifically deal with the high noise variance in the Y channel with our method, while simply applying BM3D to the chrominance channels.
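The channel split described above can be sketched as below. The notes do not say which YUV definition the paper uses, so the BT.601-style coefficients here are an assumption, and `denoise_luma` and `denoise_chroma` are hypothetical placeholders for the proposed method and for BM3D respectively; neither denoiser is implemented in this example.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel to YUV (BT.601-style weights, an assumption)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    return y, 0.492 * (b - y), 0.877 * (r - y)

def yuv_to_rgb(y, u, v):
    """Exact inverse of rgb_to_yuv."""
    r = y + v / 0.877
    b = y + u / 0.492
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b

def denoise_colour(pixels, denoise_luma, denoise_chroma):
    """Denoise luminance and chrominance channels with separate methods.

    pixels: list of (r, g, b) tuples.
    denoise_luma, denoise_chroma: callables mapping a list of floats to a
        list of floats of the same length (hypothetical placeholders).
    """
    ys, us, vs = zip(*(rgb_to_yuv(*p) for p in pixels))
    ys = denoise_luma(list(ys))
    us = denoise_chroma(list(us))
    vs = denoise_chroma(list(vs))
    return [yuv_to_rgb(y, u, v) for y, u, v in zip(ys, us, vs)]

# With identity "denoisers" the pipeline is a pure round trip.
identity = lambda channel: channel
pixels = [(10.0, 20.0, 30.0), (200.0, 100.0, 50.0)]
print(denoise_colour(pixels, identity, identity))
```

Keeping the two denoisers as parameters mirrors the design in the notes: the more expensive category-specific method is spent only on the luminance channel.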
V. CONCLUSION AND FUTURE WORK
The key difference from existing external denoising methods is the formulation of the denoising problem in a transform domain.
An important question that requires more discussion is the behaviour and sensitivity of the algorithm under even larger variations in pose, facial expression, size, style, and view angle.