Category-Specific Object Image Denoising: Paper Reading Notes

The paper presents a novel image denoising algorithm that uses an external, category-specific image database.

I. Introduction


Remove noise from images (or image regions) that depict a single object of a known class.


Face image enhancement
Document image recovery
Digital heritage
Cell image analysis
Image aesthetics

State-of-the-art techniques:

1. The use of internal image datasets for denoising

 Exploit repetitive local patterns that frequently occur in natural images, by selecting and grouping similar patches for collaborative denoising.

       1) Non-local means:
       2) Collaborative filtering with block matching:

2. The use of external image datasets for denoising

There has been effort in utilising class-specific priors for image deblurring [25], [26], but these approaches are not directly applicable to denoising.



    1) A strategy for finding external patches similar to a given noisy patch within the same object part, which we term "support patches" hereafter.
    2) A formulation of the object category-specific patch denoising problem in a transform domain.
    3) A Gaussian model of the membership likelihood to a support patch group for a noisy patch.
    4) A low-rank constraint to enforce the similarity between the noisy patch and its support patches.



x: the true pixel value
y: the noisy value at the same pixel

Noise: Gaussian, with standard deviation σn
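These definitions match the usual additive noise model, sketched below. The noise symbol n is an assumption, since the notes do not name it; the additive form follows from the experimental setup, which corrupts test images with additive white Gaussian noise.

```latex
y = x + n, \qquad n \sim \mathcal{N}(0, \sigma_n^2)
```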

We consider the problem of recovering the latent (true) image, given the noisy image of an object and a dataset of noise-free images in the same object category.

The matrices X and Y represent the pixel values of the true and observed images.
The set of matrices {Zk : k = 1, ..., K} denotes the external dataset.


A. Support Patch Search


1) Collect all the overlapping patches of the noisy image Y, and denote the intensity vector of the patch centered at the i-th pixel by yi, i = 1, ..., M. Likewise, xi denotes the patch intensity vector at the corresponding location in the latent image X.

2) Select a preset number (L) of external images that are structurally most similar to the noisy image, based on the structural similarity (SSIM) index.

3) From the l-th candidate image (l = 1, ..., L), obtain a pool Pi,l of patches that are similar to a given noisy patch yi. When determining the local search window, take into account the difference in resolution and aspect ratio between the input and the candidate image.

4) Within each patch pool Pi,l, retain only those patches whose Euclidean distance from the input patch yi is below a threshold τ. Denote the resulting set of refined patches by Si,l.

5) Aggregate the refined patch pools Si,l across the candidate images. Within the resulting collection, perform a k-NN search for the patches most similar to yi. In the end, we obtain a set of Ti support patches {zi,j : j = 1, ..., Ti} resembling the noisy patch yi.
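The thresholding and k-NN stages of the search can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names `extract_patches` and `support_patches` are hypothetical, each pool is assumed to be already gathered from its search window, and the SSIM-based image selection is left out.

```python
import numpy as np

def extract_patches(image, size):
    """Return all overlapping size x size patches of a 2-D image as row vectors."""
    h, w = image.shape
    patches = [
        image[r:r + size, c:c + size].ravel()
        for r in range(h - size + 1)
        for c in range(w - size + 1)
    ]
    return np.array(patches, dtype=float)

def support_patches(noisy_patch, candidate_pools, tau, k):
    """Threshold each pool by Euclidean distance, merge the survivors, then k-NN.

    candidate_pools: list of (n_l, d) arrays, one pool P_{i,l} per candidate image.
    Returns at most k rows: the support patches for noisy_patch.
    """
    refined = []
    for pool in candidate_pools:
        dist = np.linalg.norm(pool - noisy_patch, axis=1)
        refined.append(pool[dist < tau])          # S_{i,l}: patches closer than tau
    merged = np.vstack(refined)                   # aggregate across candidate images
    dist = np.linalg.norm(merged - noisy_patch, axis=1)
    nearest = np.argsort(dist)[:k]                # indices of the k closest patches
    return merged[nearest]
```

The number of returned patches can be smaller than k when few patches pass the threshold, which is consistent with the per-patch count Ti in the notes.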



B. Transform Domain Formulation

In our formulation, we opt to represent local patches in a transform domain, rather than the patch intensity domain.

 Matching patches in the original space of patch intensity vectors is susceptible to a bias in the overall patch intensity, such as local illumination.




     Subtract the mean patch intensity from the patch intensity vector before performing the domain transform.


     To represent images and patches in a discrete transform domain, we choose the DCT.
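The mean removal and the transform can be sketched together. The code below assumes square patches and an orthonormal 2-D DCT-II; the helper names are hypothetical and the paper's exact DCT normalisation is not stated in these notes.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n (each row is one basis vector)."""
    k = np.arange(n)[:, None]          # frequency index
    t = np.arange(n)[None, :]          # sample index
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * t + 1) * k / (2 * n))
    d[0, :] = np.sqrt(1.0 / n)         # the DC row has its own scale factor
    return d

def patch_to_coefficients(patch):
    """Subtract the patch mean, then apply a separable 2-D DCT."""
    d = dct_matrix(patch.shape[0])
    mean = patch.mean()
    return d @ (patch - mean) @ d.T, mean

def coefficients_to_patch(coeffs, mean):
    """Invert the DCT and restore the mean intensity."""
    d = dct_matrix(coeffs.shape[0])
    return d.T @ coeffs @ d + mean
```

Because the mean is removed first, two patches that differ only by a constant brightness offset map to the same coefficients, which is the illumination bias the notes mention.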



C. Data Fidelity

Assuming the independence of individual pixel values, the conditional likelihood of the noisy image given the original (noise-free) image is:



Expressing the data fidelity in terms of the transform coefficients, we obtain:
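The equations themselves are missing from these notes. As a hedged sketch, the standard forms under independent Gaussian noise with standard deviation σn are given below. The orthonormal transform D and the exact normalisation are assumptions, and the paper's expressions may differ in detail.

```latex
p(Y \mid X) \propto \exp\!\left( -\frac{\lVert Y - X \rVert_F^2}{2\sigma_n^2} \right),
\qquad
\lVert D y_i - D x_i \rVert_2^2 = \lVert y_i - x_i \rVert_2^2
\quad \text{for orthonormal } D .
```

The second identity is why a patch-wise fidelity term can be written on transform coefficients without changing its value: an orthonormal transform such as the DCT preserves Euclidean distances.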


D. Support Patch Group Membership

Now we define an additional constraint that imposes similarity between a noisy patch and patches from an image dataset belonging to the same object category.


Each noisy patch yi has Ti support patches {zi,j : j = 1, ..., Ti}.

Let the transform coefficients of zi,j be {γi,j : j = 1, ..., Ti}.

The mean and covariance matrix of the Gaussian model are estimated from these transform coefficient vectors.
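A minimal sketch of this Gaussian membership model follows. The function names and the small ridge term added to the covariance are assumptions: with only a handful of support patches the sample covariance is rank-deficient, so some regularisation is needed before it can be inverted, and the notes do not say how the paper handles this.

```python
import numpy as np

def fit_gaussian(coeffs, ridge=1e-3):
    """Estimate mean and covariance from support-patch coefficient vectors (rows)."""
    mu = coeffs.mean(axis=0)
    centred = coeffs - mu
    cov = centred.T @ centred / max(len(coeffs) - 1, 1)
    cov += ridge * np.eye(coeffs.shape[1])   # assumed regulariser, keeps cov invertible
    return mu, cov

def membership_energy(alpha, mu, cov):
    """Negative log-likelihood of alpha under N(mu, cov), up to an additive constant."""
    diff = alpha - mu
    return 0.5 * diff @ np.linalg.solve(cov, diff)
```

A low energy means the coefficient vector sits close to the support patch group; the energy is zero exactly at the group mean.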



E. Low-Rank Constraint

We further formulate a low-rank constraint concerning a noisy patch and its support patches. 


When similar patch vectors are stacked as columns of a matrix, the matrix should exhibit the low rank property and have sparse singular values. 


However, the rank minimisation problem is NP-hard, and thus is intractable to solve directly.

The low-rank approximating matrix can be recovered exactly by solving the nuclear norm minimisation (NNM) problem.
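For intuition, the proximal form of nuclear norm minimisation, minimising 0.5·||A − M||F² + τ·||M||* over M, has a well-known closed-form solution: soft-threshold the singular values of A. The sketch below shows that operator; whether the paper's update takes exactly this form is an assumption, and the function name is illustrative.

```python
import numpy as np

def singular_value_threshold(matrix, tau):
    """Soft-threshold the singular values of `matrix` by tau.

    Columns of `matrix` would be the noisy patch and its support patches.
    """
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)   # small singular values drop to zero
    return (u * s_shrunk) @ vt
```

Singular values below τ are removed entirely, so a stack of similar patches perturbed by noise is pulled back towards a low-rank matrix.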





A. Patch Denoising

We can minimise the overall objective function in Equation 5 by minimising each of the terms Li independently.

For a patch xi, we then minimise Li with respect to the transform coefficients αi and the variable Mi.





We employ an iterative procedure to minimise the cost function in Equation 7. Each iteration involves an alternating optimisation scheme with respect to either αi or Mi, while fixing the other.


B. Recovering Latent Image

We apply the iterative input regularisation technique:
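The formula is missing from these notes. In related methods such as WNNM, iterative regularisation feeds a fraction of the removed residual back into the denoiser input at each pass. The sketch below assumes the same form here, and that ρ from the parameter settings is the feedback weight; both are assumptions, and `denoise` stands in for any single-pass denoiser.

```python
import numpy as np

def iterative_regularisation(noisy, denoise, rho, iterations):
    """Repeatedly denoise, adding back rho times the residual before each new pass."""
    estimate = denoise(noisy)
    for _ in range(iterations - 1):
        fed_back = estimate + rho * (noisy - estimate)   # re-inject filtered-out detail
        estimate = denoise(fed_back)
    return estimate
```

The feedback step gives detail that was over-smoothed in one pass a chance to survive the next.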


C. Algorithm Implementation



A. Datasets and Parameter Settings


     CMU PIE face dataset [32]
     Car dataset [33]
     Cat dataset [34]
     Gore face dataset [35]
     Multiview dataset [36]


     For each dataset, we randomly selected half of the images to form a category-specific dataset and between 10 and 15 images from the remaining half as ground-truth images for denoising. 


     The test and training image sets are disjoint.


Generate noisy images:

Corrupt the test images with additive white Gaussian noise with standard deviations (std) of σn = 30, 50, 70, 100.
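Generating such test inputs takes a few lines. This sketch assumes intensities on a 0 to 255 scale, which the σn values suggest, and does not clip the result; the function name and the `seed` parameter are illustrative.

```python
import numpy as np

def add_gaussian_noise(image, sigma, seed=None):
    """Return a float copy of `image` corrupted by zero-mean Gaussian noise."""
    rng = np.random.default_rng(seed)
    return image.astype(float) + rng.normal(0.0, sigma, size=image.shape)
```

For example, `add_gaussian_noise(img, 50, seed=0)` produces a reproducible noisy copy at σn = 50.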

Patch size: 8
Number of candidate images: L = 16
Search window size: 51 × 51
Number of nearest neighbors: k = 16
λ0 = 1, λ1 = 0.5, λ2 = 10 and ρ = 0.18

Evaluation:


Compare our proposed method with:

    BM3D [4]
    WNNM [38]
    NLM [3]
    SAPCA [5]
    TSID [9]
    EPLL [18]
    PCLR [19]
    PGPD [20]
    TID [23]


B. Influence of External Dataset Size

The algorithm is robust to the dataset size: increasing the dataset size only slightly improves the denoising accuracy.

C. Influence of the Number of Support Patches

The average PSNR declines as the number of support patches increases.

D. Relative Importance of Priors

We assess the relative contribution of the Gaussian prior and the low-rank term on the Gore dataset for σ = 50.

The presence of both terms improves the PSNR compared to the scenario where one is absent. 
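PSNR, the metric quoted in these comparisons, can be computed as below. The peak value of 255 assumes 8-bit intensities, which is an assumption about the evaluation setup.

```python
import numpy as np

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between a ground-truth and a denoised image."""
    mse = np.mean((reference.astype(float) - estimate.astype(float)) ** 2)
    if mse == 0:
        return float("inf")               # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

Higher is better; each halving of the mean squared error adds about 3 dB.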

E. Run-Time Comparisons

We observe that our method spends most of its time on patch search. The speed of our algorithm can be improved by applying fast patch search algorithms, e.g., KD-tree [40] and PatchMatch [41], [42].


In addition, GPU implementations can be employed to parallelise the denoising of patches in independent threads.



F. Role of External Image Category

The PSNR for each noisy image category reaches its maximum when the dataset belongs to the same category.


G. Sensitivity to Pose Variations

H. Comparisons With Internal Denoising Methods

The proposed algorithm can restore high-frequency details with a closer resemblance to the ground truth than the existing internal denoising methods.


Specifically, the highly textured pattern is clearly reproduced by our method.

I. Comparisons With External Denoising Methods

J. Robustness to Misalignment and Rotation

K. Extension to Color Images

Let Y denote the luminance channel, and U and V denote the chrominance channels.

We specifically deal with the high noise variance in the Y channel with our method, while simply applying BM3D to the chrominance channels.
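The channel split can be sketched as follows. The BT.601 YUV weights are an assumed choice, since the notes do not name the colour transform, and `denoise_luma` and `denoise_chroma` are placeholder callables standing in for the proposed method and BM3D respectively.

```python
import numpy as np

# BT.601 analogue YUV weights (assumed; the paper's transform is not stated here).
RGB_TO_YUV = np.array([
    [ 0.299,    0.587,    0.114  ],
    [-0.14713, -0.28886,  0.436  ],
    [ 0.615,   -0.51499, -0.10001],
])

def rgb_to_yuv(rgb):
    """Convert an (H, W, 3) RGB array to YUV."""
    return rgb @ RGB_TO_YUV.T

def yuv_to_rgb(yuv):
    """Convert an (H, W, 3) YUV array back to RGB."""
    return yuv @ np.linalg.inv(RGB_TO_YUV).T

def denoise_colour(rgb, denoise_luma, denoise_chroma):
    """Denoise luminance and chrominance with separate single-channel denoisers."""
    yuv = rgb_to_yuv(rgb.astype(float))
    out = np.empty_like(yuv)
    out[..., 0] = denoise_luma(yuv[..., 0])      # Y: category-specific denoiser
    out[..., 1] = denoise_chroma(yuv[..., 1])    # U: generic denoiser
    out[..., 2] = denoise_chroma(yuv[..., 2])    # V: generic denoiser
    return yuv_to_rgb(out)
```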


The key difference from existing external denoising methods is the formulation of the denoising problem in a transform domain.

An important question that requires more discussion is the behaviour and sensitivity of the algorithm to even larger variations in pose, facial expression, size, style, and view angle.



