Cartoon+Texture Variational Image Decomposition
This numerical tour explores the use of variational energies to decompose an image into a cartoon and a texture layer.
Contents
Installing toolboxes and setting up the path.
You need to download the following files: signal toolbox and general toolbox.
You need to unzip these toolboxes in your working directory, so that you have toolbox_signal and toolbox_general in your directory.
For Scilab user: you must replace the Matlab comment '%' by its Scilab counterpart '//'.
Recommandation: You should create a text file named for instance numericaltour.sce (in Scilab) or numericaltour.m (in Matlab) to write all the Scilab/Matlab command you want to execute. Then, simply run exec('numericaltour.sce'); (in Scilab) or numericaltour; (in Matlab) to run the commands.
Execute this line only if you are using Matlab.
getd = @(p)path(p,path); % scilab users must *not* execute this
Then you can add the toolboxes to the path.
getd('toolbox_signal/'); getd('toolbox_general/');
A variational image separation finds a decomposition [Math Processing Error] where [Math Processing Error] and [Math Processing Error] are solutions of an optimization problem of the form [Math Processing Error]
where [Math Processing Error] is a cartoon image prior (that favors edges) and [Math Processing Error] is a texture image prior (that favors oscillations). The parameters [Math Processing Error] should be adapted to the noise level and the amount of edge/textures.
When no noise is present in [Math Processing Error], so that [Math Processing Error], on minimizes [Math Processing Error]
In this tour, we define [Math Processing Error] as the total variation prior. For [Math Processing Error], we use the Hilbert norm framework introduced in:
Constrained and SNR-based Solutions for TV-Hilbert Space Image Denoising, Jean-François Aujol and Guy Gilboa, Journal of Mathematical Imaging and Vision, volume 26, numbers 1-2, pages 217-237, November 2006.
Total Variation Cartoon Prior
The total variation is a Banach norm. On the contrary to the Sobolev norm, it is able to take into account step edges.
First we load a textured image.
n = 256;
name = 'barb';
f = rescale( crop(load_image(name),n) );
Display it.
clf; imageplot(f);
The total variation of a smooth image [Math Processing Error] is defined as [Math Processing Error]
It is extended to non-smooth images having step discontinuities.
The total variation of an image is also equal to the total length of its level sets. [Math Processing Error]
Where [Math Processing Error] is the level set at [Math Processing Error] of the image [Math Processing Error] [Math Processing Error]
The Gradient of the TV norm is [Math Processing Error]
The gradient of the TV norm is not defined if at a pixel [Math Processing Error] one has [Math Processing Error]. This means that the TV norm is difficult to minimize, and its gradient flow is not well defined.
To define a gradient flow, we consider instead a smooth TV norm [Math Processing Error]
This corresponds to replacing [Math Processing Error] by [Math Processing Error] which is a smooth function.
We display (in 1D) the smoothing of the absolute value.
u = linspace(-5,5)'; clf; subplot(2,1,1); hold('on'); plot(u, abs(u), 'b'); plot(u, sqrt(.5^2+u.^2), 'r'); title('\epsilon=1/2'); axis('square'); subplot(2,1,2); hold('on'); plot(u, abs(u), 'b'); plot(u, sqrt(1^2+u.^2), 'r'); title('\epsilon=1'); axis('square');
In the following we set a small enough regularization parameter [Math Processing Error].
epsilon = 1e-2;
Compute the (smoothed) total variation of [Math Processing Error].
J = @(u)sum(sum( sqrt( epsilon^2 + sum3(grad(u).^2,3) ) ));
disp(['J(f) = ' num2str(J(f),3)]);
J(f) = 6.96e+03
TV-[Math Processing Error] Model
The simplest decomposition method performs a total variation denoising: [Math Processing Error]
It corresponds to the TV-[Math Processing Error] model of Rudin-Osher-Fatermi, because the texture prior is the [Math Processing Error] norm: [Math Processing Error]
This a poor texture prior because it just assumes the texture has a small overall energy, and does not care about the oscillations.
Define the regularization parameter [Math Processing Error].
lambda = .2;
The step size for diffusion should satisfy: [Math Processing Error]
tau = 1.9 / ( 1 + lambda * 8 / epsilon);
Initialization of the minimization.
u = f;
The Gradient of the smoothed TV norm is [Math Processing Error]
Shortcut for the gradient of the smoothed TV norm.
GradJ0 = @(Gr)-div( Gr ./ repmat( sqrt( epsilon^2 + sum3(Gr.^2,3) ) , [1 1 2]) ); GradJ = @(u)GradJ0(grad(u));
One step of descent.
u = u - tau*( u - f + lambda* GradJ(u) );
Exercice 1: (the solution is exo1.m) Compute the gradient descent and monitor the minimized energy.
exo1;
Display the cartoon layer.
clf; imageplot(u);
Shortcut to increase the contrast of the textured layer for better display.
rho = .8; % constrast factor eta = .2; % saturation limit displaytexture0 = @(x)sign(x).*abs(x).^rho; displaytexture = @(v)displaytexture0( clamp(v,-eta,eta)/eta );
Display the textured layer.
clf; imageplot( displaytexture(f-u) );
Gabor Hilbert Energy
To model the texture, one should use a prior [Math Processing Error] that favors oscillations. We thus use a weighted [Math Processing Error] norms computed over the Fourier domain: [Math Processing Error] where [Math Processing Error] is the weight associated to the frequency [Math Processing Error].
This texture norm can be rewritten using the Fourier transform [Math Processing Error] as [Math Processing Error]
To favor oscillation, we use a weight that is large for low frequency and small for large frequency. A simple Hilbert norm is a inverse Sobolev space [Math Processing Error].
It was first introduced in:
S.J. Osher, A. Sole, and L.A. Vese, Image decomposition and restoration using total variation minimization and the [Math Processing Error] norm, SIAM Multiscale Modeling and Simulation, 1(3):349-370, 2003.
This Hilbert norm is defined using [Math Processing Error] where [Math Processing Error] is a small constant that prevents explosion at low frequencies.
eta = .05; x = [0:n/2-1, -n/2:-1]/n; [Y,X] = meshgrid(x,x); W = 1 ./ (eta + sqrt(X.^2 + Y.^2));
Display the inverse weight, with 0 frequency in the middle.
imageplot(fftshift(1./W));
Compute the texture norm. The [Math Processing Error] normalization is intended to make the Fourier transform orthogonal.
T = @(v)1/2*norm( W.*fft2(v)/n, 'fro' ).^2; disp(['T(f) = ' num2str(T(f), 3) ] );
T(f) = 3.13e+06
The gradient of the texture norm is [Math Processing Error] where [Math Processing Error] is the inverse Fourier transform. Note that if [Math Processing Error], this gradient is the inverse Laplacian [Math Processing Error]
Define the filtering operator [Math Processing Error].
GradT = @(f)real(ifft2(W.^2.*fft2(f)));
This is a low pass filter.
imageplot(GradT(f));
Define its inverse [Math Processing Error].
GradTInv = @(f)real(ifft2(fft2(f)./W.^2));
It is a high pass filter.
imageplot(GradTInv(f));
TV-[Math Processing Error] Image Decomposition
The TV-Hilbert decomposition solves [Math Processing Error]
The mapping [Math Processing Error] is a smooth functional, it can thus be minimized using a gradient descent: [Math Processing Error]
The parameter [Math Processing Error] for the texture/cartoon tradeoff.
lambda = 5;
The gradient descent step size should satisfy: [Math Processing Error]
tau = 1.9 /( max(W(:))^2 + 8*lambda/epsilon );
Initial cartoon layer.
u = f;
Gradient descent step.
u = u - tau * ( GradT(u-f) + lambda*GradJ(u) );
Exercice 2: (the solution is exo2.m) Perform the gradient descent, monitor the decay of the energy.
exo2;
Display the cartoon layer.
clf; imageplot(u);
Display the textured layer.
clf; imageplot( displaytexture(f-u) );
TV-Gabor Image Decomposition
The [Math Processing Error] texture model is intended to capture very high frequency, and thus performs poorly for medium frequency textures.
To capture these patterns, we follow:
Structure-Texture Image Decomposition - Modeling, Algorithms, and Parameter Selection, Jean-Francois Aujol, Guy Gilboa, Tony Chan, and Stanley Osher, International Journal of Computer Vision, volume 67, number 1, pages 111-136, April 2006
and we use a radial weight profile centered around a frequency [Math Processing Error].
To determine the target frequency [Math Processing Error], we analyse a sub-window around a point [Math Processing Error] of the image containing approximately a single frequency.
Location [Math Processing Error] of the sub-window.
p = [125 200];
Size [Math Processing Error], in pixels, of the sub-window.
mu = 10;
Compute a Gaussian mask.
[Y,X] = meshgrid( 1:n, 1:n ); U = exp( ( -(X-p(1)).^2 - (Y-p(2)).^2 )/(2*mu^2) );
Display the masked image.
clf; imageplot(U.*f);
Remove the low frequencies from the Fourier transform, after centering.
F = fft2(U.*f); F = fftshift(F); F(end/2-20:end/2+20,end/2-20:end/2+20) = 0;
Compute the location [Math Processing Error] of the pick of the spectrum.
[tmp,i] = max(abs(F(:))); [xm,ym] = ind2sub([n n], i);
Display.
clf; hold on; imageplot(abs(F)); h = plot( [ym n-ym], [xm n-xm], 'r.' ); set(h, 'MarkerSize', 20);
Target frequency is the distance between [Math Processing Error] and the center frequency.
r = sqrt( (xm-n/2)^2 + (ym-n/2)^2 );
We use the following weights: [Math Processing Error] where [Math Processing Error] controls the precision one expect about the frequency location.
Radial weight profile.
sigma = 10; x = [0:n/2-1, -n/2:-1]; [Y,X] = meshgrid(x,x); R = sqrt(X.^2+Y.^2); W = 1 - exp( -(r-R).^2 / (2*sigma^2) );
Display the weight.
imageplot(fftshift(W));
Exercice 3: (the solution is exo3.m) Define the operators [Math Processing Error] and apply it to an images.
exo3;
Exercice 4: (the solution is exo4.m) For a well chosen value of [Math Processing Error], perform the TV-Hilbert decomposition with this texture kernel.
exo4;
Display the cartoon layer.
clf; imageplot(u);
Display the textured layer.
clf; imageplot( displaytexture(f-u) );