Abstract
In this work, an objective index for tone-mapped images, called the feature similarity index for tone-mapped images (FSITM), is proposed based on the local phase information of images. To evaluate a tone mapping operator (TMO), the proposed index compares the locally weighted mean phase angle map of an original high dynamic range (HDR) image with that of its associated tone-mapped image computed from the output of the TMO. Experiments on two standard databases show that the proposed FSITM outperforms the state-of-the-art tone mapped image quality index (TMQI). In addition, a higher performance is obtained by combining FSITM with TMQI. The MATLAB source code of the proposed metric is available at https://www.mathworks.com/matlabcentral/fileexchange/59814
I. INTRODUCTION
Recently, there has been a growing interest in high dynamic range (HDR) images, HDR imaging systems, and HDR displays. The visual quality of HDR images is far higher than that of conventional low dynamic range (LDR) images, and the significance of the transition from LDR to HDR has been compared to the major shift from black-and-white to color television [1]. During this transition period, and to guarantee future compatibility, methods are needed to convert an HDR image into the "best possible" LDR image. For this conversion, tone mapping operators (TMOs) have attracted considerable interest: they convert an HDR image into an associated LDR image so that it remains visible on non-HDR displays.
Unfortunately, TMO methods perform differently depending on the HDR image to be converted, which means that the best TMO must be found for each individual case. Surveys of various TMOs for HDR images and video are provided in [2] and [3]. Traditionally, TMO performance has been evaluated subjectively. In [4], a subjective evaluation was performed using an HDR monitor. Mantiuk et al. [5] proposed an HDR visible difference predictor (HDR-VDP) to estimate the visible differences between two HDR images; this tool was later extended to dynamic-range-independent image quality assessment [6]. However, rather than producing an objective score, the authors evaluated the performance of the assessment tool on HDR displays. Although subjective evaluation provides true and useful references, it is an expensive and time-consuming process. In contrast, objective quality assessment of tone-mapped images enables the automatic selection and parameter tuning of TMOs [7], [8]. Objective evaluation of tone-mapped images that agrees with subjective evaluation is therefore an active research topic.
Recently, an objective index called the tone mapped image quality index (TMQI) was proposed in [2] to objectively assess the quality of a single LDR image generated by a TMO. TMQI is based on a combination of an SSIM-based structural fidelity measure and statistical naturalness:

TMQI(H, L) = a · S(H, L)^α + (1 − a) · N(L)^β
where S and N denote structural fidelity and statistical naturalness, respectively, and H and L denote the HDR and LDR images. The parameters α and β determine the sensitivities of the two factors, and the parameter a (0 ≤ a ≤ 1) adjusts their relative importance. Both S and N are upper-bounded by 1, so TMQI is also upper-bounded by 1 [8]. Although TMQI clearly provides a better assessment of tone-mapped images than well-known image quality assessment indices such as SSIM [9], MS-SSIM [10], and FSIM [11], its performance is not perfect. Liu et al. [12] replaced the pooling strategy of the structural fidelity map in TMQI with several visual-saliency-based strategies to better assess the quality of tone-mapped images. They investigated many visual saliency models and concluded that saliency detection by combining simple priors (SDSP) into TMQI provides better assessment ability than the other saliency detection models.
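As a minimal sketch, the combination above can be written in Python as follows. The default constants are the values reported for TMQI in the literature, not taken from this letter, and should be treated as assumptions of this sketch:

```python
def tmqi_combine(S, N, a=0.8012, alpha=0.3046, beta=0.7088):
    # TMQI(H, L) = a * S^alpha + (1 - a) * N^beta
    # S: structural fidelity in [0, 1]; N: statistical naturalness in [0, 1].
    # The default constants are assumptions of this sketch.
    return a * S ** alpha + (1.0 - a) * N ** beta
```

Since both S and N are bounded by 1, the combined score is also bounded by 1.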
In this letter, a feature similarity index for tone-mapped images (FSITM), based on the phase information of images, is first proposed. It has been observed that the phase information of an image is more important than its magnitude [13]. Moreover, physiological evidence indicates that the human visual system responds strongly to points in an image where the phase information is highly ordered. Based on this premise, several quality assessment indices have been proposed [11], [15], [16]. In [11], the maximum moment of phase congruency covariance, an edge strength map, is used. Hassen et al. [15] use local phase coherence for image sharpness assessment. Saha et al. [16] propose an image quality assessment method that uses phase-deviation-sensitive energy features. Unfortunately, these indices cannot provide a reliable assessment of tone-mapped images.
The proposed FSITM differs from the methods proposed in [11], [15], [16] in the type of phase-derived feature it uses. Our FSITM uses the locally weighted mean phase angle (LWMPA) [17], a feature map based on local phase. This phase-derived map is robust to noise, so no parameter needs to be set for noise estimation. The proposed FSITM assesses tone-mapped images with respect to both the appearance of the real-world scene and their pleasantness to human observers.
Considering FSITM and TMQI, we also propose a combined index, FSITM_TMQI, which provides a better assessment of tone-mapped images. In the experiments, we compare the objective scores of our proposed similarity indices (FSITM, FSITM_TMQI) with TMQI [2] on two major datasets [18], [19].
II. THE PROPOSED SIMILARITY INDEX
The proposed FSITM is a similarity index for tone-mapped images based on a phase-derived feature map. As mentioned above, phase-derived features have been used successfully for quality assessment [11], [15], [16]. However, compared with other popular quality assessment indices, such as SSIM and its variants [9], [10], their results in assessing tone-mapped images are not reliable. We therefore use the locally weighted mean phase angle (LWMPA) map in this letter, because it is a feature that is robust to noise. In the following, we briefly describe the theory and formulation of the LWMPA, and then discuss our proposed similarity index based on this feature map.
Several parameters need to be considered when computing ph(x). In our experimental set, we determined the best fixed values for this operation (see Section III). Unlike the phase-derived edge maps and local phases used in other studies [11], [15], the locally weighted mean phase angle ph(x) provides a good representation of image features, including the edges and shapes of objects. Because ph(x) marks dark and bright lines, it can be used to evaluate changes in color, which is a popular feature of TMOs. In addition, the LWMPA is noise independent, unlike the phase-derived features used in [11], [15], [16], which are sensitive to noise and therefore require noise estimation. Some example ph(x) outputs are shown in Fig. 1.
Only ph(x) is used to compute FSITM. First, the HDR image (H) is converted to an LDR image by taking the logarithm of its values (log(H)). This LDR image is used as one of the reference images for computing FSITM; the other reference image is the HDR image itself. The details of the FSITM computation follow.
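The log-domain conversion described above can be sketched in NumPy as follows. As in the Python implementation given later in this document, zeros are clamped to the smallest positive value so that the logarithm is defined; the helper name is ours:

```python
import numpy as np

def hdr_to_log_ldr(hdr):
    # Clamp zeros to the smallest positive value so log() is defined,
    # then rescale the log image to the 8-bit range [0, 255].
    # Assumes hdr contains at least two distinct positive values.
    min_nonzero = np.min(hdr[hdr > 0])
    log_h = np.log(np.maximum(hdr, min_nonzero))
    return np.round((log_h - log_h.min()) * 255.0 /
                    (log_h.max() - log_h.min()))
```

The rescaling maps the log-compressed dynamic range onto a conventional 8-bit LDR range before the phase maps are computed.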
Given the input images H and L, as well as the image LogH = log(H), ph(x) is computed for each channel C of these three images using Eq. (4). FSITM is based on the simple fact that the features of two corresponding channels should remain the same in their ph(x) maps. FSITM = 1 if all feature types are identical, and FSITM = 0 if none of them are. First, we define the feature similarity index of channel C used to compute FSITM as the fraction of pixels at which the two phase maps agree in sign:

FSITM_C = (1 / (r × c)) · Σ_x [ ph(H_C)(x) and ph(L_C)(x) have the same sign ]

where r × c is the number of pixels in the image.
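The per-channel comparison can be sketched in NumPy as below; the toy arrays in the usage stand in for real ph(x) maps, and the helper name is ours:

```python
import numpy as np

def sign_agreement(ph_h, ph_l):
    # Fraction of pixels where both phase maps are non-positive
    # or both are positive (the convention used in the FSITM code).
    same = np.logical_or(np.logical_and(ph_h <= 0, ph_l <= 0),
                         np.logical_and(ph_h > 0, ph_l > 0))
    return float(np.mean(same))
```

Identical maps yield a score of 1; maps that disagree in sign everywhere yield 0.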
We also found that combining FSITM with TMQI provides a better assessment of tone-mapped images. We therefore propose a joint index of FSITM and TMQI, denoted FSITM_TMQI.
In most cases, the two indices differ in nature, so their estimation errors tend to cancel each other, leaving the combined index with a smaller error.
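If the two scores are combined by a simple unweighted average, which is an assumption of this sketch (the released MATLAB code should be consulted for the exact combination rule), the joint index can be written as:

```python
def fsitm_tmqi(fsitm_score, tmqi_score):
    # Hypothetical combination: the unweighted mean of the two indices.
    # The exact weighting is an assumption of this sketch.
    return (fsitm_score + tmqi_score) / 2.0
```

Since both inputs are bounded by 1, the combined score is also bounded by 1.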
III. EXPERIMENTAL RESULTS
To evaluate the proposed FSITM index, we used the datasets introduced in [2] (dataset A) and [19]. The first dataset contains 15 HDR images, with 8 LDR images corresponding to each HDR image; the LDR images were generated using different TMOs. The quality of the LDRs is ranked from 1 (best quality) to 8 (worst quality), based on subjective evaluations by 20 observers. The second HDR dataset used (dataset B), together with subjective rankings of its LDR images [19], [22], is also available. This dataset contains 3 HDR images, with 14 LDR images corresponding to each HDR image.
To objectively evaluate the performance of the various similarity indices considered in the experiments, we used the Spearman rank correlation coefficient (SRCC) and the Kendall rank correlation coefficient (KRCC). The proposed similarity indices (FSITM_C, FSITM_C_TMQI) are compared with TMQI [2]. The results are listed in Tables I and II. The performance of TMQI is listed according to the scores obtained by running the MATLAB source code provided by Yeganeh and Wang in [18].
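As an illustration, the two rank-correlation measures can be computed with SciPy; the subjective ranks and objective scores below are made-up toy values, not results from the paper:

```python
from scipy.stats import spearmanr, kendalltau

# Toy data: rank 1 = best quality, so a well-behaved objective score
# should decrease monotonically with the rank (correlation close to -1).
subjective_rank = [1, 2, 3, 4, 5, 6, 7, 8]
objective_score = [0.95, 0.91, 0.88, 0.80, 0.76, 0.70, 0.66, 0.60]

srcc, _ = spearmanr(subjective_rank, objective_score)
krcc, _ = kendalltau(subjective_rank, objective_score)
print(srcc, krcc)  # both close to -1.0 for this monotone toy data
```

In practice the absolute value (or the correlation against inverted ranks) is reported, so that a perfect index scores 1.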
For both datasets, FSITM_G outperforms TMQI in terms of both SRCC and KRCC. In general, the performance of TMQI varies less than that of FSITM. In contrast, FSITM_R_TMQI and FSITM_G_TMQI are more robust and outperform both FSITM and TMQI in terms of SRCC and KRCC scores.
It is worth reporting the available results of other indices recently proposed in the literature [12], [23]. In [12], the authors report the SRCC performance of their proposed index on dataset A [18]: the minimum SRCC is 0.6905, the average is 0.8408, and the standard deviation of their SRCC scores is 0.0907. For the same dataset, the median performance of ref. [23] is reported as SRCC = 0.8106 and KRCC = 0.5865.
Several parameters affect the quality of the locally weighted mean phase angle ph(x), namely the number of filter scales N_ρ, the wavelength of the smallest-scale filter wLen, and the scaling factor between successive filters mult. In the experiments, N_ρ = 2, wLen = 2, and mult = 2 were used for the LogH image, while N_ρ = 2, wLen = 8, and mult = 8 were used for the original HDR image. The rationale for using two different parameter sets is that the sizes of image features may differ. Overall, these three parameters of ph(x), together with the value of α, influence the performance of the proposed indices.
In this work, we used only the original HDR image and its logarithm image LogH. It is worth mentioning that we tried the same strategy used to define FSITM in the RGB color space in other color spaces, such as the Lab and Yxy color spaces; however, we did not obtain good performance.
We evaluated the running times of FSITM and TMQI as follows. Our experiments were performed on a Core i7 3.4 GHz CPU with 16 GB of RAM, and the FSITM algorithm was implemented in MATLAB 2012b under Windows 7. TMQI and FSITM take 1.95 and 3.36 seconds, respectively, to assess an image of size 1200×1600, while the running time of FSITM_C_TMQI is simply obtained by adding the TMQI and FSITM_C running times.
IV. CONCLUSION
We proposed an objective index, called the feature similarity index for tone-mapped images (FSITM), which is based on the local phase similarity of the original HDR image and its tone-mapped LDR image. Unlike other studies that use different phase-derived feature maps, we used the locally weighted mean phase angle, a robust and noise-independent feature map. The performance of the proposed similarity index was compared with the state-of-the-art TMQI on two datasets, and the results show that the proposed index is promising. The proposed FSITM was then combined with TMQI to obtain a more accurate quality assessment. Further research is needed to develop more comprehensive HDR datasets together with subjective ratings; such datasets would enable the development of indices with better performance.
Code
#!/usr/bin/python3
# -*- coding: utf-8 -*-
import numpy as np
import sys
from numpy.fft import fft2, ifft2
from numpy.fft import ifftshift
from contracts import contract
from functools import partial
if not (sys.version_info >= (3, 5)):
sys.stdout.write("Sorry, requires Python 3.5 or newer.\n")
sys.exit(1)
# Feature similarity index for tone mapped images (FSITM)
# By: Hossein Ziaei Nafchi, November 2014
# hossein.zi@synchromedia.ca
# Synchromedia Lab, ETS, Canada
# The code can be modified, rewritten and used without obtaining permission
# of the authors.
# Please refer to the following paper:
# FSITM: A Feature Similarity Index For Tone-Mapped Images
# Hossein Ziaei Nafchi, Atena Shahkolaei,
# Reza Farrahi Moghaddam, Mohamed Cheriet,
# IEEE Signal Processing Letters, vol. 22, no. 8, pp. 1026-1029, 2015.
# DOI: 10.1109/LSP.2014.2381458
# Needs the phasecong100 function (which requires the lowpass_filter
# function); both are implemented below.
@contract(HDR='array[NxM](float)', LDR='array[NxM](float)')
def FSITM(HDR, LDR, alpha=None):
"""
HDR: High dynamic range image
LDR: Low dynamic range image
The original implementation used color channels and
selected a single channel for processing. This makes limited
sense, so this reimplementation requires a single channel.
If any color conversion is needed, it should be performed
before this function is called.
"""
# HDR: High dynamic range image
# LDR: Low dynamic range image
# Q: Quality index
NumPixels = LDR.size
if alpha is None:
r = np.floor(NumPixels / (2. ** 18))
if r > 1.:
alpha = 1. - (1. / r)
else:
alpha = 0.
minNonzero = np.min(HDR[HDR > 0])
LogH = np.log(np.maximum(HDR, minNonzero))
    # keep the result as float for further calculations
    LogH = np.round((LogH - LogH.min()) * 255. /
                    (LogH.max() - LogH.min())).astype(float)
if alpha > 0.:
PhaseHDR_CH = phasecong100(HDR, 2, 2, 8, 8)
PhaseLDR_CH8 = phasecong100(LDR, 2, 2, 8, 8)
    else:  # alpha == 0: floor(NumPixels / 2**18) <= 1, i.e. the image is small
PhaseHDR_CH = 0
PhaseLDR_CH8 = 0
PhaseLogH = phasecong100(LogH, 2, 2, 2, 2)
PhaseH = alpha * PhaseHDR_CH + (1 - alpha) * PhaseLogH
PhaseLDR_CH2 = phasecong100(LDR, 2, 2, 2, 2)
PhaseL = alpha * PhaseLDR_CH8 + (1 - alpha) * PhaseLDR_CH2
Q = np.sum(np.logical_or(np.logical_and(PhaseL <= 0, PhaseH <= 0),
np.logical_and(PhaseL > 0, PhaseH > 0))) / NumPixels
return Q
@contract(HDR='array[NxM](float)',
LDR='array[NxM](float)',
alpha='float,>=0.0,<=1.0|None')
def FSITMr(HDR, LDR, alpha=None):
# HDR: High dynamic range image
# LDR: Low dynamic range image
# Q: Quality index
if alpha is None:
r = np.floor(HDR.size / (2. ** 18))
if r > 1.:
alpha = 1. - (1. / r)
else:
alpha = 0.
LogH = HDR - HDR.min()
minNonzero = np.min(LogH[LogH > 0])
LogH = LogH + minNonzero
LogH = np.log(LogH)
if alpha > 0.:
PhaseHDR_CH = phasecong100(HDR, 2, 2, 8, 8)
PhaseLDR_CH8 = phasecong100(LDR, 2, 2, 8, 8)
else:
PhaseHDR_CH = 0
PhaseLDR_CH8 = 0
PhaseLogH = phasecong100(LogH, 2, 2, 2, 2)
PhaseH = alpha * PhaseHDR_CH + (1 - alpha) * PhaseLogH
PhaseLDR_CH2 = phasecong100(LDR, 2, 2, 2, 2)
PhaseL = alpha * PhaseLDR_CH8 + (1 - alpha) * PhaseLDR_CH2
    # The original implementation used a <= 0 / > 0 distinction;
    # comparing the signs directly seems more consistent.
    Q = np.sum(np.sign(PhaseL) == np.sign(PhaseH)) / LDR.size
return Q
def img_read(link, gray=False, shape=None, dtype=None, keep=False):
if os.path.exists(link):
if dtype is None:
img = imread(link)
if gray and len(img.shape) > 2:
img = skimage.color.rgb2hsv(img)[..., 2]
else:
W, H = shape
img = np.fromfile(link, dtype=dtype)
if gray:
img = img.reshape(H, W)
else:
img = img.reshape(H, W, -1)
else:
        tmp_name = wget.download(link, bar=None)
        img = img_read(tmp_name, gray, shape, dtype)
        if not keep:
            os.remove(tmp_name)
    return img.astype(float)
# The original Matlab code was written by Peter Kovesi.
# This Python code is a translation of his Matlab code and is
# therefore a derived work.
# Here is his original copyright notice:
#
# Copyright (c) 1996-2010 Peter Kovesi
# Centre for Exploration Targeting The University of Western Australia
# peter.kovesi@uwa.edu.au
#
# This function is optimized to generate one of the outputs of the 'phasecong3'
# function, please see the original function at:
# http://www.peterkovesi.com/matlabfns/
#
# You can cite this publication:
# @misc {KovesiMATLABCode,
# author = "P. D. Kovesi",
# title = "{MATLAB} and {Octave} Functions
# for Computer Vision and Image Processing",
# note = "Available from: $<$http://www.peterkovesi.com/matlabfns/$>$",
# }
#
# Reimplementation / translation:
# If you want to contact me, e.g. you discovered a bug,
# my name is David Völgyes, and you can write to here:
# david.volgyes@ieee.org
#
# I don't need credit for the reimplementation, you can use it as you want,
# but as a derived product/code, you still should give credit to Peter Kovesi.
#
@contract(im='array(float)',
          nscale='int,>0',
          norient='int,>0',
          minWavelength='int,>1',
          mult='int,>1',
          sigmaOnf='float,>0')
def phasecong100(im, nscale=2,
norient=2,
minWavelength=7,
mult=2,
sigmaOnf=0.65):
#
# im # Input image
# nscale = 2; # Number of wavelet scales.
# norient = 2; # Number of filter orientations.
# minWaveLength = 7; # Wavelength of smallest scale filter.
# mult = 2; # Scaling factor between successive filters.
# sigmaOnf = 0.65; # Ratio of the standard deviation of the
# # Gaussian describing the log Gabor filter's
# # transfer function in the frequency domain
# # to the filter center frequency.
rows, cols = im.shape
imagefft = fft2(im)
zero = np.zeros(shape=(rows, cols))
EO = dict()
EnergyV = np.zeros((rows, cols, 3))
x_range = np.linspace(-0.5, 0.5, num=cols, endpoint=True)
y_range = np.linspace(-0.5, 0.5, num=rows, endpoint=True)
x, y = np.meshgrid(x_range, y_range)
radius = np.sqrt(x ** 2 + y ** 2)
theta = np.arctan2(- y, x)
radius = ifftshift(radius)
theta = ifftshift(theta)
radius[0, 0] = 1.
sintheta = np.sin(theta)
costheta = np.cos(theta)
lp = lowpass_filter((rows, cols), 0.45, 15)
logGabor = []
for s in range(1, nscale + 1):
wavelength = minWavelength * mult ** (s - 1.)
fo = 1.0 / wavelength
logGabor.append(np.exp(
(- (np.log(radius / fo)) ** 2) / (2 * np.log(sigmaOnf) ** 2)
))
logGabor[-1] *= lp
logGabor[-1][0, 0] = 0
# The main loop...
for o in range(1, norient + 1):
angl = (o - 1.) * np.pi / norient
ds = sintheta * np.cos(angl) - costheta * np.sin(angl)
dc = costheta * np.cos(angl) + sintheta * np.sin(angl)
dtheta = np.abs(np.arctan2(ds, dc))
dtheta = np.minimum(dtheta * norient / 2., np.pi)
spread = (np.cos(dtheta) + 1.) / 2.
sumE_ThisOrient = zero.copy()
sumO_ThisOrient = zero.copy()
for s in range(0, nscale):
filter_ = logGabor[s] * spread
EO[(s, o)] = ifft2(imagefft * filter_)
sumE_ThisOrient = sumE_ThisOrient + np.real(EO[(s, o)])
sumO_ThisOrient = sumO_ThisOrient + np.imag(EO[(s, o)])
EnergyV[:, :, 0] = EnergyV[:, :, 0] + sumE_ThisOrient
EnergyV[:, :, 1] = EnergyV[:, :, 1] + np.cos(angl) * sumO_ThisOrient
EnergyV[:, :, 2] = EnergyV[:, :, 2] + np.sin(angl) * sumO_ThisOrient
    # Odd-symmetric energy: magnitude of the two orientation-projected
    # odd responses (channels 1 and 2), as in Kovesi's phasecong3.
    OddV = np.sqrt(EnergyV[:, :, 1] ** 2 + EnergyV[:, :, 2] ** 2)
    featType = np.arctan2(EnergyV[:, :, 0], OddV)
return featType
# The original Matlab code was written by Peter Kovesi.
# This Python code is a translation of his Matlab code and is
# therefore a derived work.
# Here is his original copyright notice and usage comment
# (with a variable renamed from 'sze' to 'size'):
# LOWPASSFILTER - Constructs a low-pass butterworth filter.
#
# usage: f = Lowpassfilter(size, cutoff, n)
#
# where: size is a two element vector specifying the size of filter
# to construct [rows cols].
# cutoff is the cutoff frequency of the filter 0 - 0.5
# n is the order of the filter, the higher n is the sharper
# the transition is. (n must be an integer >= 1).
# Note that n is doubled so that it is always an even integer.
#
# 1
# f = --------------------
# 2n
# 1.0 + (w/cutoff)
#
# The frequency origin of the returned filter is at the corners.
#
# See also: HIGHPASSFILTER, HIGHBOOSTFILTER, BANDPASSFILTER
#
# Copyright (c) 1999 Peter Kovesi
# School of Computer Science & Software Engineering
# The University of Western Australia
# http://www.csse.uwa.edu.au/
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# The Software is provided "as is", without warranty of any kind.
# October 1999
# August 2005 - Fixed up frequency ranges for odd and even sized filters
# (previous code was a bit approximate)
# Reimplementation / translation:
# If you want to contact me, my name is David Völgyes,
# and you can write to here: david.volgyes@ieee.org
#
# I don't need credit for the reimplementation, you can use it as you want,
# but as a derived product/code, you still should give credit to Peter Kovesi.
#
# You can cite his publication:
# @misc {KovesiMATLABCode,
# author = "P. D. Kovesi",
# title = "{MATLAB} and {Octave} Functions
# for Computer Vision and Image Processing",
# note = "Available from: $<$http://www.peterkovesi.com/matlabfns/$>$",
# }
@contract(size='int|tuple[2]',
cutoff='float,>0,<=0.5',
n='int,>=1')
def lowpass_filter(size=None, cutoff=None, n=None):
""" Butterworth filter
Examples:
>>> lowpass_filter(3,0.5,2)
array([[1. , 0.5, 0.5],
[0.5, 0.2, 0.2],
[0.5, 0.2, 0.2]])
"""
if type(size) == int:
rows = cols = size
else:
rows, cols = size
x_range = np.linspace(-0.5, 0.5, num=cols)
y_range = np.linspace(-0.5, 0.5, num=rows)
x, y = np.meshgrid(x_range, y_range)
radius = np.sqrt(x ** 2 + y ** 2)
f = ifftshift(1.0 / (1.0 + (radius / cutoff) ** (2 * n)))
return f
if __name__ == "__main__":
if len(sys.argv) == 1:
import doctest
doctest.testmod()
if len(sys.argv) > 1: # there are command line parameters
# these imports are unnecessary if the code is used as a library
from optparse import OptionParser
from imageio import imread
import os.path
import wget
import skimage.color
usage = ("usage: %prog [options] HDR_image LDR_image\n" +
"The images could be files or a http(s)/ftp link.")
parser = OptionParser(usage=usage)
parser.add_option("-t", "--type",
type="string",
dest="maptype",
help="s_map file type (default: float32)",
default="float32")
parser.add_option("-p", "--precision",
type="int",
dest="precision",
help="precision (number of decimals) (default: 4)",
default=4)
parser.add_option("-W", "--width",
type="int",
dest="width",
help="image width (mandatory for RAW files)"
" (default: None)",
default=None)
parser.add_option("-H", "--height",
type="int",
dest="height",
help="image height (mandatory for RAW files)"
" (default: None)",
default=None)
parser.add_option("-i", "--input_type",
type="string",
dest="input",
help="type of the input images: float32/float64"
" for RAW images\n"
"None for regular images opening with scipy"
" (e.g. png) (default: None)",
default=None)
parser.add_option("-g", "--gray",
dest="gray",
action="store_true",
                          help="gray input (lightness/brightness)"
" (default: RGB)",
default=False)
parser.add_option("-Q", "--report-Q",
dest="report_q",
action="store_true",
help="report quality index",
default=True)
parser.add_option("-C", "--report-channels",
dest="report_c",
action="store_true",
help="report structural similarity",
default=False)
parser.add_option("-q", "--no-report-Q",
dest="report_q",
action="store_false",
help="do not report quality index")
        parser.add_option("-c", "--no-report-channels",
                          dest="report_c",
                          action="store_false",
                          help="do not report structural similarity")
parser.add_option("--quiet",
dest="quiet",
action="store_true",
help="suppress variable names in the report")
parser.add_option("--verbose",
dest="quiet",
action="store_false",
help="use variable names in the report (default)",
default=False)
parser.add_option("--keep",
dest="keep",
action="store_true",
help="keep downloaded files (default: False)",
default=False)
parser.add_option("-r","--revised",
dest="revised",
action="store_true",
help="Enable revised TMQI. (default: Original)")
parser.add_option("-a","--alpha",
dest="alpha",
type="float",
action="store",
help="Alpha value. (default: original implementation)",
default=None)
(options, args) = parser.parse_args()
if len(args) != 2:
print("Exactly two input files are needed: HDR and LDR.")
sys.exit(0)
if options.input is not None:
W, H = options.width, options.height
shape = (W, H)
dtype = np.dtype(options.input)
else:
shape = None
dtype = None
hdr = img_read(args[0], True, shape, dtype, options.keep)
ldr = img_read(args[1], True, shape, dtype, options.keep)
if options.revised:
metric = partial(FSITMr,alpha=options.alpha)
else:
metric = FSITM
result = metric(hdr, ldr)
        prec = options.precision
        if options.report_q:
            if options.quiet:
                print(round(result, prec))
            else:
                print("Q: %s" % round(result, prec))