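1. Reviewed several NumPy functions and got a better feel for how their axis arguments work
1. np.mean() computes the average. If axis is a tuple, the mean is taken over all of the axes listed in the tuple.
e.g.
mean_sal = np.mean(pred, (1, 2))
Here pred is a 3D array of shape [length, height, width]. The call above averages each image separately, so mean_sal is a 1D array of shape [length].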
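A minimal sketch of the tuple-axis behaviour on a toy array. The array contents are made up purely for illustration; only the shape convention [length, height, width] comes from the notes above.

```python
import numpy as np

# Toy batch: 2 "images" of size 2x3, i.e. shape [length, height, width].
pred = np.arange(12, dtype=np.float64).reshape(2, 2, 3)

# Averaging over axes 1 and 2 collapses height and width,
# leaving one mean per image.
mean_sal = np.mean(pred, axis=(1, 2))

print(mean_sal.shape)  # (2,)
print(mean_sal)        # [2.5 8.5]
```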
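2. np.multiply() multiplies two arrays element by element. For two maps with values in {0, 1}, wrapping it in np.sum gives the number of pixels that are 1 in both maps.
3. np.sum() computes a sum, optionally along a given axis.
4. np.logical_and() computes the element-wise logical AND. It is used here to count TP, FP, and FN (false negatives: ground truth 1, predicted 0).
5. np.equal(X, number) returns a boolean array with the same shape as X, where each entry tells whether the corresponding element equals number.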
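The four functions above can be combined as in the following sketch, which counts true positives, false positives, and false negatives for one pair of toy binary maps. The maps and the lowercase variable names are illustrative assumptions, not data from the notes.

```python
import numpy as np

# Two toy binary maps with values in {0, 1}.
gt = np.array([[1, 1, 0],
               [0, 1, 0]], dtype=np.uint8)
pred = np.array([[1, 0, 0],
                 [1, 1, 0]], dtype=np.uint8)

# The element-wise product is 1 only where both maps are 1.
tp = np.sum(np.multiply(gt, pred))
# Predicted 1 where the ground truth is 0.
fp = np.sum(np.logical_and(np.equal(gt, 0), np.equal(pred, 1)))
# Predicted 0 where the ground truth is 1.
fn = np.sum(np.logical_and(np.equal(gt, 1), np.equal(pred, 0)))

print(tp, fp, fn)  # 2 1 1
```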
2. Learned how to implement MAE and F-measure, two evaluation metrics commonly used in salient object detection
MAE:
import os

import numpy as np
from PIL import Image


def get_mae(pred, gt, in_type='numpy'):
    """Get average MAE for evaluation.

    :param pred: predicted saliency map. If in_type='file', pred is the path to a directory of images;
        if in_type='numpy', it should be a 3D float ndarray of shape [length, height, width].
    :param gt: ground truth saliency map. If in_type='file', gt is the path to a directory of images;
        if in_type='numpy', it should be a 3D np.uint8 ndarray of shape [length, height, width].
    :param in_type: value in {'file', 'numpy'}, default set to 'numpy'.
    :return: one float value, indicating average MAE for evaluation.
    """
    if in_type == 'file':
        # Load the predictions and the ground truth into lists, then convert them to ndarrays.
        # Sorting keeps the two directories aligned, assuming matching file names.
        pred_list = []
        for name in sorted(os.listdir(pred)):
            img = np.array(Image.open(os.path.join(pred, name)))
            pred_list.append(img.astype(np.float64))
        pred_list = np.array(pred_list)
        gt_list = []
        for name in sorted(os.listdir(gt)):
            gt_img = np.array(Image.open(os.path.join(gt, name)))
            gt_list.append(gt_img.astype(np.float64))
        gt_list = np.array(gt_list)
    elif in_type == 'numpy':
        assert pred.shape == gt.shape
        pred_list = pred
        gt_list = gt
    else:
        raise ValueError('invalid in_type')
    # Calculate the mean absolute error over all pixels of all images.
    # pred and gt are assumed to share the same value range.
    mae = np.mean(
        np.absolute(
            pred_list.astype("float") - gt_list.astype("float")))
    return mae
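As a small check of the MAE formula, here is a sketch on hand-made arrays. It also illustrates one likely reason for the cast to float before subtracting: arithmetic on np.uint8 arrays wraps around instead of going negative. All values below are illustrative assumptions.

```python
import numpy as np

# One 2x2 "image": prediction in [0, 1], ground truth in {0, 1}.
pred = np.array([[[0.0, 1.0], [0.5, 0.5]]])          # shape [1, 2, 2]
gt = np.array([[[0, 1], [1, 0]]], dtype=np.uint8)    # shape [1, 2, 2]

# Absolute differences are 0, 0, 0.5, 0.5, so the mean is 0.25.
mae = np.mean(np.abs(pred.astype(np.float64) - gt.astype(np.float64)))
print(mae)  # 0.25

# Without the cast, uint8 subtraction silently wraps around.
a = np.array([0, 200], dtype=np.uint8)
b = np.array([1, 100], dtype=np.uint8)
wrapped = a - b                                          # [255, 100]
correct = a.astype(np.float64) - b.astype(np.float64)    # [-1., 100.]
print(wrapped, correct)
```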
F-measure:
def get_f_measure(pred, gt, in_type='numpy', beta_square=0.3, threshold='adaptive'):
    """Get average F-measure for evaluation.

    :param pred: predicted saliency map. If in_type='file', pred is the path to a directory of images;
        if in_type='numpy', it should be a 3D float ndarray of shape [length, height, width].
    :param gt: ground truth saliency map. If in_type='file', gt is the path to a directory of images;
        if in_type='numpy', it should be a 3D np.uint8 ndarray of shape [length, height, width].
        Its values are assumed to be in {0, 1}.
    :param in_type: value in {'file', 'numpy'}, default set to 'numpy'.
    :param beta_square: value for $beta^2$ in the F-measure calculation, default set to 0.3.
    :param threshold: 'adaptive', or a float / 1D ndarray of float values (not recommended).
        Default set to 'adaptive', which sets the threshold of each saliency map to twice its
        mean saliency value.
    :return: one float value, indicating average F-measure.
    """
    if in_type == 'file':  # if in_type is file, convert images to ndarray
        pred_list = []
        for name in sorted(os.listdir(pred)):
            img = np.array(Image.open(os.path.join(pred, name)))
            pred_list.append(img.astype(np.float64))
        # finally, convert the list to an ndarray
        pred_list = np.array(pred_list)
        gt_list = []
        for name in sorted(os.listdir(gt)):
            gt_img = np.array(Image.open(os.path.join(gt, name)))
            gt_list.append(gt_img.astype(np.float64))
        gt_list = np.array(gt_list)
    elif in_type == 'numpy':
        assert pred.shape == gt.shape
        pred_list = pred
        gt_list = gt
    else:
        raise ValueError('invalid in_type')
    length = pred_list.shape[0]  # number of images
    if isinstance(threshold, str):
        if threshold != 'adaptive':
            raise ValueError('invalid threshold')
        # 1D ndarray with the mean saliency value of each image:
        # averaging over axes (1, 2) gives one mean per image
        mean_sal = np.mean(pred_list, (1, 2))
        threshold = 2 * mean_sal
    # transpose to shape [height, width, length]
    pred_list = pred_list.transpose((1, 2, 0))
    # NumPy broadcasting: [height, width, length] - [length] (or a scalar).
    # Each image has its own threshold subtracted. A new array is created,
    # so the caller's pred is not modified in place.
    pred_list = pred_list - threshold
    # eps is a tiny number added to denominators to avoid division by zero
    eps = np.finfo(float).eps
    gt_list = gt_list.transpose((1, 2, 0))  # transpose to shape [height, width, length]
    pred_list = pred_list.reshape([-1, length])  # flatten images to [height*width, length]
    gt_list = gt_list.reshape([-1, length])  # flatten ground truth maps to [height*width, length]
    # binarize: pixels above the threshold are positive, the rest negative
    pred_list = (pred_list > 0).astype(np.uint8)
    # Per-image counts, summed along axis 0 (the pixel axis).
    # true positives: intersection of both maps (element-wise product)
    TP = np.sum(np.multiply(gt_list, pred_list), 0)
    # false positives: predicted 1 where the ground truth is 0
    FP = np.sum(np.logical_and(np.equal(gt_list, 0), np.equal(pred_list, 1)), 0)
    # false negatives: predicted 0 where the ground truth is 1
    FN = np.sum(np.logical_and(np.equal(gt_list, 1), np.equal(pred_list, 0)), 0)
    precision = TP / (TP + FP + eps)
    recall = TP / (TP + FN + eps)
    f_measure = ((1 + beta_square) * precision * recall) / (beta_square * precision + recall + eps)
    f_measure = np.mean(f_measure)
    return f_measure
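The transpose in get_f_measure exists so that a [length] vector of thresholds broadcasts against the last axis. As a sketch of an alternative (not the approach used in the function above), keepdims=True keeps the reduced axes as size 1, so the subtraction broadcasts without any transpose. The random input is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
pred = rng.random((3, 4, 5))  # [length, height, width]

# Transpose route: [height, width, length] - [length], then transpose back.
mean_sal = np.mean(pred, axis=(1, 2))
a = (pred.transpose((1, 2, 0)) - 2 * mean_sal).transpose((2, 0, 1))

# keepdims route: [length, height, width] - [length, 1, 1].
b = pred - 2 * np.mean(pred, axis=(1, 2), keepdims=True)

print(np.allclose(a, b))  # True
```

Both routes subtract each image's own threshold from that image; the second simply keeps the original axis order.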
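The core of F-measure is computing three per-image counts: true positives, false positives, and false negatives (ground truth 1, predicted 0).
Every denominator in the code has a tiny number eps added to it so that a division by zero cannot occur. This small detail is worth remembering.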
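To see what eps buys, here is a sketch that evaluates the same precision, recall, and F-measure formulas on hypothetical per-image counts. The second image has no positive pixels in either map, which would give 0/0 without eps.

```python
import numpy as np

eps = np.finfo(float).eps
beta_square = 0.3

# Hypothetical per-image counts; the second image has no positives at all.
tp = np.array([2.0, 0.0])
fp = np.array([1.0, 0.0])
fn = np.array([1.0, 0.0])

precision = tp / (tp + fp + eps)
recall = tp / (tp + fn + eps)
f_measure = ((1 + beta_square) * precision * recall) / (beta_square * precision + recall + eps)

# The first value is about 0.667 (precision and recall are both 2/3);
# the second is 0.0 instead of nan.
print(f_measure)
```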