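I processed this dataset to train an MTCNN network. The CelebA dataset consists of 202,599 face images; the main job here is to augment the samples by cropping new patches out of the original images.
MTCNN is made up of three networks, P, R and O: the P network takes 12×12 images as input, the R network takes 24×24 images, and the O network takes 48×48 images.
So the CelebA dataset has to be processed into images of three sizes.
There are also two loss functions: a confidence loss and an offset loss. The confidence loss is trained on positive and negative samples, and the offset loss on positive and part samples.
So each image size in turn needs three kinds of samples: positive samples (confidence 1, the whole face), negative samples (confidence 0, no face), and part samples (confidence 2, part of a face).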
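This is an image from the CelebA dataset; the red box is the reference box (the annotated face box). Whether a newly generated crop counts as a positive, negative or part sample depends on its overlap (IoU) with the reference box (the IoU calculation and code are in the previous blog post):
IoU > 0.4: positive sample
0.15 < IoU < 0.4: part sample
IoU < 0.15: negative sample
For example, if the new crop (blue box) has an IoU of 0.35 with the red box, it is a part sample.
If the new crop (blue box) has an IoU of 0.1 with the red box, it is a negative sample.
If the new crop (blue box) has an IoU of 0.8 with the red box, it is a positive sample.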
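To make the thresholds concrete, here is a minimal pure-Python sketch. Both `iou` and `classify` are names I made up for illustration; the `utils.iou` used in the script is not shown in this post and may be implemented differently (it works on NumPy arrays). Boxes are assumed to be `(x1, y1, x2, y2)`.

```python
def iou(box, other):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle
    ix1, iy1 = max(box[0], other[0]), max(box[1], other[1])
    ix2, iy2 = min(box[2], other[2]), min(box[3], other[3])
    # Clamp at 0 so boxes that do not overlap give an empty intersection
    inter = max(ix2 - ix1, 0) * max(iy2 - iy1, 0)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (other[2] - other[0]) * (other[3] - other[1])
    return inter / (area_a + area_b - inter)


def classify(iou_value):
    """Map an IoU value to a sample type using the thresholds above."""
    if iou_value > 0.4:
        return "positive"
    if 0.15 < iou_value < 0.4:
        return "part"
    if iou_value < 0.15:
        return "negative"
    return None  # exactly 0.15 or 0.4 matches none of the three rules


# Two 10x10 boxes shifted by half a width: IoU = 50 / 150
print(classify(iou((0, 0, 10, 10), (5, 0, 15, 10))))  # part
```

Note that a crop whose IoU lands exactly on 0.15 or 0.4 falls into no category under these rules, so it is simply dropped.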
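The CelebA image dataset:
This is the bounding-box annotation. Each row of list_bbox_celeba.txt stores the top-left point together with the box width and height, which gives the two corner points (top-left and bottom-right).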
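The script below reads each annotation row as a file name followed by x1, y1, width and height, and turns that into two corner points. Here is a small sketch of that step on its own; `parse_bbox_line` is a hypothetical helper, and the sample row is made up rather than copied from the dataset.

```python
def parse_bbox_line(line):
    """Split one annotation row into a file name and (x1, y1, x2, y2)."""
    fields = line.split()  # split() with no argument swallows repeated spaces
    name = fields[0]
    x1, y1, w, h = (float(v) for v in fields[1:5])
    # The bottom-right corner is the top-left corner plus width and height
    return name, (x1, y1, x1 + w, y1 + h)


# Made-up row for illustration only
name, box = parse_bbox_line("example.jpg    10  20 100 120")
print(name, box)  # example.jpg (10.0, 20.0, 110.0, 140.0)
```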
Code
import os
from PIL import Image
import numpy as np
import utils
import traceback
anno_src = r"G:\数据集\celebA\celebA(分卷形式,一起解压)\celebA\Anno\list_bbox_celeba.txt"
img_dir = r"G:\数据集\celebA\celebA(分卷形式,一起解压)\celebA\img\img_celeba"
save_path = r"H:\dataset"
for face_size in [12, 24, 48]:
    print("gen %i image" % face_size)
    # Directories where the sample images are stored
    positive_image_dir = os.path.join(save_path, str(face_size), "positive")
    negative_image_dir = os.path.join(save_path, str(face_size), "negative")
    part_image_dir = os.path.join(save_path, str(face_size), "part")
    for dir_path in [positive_image_dir, negative_image_dir, part_image_dir]:
        if not os.path.exists(dir_path):
            os.makedirs(dir_path)
    # Annotation files that describe the samples
    positive_anno_filename = os.path.join(save_path, str(face_size), "positive.txt")
    negative_anno_filename = os.path.join(save_path, str(face_size), "negative.txt")
    part_anno_filename = os.path.join(save_path, str(face_size), "part.txt")
    positive_count = 0
    negative_count = 0
    part_count = 0
    try:
        positive_anno_file = open(positive_anno_filename, "w")
        negative_anno_file = open(negative_anno_filename, "w")
        part_anno_file = open(part_anno_filename, "w")
        for i, line in enumerate(open(anno_src)):
            # Skip the first two lines (image count and column header)
            if i < 2:
                continue
            try:
                # split() with no argument also handles repeated spaces
                strs = line.strip().split()
                image_filename = strs[0].strip()
                print(image_filename)
                image_file = os.path.join(img_dir, image_filename)
                with Image.open(image_file) as img:
                    img_w, img_h = img.size
                    x1 = float(strs[1].strip())
                    y1 = float(strs[2].strip())
                    w = float(strs[3].strip())
                    h = float(strs[4].strip())
                    x2 = float(x1 + w)
                    y2 = float(y1 + h)
                    # Landmark coordinates are not used here, so they stay at 0
                    px1 = 0  # float(strs[5].strip())
                    py1 = 0  # float(strs[6].strip())
                    px2 = 0  # float(strs[7].strip())
                    py2 = 0  # float(strs[8].strip())
                    px3 = 0  # float(strs[9].strip())
                    py3 = 0  # float(strs[10].strip())
                    px4 = 0  # float(strs[11].strip())
                    py4 = 0  # float(strs[12].strip())
                    px5 = 0  # float(strs[13].strip())
                    py5 = 0  # float(strs[14].strip())
                    # Skip boxes that are too small or have invalid values
                    if max(w, h) < 40 or x1 < 0 or y1 < 0 or w < 0 or h < 0:
                        continue
                    # Reference box; 2-D because utils.iou takes a list of [x1, y1, x2, y2] boxes
                    boxes = [[x1, y1, x2, y2]]
                    # Center point of the face
                    cx = x1 + w / 2
                    cy = y1 + h / 2
                    # Draw several random crops per face to multiply the positive and part samples
                    for _ in range(5):
                        # Shift the face center a little
                        w_ = np.random.randint(int(-w * 0.5), int(w * 0.5))
                        h_ = np.random.randint(int(-h * 0.5), int(h * 0.5))
                        cx_ = cx + w_
                        cy_ = cy + h_
                        # Make the crop square, with a side length that also varies a little
                        side_len = np.random.randint(int(min(w, h) * 0.8), int(np.ceil(1.25 * max(w, h))))
                        # Built-in max clamps at 0; np.max(a, 0) would read 0 as an axis
                        x1_ = max(cx_ - side_len / 2, 0)
                        y1_ = max(cy_ - side_len / 2, 0)
                        x2_ = x1_ + side_len
                        y2_ = y1_ + side_len
                        crop_box = np.array([x1_, y1_, x2_, y2_])  # coordinates of the new box
                        # Offsets of the reference box relative to the new box
                        offset_x1 = (x1 - x1_) / side_len
                        offset_y1 = (y1 - y1_) / side_len
                        offset_x2 = (x2 - x2_) / side_len
                        offset_y2 = (y2 - y2_) / side_len
                        offset_px1 = 0  # (px1 - x1_) / side_len
                        offset_py1 = 0  # (py1 - y1_) / side_len
                        offset_px2 = 0  # (px2 - x1_) / side_len
                        offset_py2 = 0  # (py2 - y1_) / side_len
                        offset_px3 = 0  # (px3 - x1_) / side_len
                        offset_py3 = 0  # (py3 - y1_) / side_len
                        offset_px4 = 0  # (px4 - x1_) / side_len
                        offset_py4 = 0  # (py4 - y1_) / side_len
                        offset_px5 = 0  # (px5 - x1_) / side_len
                        offset_py5 = 0  # (py5 - y1_) / side_len
                        # Crop the patch and resize it to the network input size
                        face_crop = img.crop(crop_box)
                        face_resize = face_crop.resize((face_size, face_size))
                        iou = utils.iou(crop_box, np.array(boxes))[0]
                        if iou > 0.4:  # positive sample
                            positive_anno_file.write(
                                "positive/{0}.jpg {1} {2} {3} {4} {5} {6} {7} {8} {9} {10} {11} {12} {13} {14} {15}\n".format(
                                    positive_count, 1, offset_x1, offset_y1,
                                    offset_x2, offset_y2, offset_px1, offset_py1, offset_px2, offset_py2, offset_px3,
                                    offset_py3, offset_px4, offset_py4, offset_px5, offset_py5))
                            positive_anno_file.flush()
                            face_resize.save(os.path.join(positive_image_dir, "{0}.jpg".format(positive_count)))
                            positive_count += 1
                        elif 0.15 < iou < 0.4:  # part sample
                            part_anno_file.write(
                                "part/{0}.jpg {1} {2} {3} {4} {5} {6} {7} {8} {9} {10} {11} {12} {13} {14} {15}\n".format(
                                    part_count, 2, offset_x1, offset_y1, offset_x2,
                                    offset_y2, offset_px1, offset_py1, offset_px2, offset_py2, offset_px3,
                                    offset_py3, offset_px4, offset_py4, offset_px5, offset_py5))
                            part_anno_file.flush()
                            face_resize.save(os.path.join(part_image_dir, "{0}.jpg".format(part_count)))
                            part_count += 1
                        elif iou < 0.15:  # negative sample
                            negative_anno_file.write(
                                "negative/{0}.jpg {1} 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n".format(negative_count, 0))
                            negative_anno_file.flush()
                            face_resize.save(os.path.join(negative_image_dir, "{0}.jpg".format(negative_count)))
                            negative_count += 1
                    # Generate extra negative samples, because the crops above do not yield enough
                    _boxes = np.array(boxes)
                    for _ in range(5):
                        side_len = np.random.randint(face_size, int(min(img_w, img_h) / 2))
                        x_ = np.random.randint(0, img_w - side_len)
                        y_ = np.random.randint(0, img_h - side_len)
                        crop_box = np.array([x_, y_, x_ + side_len, y_ + side_len])
                        if np.max(utils.iou(crop_box, _boxes)) < 0.15:
                            face_crop = img.crop(crop_box)
                            # Image.ANTIALIAS was removed in Pillow 10; LANCZOS is the same filter
                            face_resize = face_crop.resize((face_size, face_size), Image.LANCZOS)
                            negative_anno_file.write("negative/{0}.jpg {1} 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n".format(negative_count, 0))
                            negative_anno_file.flush()
                            face_resize.save(os.path.join(negative_image_dir, "{0}.jpg".format(negative_count)))
                            negative_count += 1
            except Exception:
                # Log the error and move on to the next image
                traceback.print_exc()
    finally:
        positive_anno_file.close()
        negative_anno_file.close()
        part_anno_file.close()
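The offset arithmetic in the script is easy to check on its own. The sketch below recomputes the four box offsets for one hand-picked crop and then inverts them, which is presumably how a predicted offset would be mapped back to a box later on. `box_offsets` and `apply_offsets` are hypothetical helper names, and the numbers are made up so that the results are exact.

```python
def box_offsets(gt_box, crop_box):
    """Offsets of the reference box relative to a square crop, in units of the crop side."""
    side = crop_box[2] - crop_box[0]
    return tuple((g - c) / side for g, c in zip(gt_box, crop_box))


def apply_offsets(crop_box, offsets):
    """Invert box_offsets: recover the reference box from the crop and its offsets."""
    side = crop_box[2] - crop_box[0]
    return tuple(c + o * side for c, o in zip(crop_box, offsets))


gt = (50.0, 60.0, 150.0, 180.0)    # reference box (x1, y1, x2, y2)
crop = (40.0, 70.0, 120.0, 150.0)  # square crop with side 80
offsets = box_offsets(gt, crop)
print(offsets)                       # (0.125, -0.125, 0.375, 0.375)
print(apply_offsets(crop, offsets))  # (50.0, 60.0, 150.0, 180.0)
```

Dividing by the side length makes the offsets independent of the crop size, so they stay valid after the crop is resized to 12, 24 or 48 pixels.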
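Results
Each sample txt file contains, per line: the image name, the confidence, and the four box offsets, followed by ten landmark offsets that this script leaves at 0.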
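For reading these files back, for example in a training data loader, a line can be split as in the sketch below. The field layout is taken from the `write` calls in the script; `parse_sample_line` is a hypothetical helper, and the sample line is made up.

```python
def parse_sample_line(line):
    """Split one annotation line into path, label, box offsets and landmark offsets."""
    fields = line.split()
    path = fields[0]                                     # e.g. "positive/0.jpg"
    label = int(fields[1])                               # 1 positive, 0 negative, 2 part
    box = [float(v) for v in fields[2:6]]                # x1, y1, x2, y2 offsets
    landmarks = [float(v) for v in fields[6:16]]         # ten placeholders, all 0 here
    return path, label, box, landmarks


# Made-up line in the format the script writes
line = "positive/0.jpg 1 0.125 -0.125 0.375 0.375 0 0 0 0 0 0 0 0 0 0"
path, label, box, landmarks = parse_sample_line(line)
print(path, label, box)  # positive/0.jpg 1 [0.125, -0.125, 0.375, 0.375]
```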