Preface
This data installment of the SSD code walkthrough covers the full pipeline: downloading the dataset, applying data augmentation, and finally wrapping everything into a PyTorch DataLoader.
It covers most of the augmentation techniques used in object detection: brightness, contrast, hue, cropping, expansion, and more.
Combined with the earlier 【SSD算法】史上最全代码解析-核心篇, it should give you a fresh and comprehensive understanding of SSD, and it will also help with learning other detection algorithms; the fundamentals are largely shared, and only the algorithm and network designs differ.
⛳️ Reading the two posts together works best. I hope they help!
Contents
- Downloading the data
- The dataset
- Data augmentation
- 1. Data type conversion
- 2. Transform Compose
- 3. IoU computation
- 4. Bbox coordinate conversion
- 5. Image resize
- 6. Color space conversion
- 7. Hue
- 8. Saturation
- 9. Brightness
- 10. Contrast
- 11. Channel swapping
- 12. Horizontal mirroring
- 13. Random cropping
- 14. Image expansion
- Putting it all together
Downloading the data
cd into your data folder and run the script below to download and extract the VOC2007 data (VOC2012 can be fetched the same way from its own URL).
The script:
cd ./data
echo "Downloading VOC2007 trainval ..."
# download the data
curl -LO http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
echo "Downloading VOC2007 test data ..."
curl -LO http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
echo "Done downloading."
# extract the data
echo "Extracting trainval ..."
tar -xvf VOCtrainval_06-Nov-2007.tar
echo "Extracting test ..."
tar -xvf VOCtest_06-Nov-2007.tar
echo "removing tars ..."
# remove the tarballs
rm VOCtrainval_06-Nov-2007.tar
rm VOCtest_06-Nov-2007.tar
Then tidy things up so that VOC2007 and VOC2012 sit under the same directory, like this:
├── data
│ ├── VOC
│ ├── VOCdevkit
│ ├── VOC2007
│ ├── VOC2012
The dataset
1. Annotation Transform:
✔️ VOCAnnotationTransform()
Parses VOC's xml annotations: it extracts the bbox coordinates and normalizes them, maps class names to indices via a dict, and assembles each object as [xmin, ymin, xmax, ymax, label_ind], giving [[xmin, ymin, xmax, ymax, label_ind], ...].
- VOC classes and root directory:
VOC_CLASSES = (
'aeroplane', 'bicycle', 'bird', 'boat',
'bottle', 'bus', 'car', 'cat', 'chair',
'cow', 'diningtable', 'dog', 'horse',
'motorbike', 'person', 'pottedplant',
'sheep', 'sofa', 'train', 'tvmonitor')
VOC_ROOT = "E:/TZ_WK/VOC/VOCdevkit"
VOCAnnotationTransform:
class VOCAnnotationTransform(object):
    """
    Normalizes the bbox coordinates in a VOC annotation and
    maps each class name to an integer index.
    Args:
        class_to_ind: (dict) class-name-to-index mapping
        keep_difficult: whether to keep objects marked difficult=1
    """
    def __init__(self, class_to_ind=None, keep_difficult=False):
        self.class_to_ind = class_to_ind or dict(
            zip(VOC_CLASSES, range(len(VOC_CLASSES))))
        self.keep_difficult = keep_difficult

    def __call__(self, target, width, height):
        """
        Args:
            target: the root ET.Element of a parsed xml file
            width: image width
            height: image height
        Return:
            res: list of [bbox coords, class index]
                --> e.g. [[xmin, ymin, xmax, ymax, label_ind], ...]
        """
        res = []
        for obj in target.iter('object'):
            # skip difficult objects unless requested
            difficult = int(obj.find('difficult').text) == 1
            if not self.keep_difficult and difficult:
                continue
            # read the fields we need from the xml
            name = obj.find('name').text.lower().strip()
            bbox = obj.find('bndbox')
            # the bbox corner coordinates
            pts = ['xmin', 'ymin', 'xmax', 'ymax']
            bndbox = []
            for i, pt in enumerate(pts):
                cur_pt = int(bbox.find(pt).text) - 1
                # normalize: x/w, y/h
                cur_pt = cur_pt / width if i % 2 == 0 else cur_pt / height
                bndbox.append(cur_pt)
            # look up the index of the class name
            label_idx = self.class_to_ind[name]
            bndbox.append(label_idx)
            res += [bndbox]
        return res
# quick test
if __name__ == "__main__":
    import xml.etree.ElementTree as ET
    # parse one annotation file first (the path here is a hypothetical example)
    target = ET.parse('000005.xml').getroot()
    width, height = 353, 500  # taken from the xml's <size> tag
    vocan = VOCAnnotationTransform()
    res = vocan(target, width, height)
    print('The transform res:')
    print(res)
Output:
The transform res:
[[0.13314447592067988, 0.478, 0.5495750708215298, 0.74, 11],
[0.019830028328611898, 0.022, 0.9943342776203966, 0.994, 14]]
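As a cross-check of the normalization formula, the first xmin printed above can be reproduced by hand. The width of 353 px is an assumption inferred from the output, not stated in the source:

```python
# VOC stores 1-based pixel coordinates; the transform computes (coord - 1) / size
xmin, width = 48, 353   # hypothetical values consistent with the output above
norm = (xmin - 1) / width
print(norm)  # 0.13314447592067988
```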
2. VOC Detection Dataset:
✔️ Using the annotation transform and VOC's directory structure, read the image, bboxes and labels, and build the VOC dataset.
import os
import cv2
import numpy as np
import torch
import torch.utils.data as data
import xml.etree.ElementTree as ET

class VOCDetection(data.Dataset):
    def __init__(self, root,
                 image_sets=[('2007', 'trainval'), ('2012', 'trainval')],
                 transform=None,
                 target_transform=VOCAnnotationTransform()):
        self.root = root
        self.image_set = image_sets
        self.transform = transform
        self.target_transform = target_transform
        # bboxes and labels (annotation path template)
        self._annopath = os.path.join('%s', 'Annotations', '%s.xml')
        # image path template
        self._imgpath = os.path.join('%s', 'JPEGImages', '%s.jpg')
        self.ids = list()
        for (year, name) in image_sets:
            rootpath = os.path.join(self.root, 'VOC' + year)
            for line in open(os.path.join(rootpath, 'ImageSets', 'Main', name + '.txt')):
                self.ids.append((rootpath, line.strip()))

    def __getitem__(self, index):
        img_id = self.ids[index]
        # label information
        target = ET.parse(self._annopath % img_id).getroot()
        # image data
        img = cv2.imread(self._imgpath % img_id)
        h, w, c = img.shape
        # annotation transform
        if self.target_transform is not None:
            target = self.target_transform(target, w, h)
        # transform: data augmentation
        if self.transform is not None:
            target = np.array(target)
            img, boxes, labels = self.transform(img, target[:, :4], target[:, 4])
            # convert the image to RGB
            img = img[:, :, (2, 1, 0)]
            # merge bboxes and labels into shape (N, 5)
            # (np.hstack takes a single tuple of arrays)
            target = np.hstack((boxes, np.expand_dims(labels, axis=1)))
        else:
            target = np.array(target)
        return torch.from_numpy(img).permute(2, 0, 1), target, h, w

    def __len__(self):
        return len(self.ids)
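The (N, 5) merge at the end of __getitem__ is just a horizontal stack; a quick sketch with made-up numbers:

```python
import numpy as np

# two boxes (normalized corner coords) and their class labels
boxes = np.array([[0.1, 0.2, 0.5, 0.6],
                  [0.3, 0.3, 0.9, 0.8]])
labels = np.array([11, 14])

# np.hstack expects one tuple of arrays; labels becomes a column first
target = np.hstack((boxes, np.expand_dims(labels, axis=1)))
print(target.shape)  # (2, 5)
```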
Debug code:
Data = VOCDetection(VOC_ROOT)
data_loader = data.DataLoader(Data, batch_size=1,
                              num_workers=0,
                              shuffle=True,
                              pin_memory=True)
print('the data length is:', len(data_loader))
# class name to index
class_to_ind = dict(zip(VOC_CLASSES, range(len(VOC_CLASSES))))
# index to class name
ind_to_class = {v: k for k, v in class_to_ind.items()}
# load the data
for datas in data_loader:
    img, target, h, w = datas
    img = img.squeeze(0).permute(1, 2, 0).numpy().astype(np.uint8)
    target = target[0].float()
    # scale the bbox coordinates back to the original image size
    target[:, 0] *= w.float()
    target[:, 2] *= w.float()
    target[:, 1] *= h.float()
    target[:, 3] *= h.float()
    # truncate to integers
    target = target.numpy().astype(np.int32)
    # draw the class names on the image
    for i in range(target.shape[0]):
        # draw the rectangle
        img = cv2.rectangle(img, (target[i, 0], target[i, 1]), (target[i, 2], target[i, 3]), (0, 0, 255), 2)
        # write the class name
        img = cv2.putText(img, ind_to_class[target[i, 4]], (target[i, 0], target[i, 1] - 25),
                          cv2.FONT_HERSHEY_SIMPLEX, .5, (255, 255, 0), 1)
    # show the result
    cv2.imshow('imgs', img)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
    break
Output
the data length is: 16551
![v2-a04169ceb23d43b9def054c1883c4a50_b.jpg](http://img-03.proxy.5ce.com/view/image?&type=2&guid=bd3bb71d-502f-eb11-8da9-e4434bdf6706&url=https://pic1.zhimg.com/v2-a04169ceb23d43b9def054c1883c4a50_b.jpg)
Data augmentation
1. Data type conversion
✔️ Before transforming the images, convert them from uint8 to np.float32 to make the arithmetic convenient.
class ConvertFromInts(object):
    """
    Converts the image from uint8 to float32
    """
    def __call__(self, image, boxes=None, labels=None):
        return image.astype(np.float32), boxes, labels
2. Transform Compose
✔️ There are many augmentation methods (contrast, brightness, hue, and so on), hence many transforms; Compose() chains them into a single callable.
class Compose(object):
    """
    Composes several augmentations together
    Args:
        transforms: (list[Transform]) the list of transforms to apply
    Example:
        >>> augmentations.Compose([
        >>>     transforms.CenterCrop(10),
        >>>     transforms.ToTensor(),])
    """
    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, img, boxes=None, labels=None):
        for t in self.transforms:
            img, boxes, labels = t(img, boxes, labels)
        return img, boxes, labels
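A minimal sketch of how Compose chains transforms. The two toy transforms here are made up for illustration; real transforms operate on image arrays the same way:

```python
class Compose(object):
    """Apply a list of (img, boxes, labels) transforms in order."""
    def __init__(self, transforms):
        self.transforms = transforms
    def __call__(self, img, boxes=None, labels=None):
        for t in self.transforms:
            img, boxes, labels = t(img, boxes, labels)
        return img, boxes, labels

class AddOne(object):   # toy transform: img + 1
    def __call__(self, img, boxes=None, labels=None):
        return img + 1, boxes, labels

class Double(object):   # toy transform: img * 2
    def __call__(self, img, boxes=None, labels=None):
        return img * 2, boxes, labels

aug = Compose([AddOne(), Double()])
img, boxes, labels = aug(10)  # (10 + 1) * 2
print(img)  # 22
```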
3. IoU computation
✔️ When cropping, we need the IoU between the crop rectangle and the image's bboxes to make sure the cropped patch is a valid region.
def iou_numpy(box_a, box_b):
    '''
    Computes the IoU between one box and a set of boxes;
    Args:
        box_a: multiple bounding boxes, shape [N, 4]
        box_b: the crop rectangle, a single bounding box, shape [4]
    Return:
        iou: shape [N]
    '''
    # top-left and bottom-right corners of the intersections
    lt = np.maximum(box_a[:, :2], box_b[:2])
    rb = np.minimum(box_a[:, 2:], box_b[2:])
    wh = np.clip((rb - lt), a_min=0, a_max=np.inf)
    inter = wh[:, 0] * wh[:, 1]
    area_a = ((box_a[:, 2] - box_a[:, 0]) *
              (box_a[:, 3] - box_a[:, 1]))
    area_b = ((box_b[2] - box_b[0]) *
              (box_b[3] - box_b[1]))
    # IoU = intersection / union
    union = area_a + area_b - inter
    iou = inter / union
    return iou
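A quick sanity check of the IoU math against hand-computed numbers (the function is restated here so the example runs on its own):

```python
import numpy as np

def iou_numpy(box_a, box_b):
    # intersection corners, clipped to non-negative width/height
    lt = np.maximum(box_a[:, :2], box_b[:2])
    rb = np.minimum(box_a[:, 2:], box_b[2:])
    wh = np.clip(rb - lt, a_min=0, a_max=np.inf)
    inter = wh[:, 0] * wh[:, 1]
    area_a = (box_a[:, 2] - box_a[:, 0]) * (box_a[:, 3] - box_a[:, 1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

boxes = np.array([[0., 0., 10., 10.],     # fully contains the crop
                  [20., 20., 30., 30.]])  # disjoint from the crop
crop = np.array([0., 0., 5., 5.])
# box 1: inter=25, union=100+25-25=100 -> 0.25; box 2: no overlap -> 0
iou = iou_numpy(boxes, crop)
print(iou)  # [0.25 0.  ]
```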
4. Bbox coordinate conversion
✔️ During augmentation we sometimes need absolute pixel coordinates, so the bboxes can track geometric changes, and sometimes normalized coordinates, for example when resizing.
- normalized --> original size
class ToAbsoluteCoords(object):
    """
    Converts normalized boxes back to absolute pixel coordinates
    """
    def __call__(self, image, boxes=None, labels=None):
        h, w, c = image.shape
        boxes[:, 0] *= w
        boxes[:, 2] *= w
        boxes[:, 1] *= h
        boxes[:, 3] *= h
        return image, boxes, labels
- original size --> normalized
class ToPercentCoords(object):
    """
    Normalizes absolute box coordinates by the image size
    """
    def __call__(self, image, boxes=None, labels=None):
        h, w, c = image.shape
        boxes[:, 0] = boxes[:, 0] / w
        boxes[:, 2] = boxes[:, 2] / w
        boxes[:, 1] = boxes[:, 1] / h
        boxes[:, 3] = boxes[:, 3] / h
        return image, boxes, labels
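The two classes are exact inverses of each other; a quick round-trip check (the image size and box here are made-up):

```python
import numpy as np

h, w = 500, 353  # example image size
boxes = np.array([[47., 239., 194., 370.]])  # absolute pixel coordinates

# absolute -> normalized (what ToPercentCoords does)
percent = boxes / np.array([w, h, w, h], dtype=float)
# normalized -> absolute (what ToAbsoluteCoords does)
absolute = percent * np.array([w, h, w, h], dtype=float)

print(np.allclose(absolute, boxes))  # True
```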
5. Image Resize
✔️ Input images come in all sizes; before feeding the network they must be resized to a fixed size. The boxes need no update here because they are kept normalized at this stage of the pipeline.
class Resize(object):
    """
    Resizes the image to size x size
    """
    def __init__(self, size=300):
        self.size = size

    def __call__(self, image, boxes=None, labels=None):
        image = cv2.resize(image, (self.size, self.size))
        return image, boxes, labels
![v2-bd99f4c9754ba291814ecdb040accf4c_b.jpg](http://img-01.proxy.5ce.com/view/image?&type=2&guid=bd3bb71d-502f-eb11-8da9-e4434bdf6706&url=https://pic1.zhimg.com/v2-bd99f4c9754ba291814ecdb040accf4c_b.jpg)
6. Color space conversion
✔️ For the saturation and hue changes below, the image must first be converted to the HSV color space.
class ConvertColor(object):
    """
    Converts between BGR and HSV
    """
    def __init__(self, current='BGR', transform='HSV'):
        self.current = current
        self.transform = transform

    def __call__(self, image, boxes=None, labels=None):
        # BGR to HSV
        if self.current == 'BGR' and self.transform == 'HSV':
            image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
        # HSV to BGR
        elif self.current == 'HSV' and self.transform == 'BGR':
            image = cv2.cvtColor(image, cv2.COLOR_HSV2BGR)
        else:
            raise NotImplementedError
        return image, boxes, labels
7. Hue
- The hue change is done in HSV space by shifting the H channel;
- For 32-bit float (IPL_DEPTH_32F) images, H ranges over 0-360.
class RandomHue(object):
    """
    Randomly shifts the hue (for np.float32 images in HSV space, H is in (0, 360));
    the input image must already be in HSV.
    """
    def __init__(self, delta=18.0):
        assert delta >= 0.0 and delta <= 360.0
        self.delta = delta

    def __call__(self, image, boxes=None, labels=None):
        # `random` is numpy's random module here; randint(2) returns 0 or 1
        if random.randint(2):
            print('hue')
            # shift the H channel
            image[:, :, 0] += random.uniform(-self.delta, self.delta)
            # wrap H back into its (0, 360) range
            image[:, :, 0][image[:, :, 0] > 360.0] -= 360.0
            image[:, :, 0][image[:, :, 0] < 0.0] += 360.0
        return image, boxes, labels
![v2-471894501aa41d1def827ec4734c10aa_b.jpg](http://img-01.proxy.5ce.com/view/image?&type=2&guid=bd3bb71d-502f-eb11-8da9-e4434bdf6706&url=https://pic3.zhimg.com/v2-471894501aa41d1def827ec4734c10aa_b.jpg)
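The wrap-around arithmetic in isolation, on a few hand-picked H values:

```python
import numpy as np

# H channel values near the wrap point, plus a +20 degree shift
h = np.array([350.0, 10.0, 180.0]) + 20.0   # -> [370, 30, 200]
# wrap back into the (0, 360) range, exactly as RandomHue does
h[h > 360.0] -= 360.0
h[h < 0.0] += 360.0
print(h)  # [ 10.  30. 200.]
```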
8. Saturation
- The saturation change is done in HSV space by scaling the S channel;
- For 32-bit float images, S ranges over 0-1.
class RandomSaturation(object):
    """
    Randomly scales the saturation; the input image must be in HSV
    """
    def __init__(self, lower=0.5, upper=1.5):
        self.lower = lower
        self.upper = upper
        assert self.upper >= self.lower, "saturation upper must be >= lower."
        assert self.lower >= 0, "saturation lower must be non-negative."

    def __call__(self, image, boxes=None, labels=None):
        if random.randint(2):
            print('saturation')
            image[:, :, 1] *= random.uniform(self.lower, self.upper)
            # keep S within its (0, 1) range
            image[:, :, 1] = np.clip(image[:, :, 1], 0., 1.0)
        return image, boxes, labels
![v2-9edd605bd7807542ac71b46a9a6188ae_b.jpg](http://img-02.proxy.5ce.com/view/image?&type=2&guid=bd3bb71d-502f-eb11-8da9-e4434bdf6706&url=https://pic3.zhimg.com/v2-9edd605bd7807542ac71b46a9a6188ae_b.jpg)
9. Brightness
- Brightness is changed directly on the color image (no HSV needed) by adding a delta;
- the result must be clipped back to the 0-255 range.
class RandomBrightness(object):
    """
    Randomly shifts the image brightness;
    formula: img(x) = img(x) + b
    """
    def __init__(self, delta=32):
        assert delta >= 0.0
        assert delta <= 255.0
        self.delta = delta

    def __call__(self, image, boxes=None, labels=None):
        if random.randint(2):
            delta = random.uniform(-self.delta, self.delta)
            image += delta
            # clip the image back to [0, 255]
            image = np.clip(image, 0, 255)
        return image, boxes, labels
![v2-a736dc415c48dafef5c8ab5150d58eea_b.jpg](http://img-03.proxy.5ce.com/view/image?&type=2&guid=bd3bb71d-502f-eb11-8da9-e4434bdf6706&url=https://pic3.zhimg.com/v2-a736dc415c48dafef5c8ab5150d58eea_b.jpg)
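The add-then-clip step with a fixed delta (the randomness is removed here so the result is reproducible):

```python
import numpy as np

img = np.array([[0., 100., 250.]], dtype=np.float32)
delta = 32.0                       # fixed shift instead of a random one
out = np.clip(img + delta, 0, 255) # 250 + 32 saturates at 255
print(out)  # [[ 32. 132. 255.]]
```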
10. Contrast
- Contrast is changed directly on the color image by multiplying it by an alpha factor;
- the result must be clipped back to the 0-255 range.
class RandomContrast(object):
    """
    Randomly scales the image contrast;
    formula: img(x) = a * img(x)
    """
    def __init__(self, lower=0.5, upper=1.5):
        self.lower = lower
        self.upper = upper
        assert self.upper >= self.lower, "contrast upper must be >= lower."
        assert self.lower >= 0, "contrast lower must be non-negative."

    # expects a float image
    def __call__(self, image, boxes=None, labels=None):
        if random.randint(2):
            alpha = random.uniform(self.lower, self.upper)
            image *= alpha
            # clip the image back to [0, 255]
            image = np.clip(image, 0, 255)
        return image, boxes, labels
![v2-be9cbb6df6c6db684f2ad49ba2f23059_b.jpg](http://img-02.proxy.5ce.com/view/image?&type=2&guid=bd3bb71d-502f-eb11-8da9-e4434bdf6706&url=https://pic2.zhimg.com/v2-be9cbb6df6c6db684f2ad49ba2f23059_b.jpg)
11. Channel swapping
✔️ Randomly permute the image's color channels to simulate different lighting effects.
class SwapChannels(object):
    """
    Swaps the image channels
    Args:
        swaps: (int triple) the new channel order,
            e.g. (2, 1, 0)
    """
    def __init__(self, swaps):
        self.swaps = swaps

    def __call__(self, image):
        image = image[:, :, self.swaps]
        return image

class RandomLightingNoise(object):
    """
    Color jitter produced by randomly permuting the channels
    """
    def __init__(self):
        self.perms = ((0, 1, 2), (0, 2, 1),
                      (1, 0, 2), (1, 2, 0),
                      (2, 0, 1), (2, 1, 0))

    def __call__(self, image, boxes=None, labels=None):
        if random.randint(2):
            print('RandomLightingNoise')
            swap = self.perms[random.randint(len(self.perms))]
            shuffle = SwapChannels(swap)
            image = shuffle(image)
        return image, boxes, labels
![v2-64017843faa538b46829138805549937_b.jpg](http://img-02.proxy.5ce.com/view/image?&type=2&guid=bd3bb71d-502f-eb11-8da9-e4434bdf6706&url=https://pic4.zhimg.com/v2-64017843faa538b46829138805549937_b.jpg)
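The channel permutation is plain fancy indexing on the last axis; on a one-pixel image:

```python
import numpy as np

# one pixel with channel values (B, G, R) = (1, 2, 3)
img = np.array([[[1, 2, 3]]])
swapped = img[:, :, (2, 1, 0)]   # the indexing SwapChannels uses
print(swapped[0, 0])  # [3 2 1]
```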
12. Horizontal mirroring
✔️ Mirroring flips the image left-to-right, which augments the dataset.
class RandomMirror(object):
    """
    Randomly mirrors the image horizontally
    """
    def __call__(self, image, boxes, labels):
        w = image.shape[1]
        if random.randint(2):
            # flip the image
            image = image[:, ::-1]
            # the box coordinates must change accordingly:
            # new (xmin, xmax) = (w - xmax, w - xmin)
            boxes = boxes.copy()
            boxes[:, 0::2] = w - boxes[:, 2::-2]
        return image, boxes, labels
![v2-115257981c815fce8e8ab01f638edbc5_b.jpg](http://img-01.proxy.5ce.com/view/image?&type=2&guid=bd3bb71d-502f-eb11-8da9-e4434bdf6706&url=https://pic2.zhimg.com/v2-115257981c815fce8e8ab01f638edbc5_b.jpg)
![v2-31a3c5640bb9b4d78b59acb44ce6ccf8_b.jpg](http://img-02.proxy.5ce.com/view/image?&type=2&guid=bd3bb71d-502f-eb11-8da9-e4434bdf6706&url=https://pic1.zhimg.com/v2-31a3c5640bb9b4d78b59acb44ce6ccf8_b.jpg)
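The tricky slice in the box update, with concrete numbers (image width 100):

```python
import numpy as np

w = 100
boxes = np.array([[10., 20., 30., 40.]])  # xmin, ymin, xmax, ymax

# boxes[:, 2::-2] picks columns (2, 0) = (xmax, xmin), so this writes
# xmin' = w - xmax and xmax' = w - xmin in one assignment
boxes[:, 0::2] = w - boxes[:, 2::-2]
print(boxes)  # [[70. 20. 90. 40.]]
```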
13. Random cropping
✔️ Random cropping is widely used in image augmentation. The procedure is:
- randomly pick the crop size;
- derive the crop rectangle's coordinates from that size;
- compute the IoU between the crop rectangle and the image's bounding boxes;
- discard crops whose IoU does not meet the requirement;
- crop the image and update the bounding box coordinates.
![v2-e8427caa605f609be1789d1d4350fb6f_b.jpg](http://img-03.proxy.5ce.com/view/image?&type=2&guid=bd3bb71d-502f-eb11-8da9-e4434bdf6706&url=https://pic4.zhimg.com/v2-e8427caa605f609be1789d1d4350fb6f_b.jpg)
class RandomSampleCrop(object):
    """
    Randomly crops the image
    """
    def __init__(self):
        self.sample_options = (
            # use the whole original image
            None,
            # (min_iou, max_iou)
            (0.1, None),
            (0.3, None),
            (0.7, None),
            (0.9, None),
            # randomly sample a patch
            (None, None),
        )

    def __call__(self, image, boxes=None, labels=None):
        print('crop now...')
        h, w, _ = image.shape
        while True:
            mode = random.choice(self.sample_options)
            if mode is None:
                return image, boxes, labels
            min_iou, max_iou = mode
            if min_iou is None:
                min_iou = float('-inf')
            if max_iou is None:
                max_iou = float('inf')
            # try up to 50 times for this mode
            for i in range(50):
                current_image = image
                ww = random.uniform(0.3 * w, w)
                hh = random.uniform(0.3 * h, h)
                # keep the aspect ratio within [0.5, 2]
                if hh / ww < 0.5 or hh / ww > 2:
                    continue
                left = random.uniform(0, w - ww)
                top = random.uniform(0, h - hh)
                # the crop rectangle
                rect = np.array([int(left), int(top), int(left + ww), int(top + hh)])
                # IoU between the crop rectangle and the gt boxes
                overlap = iou_numpy(boxes, rect)
                # reject crops that miss the overlap requirement
                if overlap.min() < min_iou and max_iou < overlap.max():
                    continue
                current_image = current_image[rect[1]:rect[3], rect[0]:rect[2]]
                centers = (boxes[:, :2] + boxes[:, 2:]) / 2.0
                # crop's top-left corner is above and left of each gt box center
                m1 = (rect[0] < centers[:, 0]) * (rect[1] < centers[:, 1])
                # crop's bottom-right corner is below and right of each gt box center
                m2 = (rect[2] > centers[:, 0]) * (rect[3] > centers[:, 1])
                # keep only boxes whose center lies inside the crop
                mask = m1 * m2
                if not mask.any():
                    continue
                current_boxes = boxes[mask, :].copy()
                current_labels = labels[mask]
                # clip the top-left corners to the crop rectangle
                current_boxes[:, :2] = np.maximum(current_boxes[:, :2],
                                                  rect[:2])
                # shift into the crop's coordinate system
                current_boxes[:, :2] -= rect[:2]
                # clip the bottom-right corners to the crop rectangle
                current_boxes[:, 2:] = np.minimum(current_boxes[:, 2:],
                                                  rect[2:])
                # shift into the crop's coordinate system
                current_boxes[:, 2:] -= rect[:2]
                return current_image, current_boxes, current_labels
![v2-da184fc0e2639514e26e77724dc03be3_b.jpg](http://img-03.proxy.5ce.com/view/image?&type=2&guid=bd3bb71d-502f-eb11-8da9-e4434bdf6706&url=https://pic4.zhimg.com/v2-da184fc0e2639514e26e77724dc03be3_b.jpg)
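The final clip-and-shift of a surviving box, with concrete numbers:

```python
import numpy as np

rect = np.array([20, 10, 80, 60])              # crop: xmin, ymin, xmax, ymax
boxes = np.array([[10., 20., 50., 50.]])       # one gt box
centers = (boxes[:, :2] + boxes[:, 2:]) / 2.0  # [[30., 35.]], inside the crop

# clip the box to the crop, then shift into crop coordinates
out = boxes.copy()
out[:, :2] = np.maximum(out[:, :2], rect[:2]) - rect[:2]
out[:, 2:] = np.minimum(out[:, 2:], rect[2:]) - rect[:2]
print(out)  # [[ 0. 10. 30. 40.]]
```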
14. Image expansion
✔️ Create a canvas with a random size larger than the original, fill it with a fixed pixel value (the mean), then place the original image at a random position inside it, effectively zooming out.
class Expand(object):
    """
    Randomly expands the image
    """
    def __init__(self, mean):
        self.mean = mean

    def __call__(self, image, boxes, labels):
        if random.randint(2):
            return image, boxes, labels
        h, w, c = image.shape
        ratio = random.uniform(1, 4)
        left = random.uniform(0, w * ratio - w)
        top = random.uniform(0, h * ratio - h)
        expand_image = np.zeros((int(h * ratio), int(w * ratio), c),
                                dtype=image.dtype)
        # fill with the mean value
        expand_image[:, :, :] = self.mean
        # paste the original image
        expand_image[int(top):int(top + h), int(left):int(left + w)] = image
        image = expand_image
        # shift the boxes accordingly
        boxes = boxes.copy()
        boxes[:, :2] += (int(left), int(top))
        boxes[:, 2:] += (int(left), int(top))
        return image, boxes, labels
![v2-886f144c0ff68c369523efd71fa90f11_b.jpg](http://img-03.proxy.5ce.com/view/image?&type=2&guid=bd3bb71d-502f-eb11-8da9-e4434bdf6706&url=https://pic2.zhimg.com/v2-886f144c0ff68c369523efd71fa90f11_b.jpg)
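After expansion, each box simply shifts by the paste offset; with made-up numbers:

```python
import numpy as np

left, top = 30, 50                       # where the original image was pasted
boxes = np.array([[10., 10., 20., 20.]])

# both corners shift by the same (left, top) offset
boxes[:, :2] += (left, top)
boxes[:, 2:] += (left, top)
print(boxes)  # [[40. 60. 50. 70.]]
```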
Putting it all together
✔️ Finally, all of the methods above are combined into a single Python augmentation class.
class PhotometricDistort(object):
    """
    Combines the brightness, contrast, saturation and hue transforms
    """
    def __init__(self):
        self.pd = [
            RandomContrast(),
            ConvertColor(transform='HSV'),
            RandomSaturation(),
            RandomHue(),
            ConvertColor(current='HSV', transform='BGR'),
            RandomContrast()
        ]
        self.rand_brightness = RandomBrightness()
        self.rand_light_noise = RandomLightingNoise()

    def __call__(self, image, boxes, labels):
        im = image.copy()
        im, boxes, labels = self.rand_brightness(im, boxes, labels)
        # randomly apply contrast either before or after the HSV block
        if random.randint(2):
            distort = Compose(self.pd[:-1])
        else:
            distort = Compose(self.pd[1:])
        im, boxes, labels = distort(im, boxes, labels)
        return self.rand_light_noise(im, boxes, labels)

# the class combining all of the augmentation methods
class SSDAugmentation(object):
    def __init__(self, size=300, mean=(104, 117, 123)):
        self.mean = mean
        self.size = size
        self.augment = Compose([
            ConvertFromInts(),     # convert to float32
            ToAbsoluteCoords(),    # to absolute pixel coordinates
            PhotometricDistort(),  # photometric augmentations
            Expand(self.mean),     # expand
            RandomSampleCrop(),    # crop
            RandomMirror(),        # mirror
            ToPercentCoords(),     # back to normalized coordinates
            Resize(self.size),     # resize
            ToAbsoluteCoords(),    # to absolute coordinates again
            # SubtractMeans(self.mean),  # subtract the mean
        ])

    def __call__(self, image, boxes, labels):
        return self.augment(image, boxes, labels)
Sample output:
![v2-80515ea3387357ab497a0787b405be8a_b.jpg](http://img-01.proxy.5ce.com/view/image?&type=2&guid=bd3bb71d-502f-eb11-8da9-e4434bdf6706&url=https://pic3.zhimg.com/v2-80515ea3387357ab497a0787b405be8a_b.jpg)