Training on Ubuntu with Python 3.5 + TensorFlow + Faster R-CNN for Surface Defect Detection (Part 1)

wbgan1994

Categories: Machine Learning/Deep Learning, TensorFlow. Tags: surface defect detection, faster-rcnn, VOC2007 dataset preparation

Article link: https://blog.csdn.net/ganwenbo2011/article/details/89713124

Environment: Ubuntu 18.04 + Python 3.5 (I use Anaconda3) + TensorFlow + GTX 1060 + CUDA 9.0 + cuDNN 7.3

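I won't go into the environment setup; there are plenty of tutorials online. It was painful, though: I reinstalled n times. The main problem was Ubuntu's support for NVIDIA cards, which left the machine stuck at the login screen on boot. If your boot hangs on a purple screen, see my other post on installing the graphics driver, "Ubuntu 18 stuck on a purple screen at boot: installing the NVIDIA driver online".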

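My task is defect detection on rivets. I labeled the dataset by hand: 510 source images (labeling them one at a time nearly killed me!). Because that is on the small side, I augmented the data with horizontal flipping, translation, and added Gaussian noise, expanding the set to 2K+ images.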

I. Building the VOC2007 dataset (the preparation is platform-independent and can be done on Windows)

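0. Make copies of the original images for augmentation (if your dataset is large enough, augmentation is unnecessary). Individual images should not be too large; resize them down as appropriate. Mine were too big, over 1 MB each.

I applied three transforms (flip, translation, added noise), so I copied the images three times, expanding from 510 to 2040 images.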

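1. Name the images with 6-digit numbers, 000000.jpg to 002039.jpg (this is the common convention, so there is no need to be different). Mind the image format: don't just rename a .png file to .jpg; it has to be converted. Conversion is simple: read the file with cv2.imread() and save it as .jpg with cv2.imwrite().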


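import os

def imgRename(path):
    """Rename the .jpg files in `path` to 000000.jpg, 000001.jpg, ..."""
    # Sort the listing: os.listdir() returns names in arbitrary order.
    imgList = sorted(os.path.join(path, f) for f in os.listdir(path) if f.endswith('.jpg'))
    for i in range(len(imgList)):
        print(i)
        newname = os.path.join(path, '%06d' % i + '.jpg')
        # print(newname)
        # Caution: on Linux os.rename() silently overwrites an existing newname.
        os.rename(imgList[i], newname)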
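Renaming in place can overwrite a file whose old name equals another file's new name, which is easy to hit when renumbering a copied folder to start at 510 or 1020. The following is a minimal sketch of a two-pass rename that avoids this; `rename_sequential` and its `start` parameter are hypothetical names introduced here, not part of the original script.

```python
import os
import tempfile
import uuid

def rename_sequential(folder, start=0, ext=".jpg"):
    """Rename every `ext` file in `folder` to 6-digit names beginning at `start`."""
    names = sorted(f for f in os.listdir(folder) if f.lower().endswith(ext))
    # Pass 1: move every file to a unique temporary name, so that no final
    # name can overwrite a file that has not been renamed yet.
    tag = uuid.uuid4().hex
    temp_names = []
    for k, name in enumerate(names):
        tmp = "{}_{}{}".format(tag, k, ext)
        os.rename(os.path.join(folder, name), os.path.join(folder, tmp))
        temp_names.append(tmp)
    # Pass 2: assign the zero-padded final names in the original sorted order.
    final_names = []
    for k, tmp in enumerate(temp_names):
        final = "{:06d}{}".format(start + k, ext)
        os.rename(os.path.join(folder, tmp), os.path.join(folder, final))
        final_names.append(final)
    return final_names

if __name__ == "__main__":
    demo = tempfile.mkdtemp()
    for name in ("000000.jpg", "000001.jpg"):
        open(os.path.join(demo, name), "w").close()
    # Renumber so the folder starts at 510, as the imgflip copies would.
    print(rename_sequential(demo, start=510))
```

The string formatting uses `str.format` rather than f-strings so the sketch also runs on Python 3.5.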

2. Create folders that hold 510 images each: resource holds 0-509, imgflip holds 510-1019, Image_shift holds 1020-1529, and gasuss_noise holds 1530-2039.
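The index ranges follow from simple arithmetic, sketched below. The sketch assumes that each augmented folder keeps the source images in the same order, so that copy k of source image i is numbered k * 510 + i; the helper names `folder_ranges` and `augmented_name` are hypothetical.

```python
def folder_ranges(n_source, folders):
    """Map each folder name to the (first, last) image index it holds."""
    ranges = {}
    for k, name in enumerate(folders):
        first = k * n_source
        ranges[name] = (first, first + n_source - 1)
    return ranges

def augmented_name(source_index, folder_pos, n_source, ext=".jpg"):
    """File name of the copy of image `source_index` in the folder at `folder_pos`."""
    return "{:06d}{}".format(folder_pos * n_source + source_index, ext)

# Folder names as used in this post (the spelling gasuss_noise is kept as is).
FOLDERS = ["resource", "imgflip", "Image_shift", "gasuss_noise"]
RANGES = folder_ranges(510, FOLDERS)

for folder in FOLDERS:
    print(folder, RANGES[folder])
# Source image 21 has its flipped copy at position 1 (imgflip).
print(augmented_name(21, 1, 510))
```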

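3. Annotate the original images in the resource folder. (The annotations for the augmented images are generated by script; labeling 2K+ images purely by hand would be exhausting.) I did the labeling on Windows; see the installation guide for the annotation tool labelImg.

4. Create a resource-Annotations folder and save the annotations there; each image gets a corresponding xml file.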

<annotation>
	<folder>resource-Annotations</folder>
	<filename>000021.jpg</filename>
	<path>L:\DataSet\20190311\resource-Annotations\000021.jpg</path>
	<source>
		<database>Unknown</database>
	</source>
	<size>
		<width>2550</width>
		<height>2488</height>
		<depth>3</depth>
	</size>
	<segmented>0</segmented>
	<object>
		<name>cashang</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox>
			<xmin>608</xmin>
			<ymin>1963</ymin>
			<xmax>883</xmax>
			<ymax>2209</ymax>
		</bndbox>
	</object>
	<object>
		<name>cashang</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox>
			<xmin>895</xmin>
			<ymin>2122</ymin>
			<xmax>1111</xmax>
			<ymax>2296</ymax>
		</bndbox>
	</object>
</annotation>

5. A few tags in the generated xml files need to be changed, mainly the file path, because training later reads the images from this path.

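from xml.etree import ElementTree as ET

def processXml(path):
    """Rewrite <folder>, <path> and <database> to match the VOC2007 layout."""
    tree = ET.parse(path)
    root = tree.getroot()
    # Look tags up by name; getchildren() was removed in Python 3.9.
    root.find('folder').text = 'VOC2007'
    # Build the path from this file's own <filename>, not a fixed 000000.jpg.
    filename = root.find('filename').text
    root.find('path').text = './VOC2007/JPEGImages/' + filename
    root.find('source/database').text = 'pascalvoc'

    tree.write(path, 'UTF-8')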

After modification:

<folder>VOC2007</folder>
   <filename>000021.jpg</filename>
   <path>./VOC2007/JPEGImages/000021.jpg</path>
   <source>
       <database>pascalvoc</database>
   </source>

. . . . . .

II. Data augmentation

0. Horizontal flip

import cv2

def imgflip(imgpath):
    """
    Horizontal mirror; overwrites the file at imgpath
    """
    img = cv2.imread(imgpath)
    dst = cv2.flip(img, 1)  # flipCode=1 mirrors about the vertical axis
    #newpath=imgpath[:-10]+str('%06d' % int(int(imgpath[-10:-4])))+'.jpg'
    #newpath=r'./imgflip/'+imgpath[-10:]    # your own save path
    cv2.imwrite(imgpath, dst)

1. Translation


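import cv2
import numpy as np

def Image_shift(imgpath):
    """
    Translation: shift the image content left by dw and up by dh pixels
    :param imgpath: path of the image; the file is overwritten in place
    :return: None
    """
    img = cv2.imread(imgpath)
    rows, cols = img.shape[:2]
    dw = 100  # shift left by 100 px
    dh = 200  # shift up by 200 px
    # Negative offsets move the content towards the top-left corner.
    M = np.float32([[1, 0, -dw], [0, 1, -dh]])
    # BORDER_REPLICATE fills the exposed right and bottom strips with the
    # nearest edge pixels, replacing the per-pixel fill loops.
    dst = cv2.warpAffine(img, M, (cols, rows), borderMode=cv2.BORDER_REPLICATE)
    #newpath = r'./Image_shift/'+imgpath[-10:]
    cv2.imwrite(imgpath, dst)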
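The direction of the shift follows from how the 2x3 matrix acts on pixel coordinates. Assuming cv2.warpAffine is called without the WARP_INVERSE_MAP flag, as in the function above, the matrix maps a source pixel (x, y) to its destination (x', y'); the symbols t_x and t_y are introduced here only for illustration.

```latex
\begin{pmatrix} x' \\ y' \end{pmatrix}
=
\begin{pmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \end{pmatrix}
\begin{pmatrix} x \\ y \\ 1 \end{pmatrix}
=
\begin{pmatrix} x + t_x \\ y + t_y \end{pmatrix},
\qquad t_x = -dw,\quad t_y = -dh
```

With dw = 100 and dh = 200 this gives x' = x - 100 and y' = y - 200, so the content moves towards the top-left corner and every bounding-box coordinate in the matching annotation has to decrease by the same amounts.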

2. Adding Gaussian noise

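import cv2
from skimage.util import random_noise

def addNoise(imgpath):
    """
    Add Gaussian noise
    :param imgpath: path ending in a 6-digit name such as 001530.jpg; overwritten in place
    :return: None
    """
    img = cv2.imread(imgpath)
    # The 6-digit file number is the seed, so each image gets reproducible noise.
    # random_noise() returns a float image in [0, 1].
    img_ = random_noise(img, mode='gaussian', seed=int(imgpath[-10:-4]), mean=0.005)
    # cv2.namedWindow("noise", 0)
    # cv2.imshow("noise", img_)
    # cv2.waitKey(0)
    dst = cv2.normalize(img_, None, 0, 255, cv2.NORM_MINMAX, cv2.CV_8U)
    # cv2.imread() returns BGR, so convert with BGR2GRAY rather than RGB2GRAY.
    out = cv2.cvtColor(dst, cv2.COLOR_BGR2GRAY)
    cv2.imwrite(imgpath, out)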

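3. Annotate the augmented images: copy the xml files generated for the original images, then modify them accordingly.

(1) Create the folders and copy the xml files from resource-Annotations into the other three folders.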
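The copy step can be scripted as well. This is a minimal sketch, assuming the copied xml files are renumbered with the same offsets as the images (510, 1020, 1530), which is what the later scripts rely on when they read the number from the xml file name; `copy_annotations` is a hypothetical helper.

```python
import os
import shutil
import tempfile

def copy_annotations(src_dir, dst_dir, offset):
    """Copy every NNNNNN.xml from src_dir to dst_dir, adding `offset` to its number."""
    os.makedirs(dst_dir, exist_ok=True)
    copied = []
    for name in sorted(os.listdir(src_dir)):
        stem, ext = os.path.splitext(name)
        if ext.lower() != ".xml" or not stem.isdigit():
            continue  # skip anything that is not a numbered annotation
        new_name = "{:06d}.xml".format(int(stem) + offset)
        shutil.copyfile(os.path.join(src_dir, name), os.path.join(dst_dir, new_name))
        copied.append(new_name)
    return copied

if __name__ == "__main__":
    base = tempfile.mkdtemp()
    src = os.path.join(base, "resource-Annotations")
    os.makedirs(src)
    with open(os.path.join(src, "000021.xml"), "w") as f:
        f.write("<annotation/>")
    for folder, offset in (("imgflip-Annotations", 510),
                           ("Image_shift-Annotations", 1020),
                           ("gasuss_noise-Annotations", 1530)):
        print(folder, copy_annotations(src, os.path.join(base, folder), offset))
```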

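(2) Modify the xml files in the gasuss_noise-Annotations folder

The object positions are unchanged, since only noise was added, so only the <filename> and <path> tags in the xml files need to be changed.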

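from xml.etree import ElementTree as ET

def processXml(xmlpath):
    tree = ET.parse(xmlpath)
    root = tree.getroot()
    # The xml file shares its 6-digit number with the image, e.g. 001551.xml.
    filename = xmlpath[-10:-4] + '.jpg'
    # Look tags up by name; getchildren() was removed in Python 3.9.
    root.find('filename').text = filename
    root.find('path').text = './VOC2007/JPEGImages/' + filename
    tree.write(xmlpath, 'UTF-8')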

(3) Modify the xml files in the imgflip-Annotations folder

The images were mirrored horizontally, so the <filename>, <path>, <xmin>, and <xmax> tags in the xml files need to be changed.


from xml.etree import ElementTree as ET

def processXml(xmlpath):
    tree = ET.parse(xmlpath)
    root = tree.getroot()
    filename = xmlpath[-10:-4] + '.jpg'
    root.find('filename').text = filename
    root.find('path').text = './VOC2007/JPEGImages/' + filename
    width = int(root.find('size/width').text)  # 2550 for these images
    for box in root.iter('bndbox'):
        xmin = int(box.find('xmin').text)
        xmax = int(box.find('xmax').text)
        # Mirroring about the centre line (x = 1275) maps x to width - x,
        # so the old right edge becomes the new left edge and vice versa.
        box.find('xmin').text = str(width - xmax)
        box.find('xmax').text = str(width - xmin)
    # Write once, after every box has been updated.
    tree.write(xmlpath, 'UTF-8')
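Reflecting a coordinate x about the vertical centre line at width / 2 gives width - x, whichever side of the centre it starts on. A small sketch of that rule as a pure function, using the first box from the sample annotation earlier in this post; `flip_box_horizontal` is a hypothetical name.

```python
def flip_box_horizontal(xmin, xmax, width):
    """Return (xmin, xmax) of a box after mirroring an image `width` pixels wide."""
    # The old right edge becomes the new left edge, and vice versa.
    new_xmin = width - xmax
    new_xmax = width - xmin
    return new_xmin, new_xmax

# First 'cashang' box of 000021.jpg: xmin=608, xmax=883, image width 2550.
print(flip_box_horizontal(608, 883, 2550))  # (1667, 1942)
```

The box width is preserved (883 - 608 = 1942 - 1667 = 275), and flipping twice returns the original coordinates, which makes a convenient sanity check.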

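(4) Modify the xml files in the Image_shift-Annotations folder

The images were translated, so the <filename>, <path>, <xmin>, <ymin>, <xmax>, and <ymax> tags in the xml files need to be changed. The amounts subtracted from the coordinates must equal the pixel shift that was applied to the images.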

from xml.etree import ElementTree as ET

def processXml(xmlpath):
    dw, dh = 100, 200  # must equal the left/up shift applied in Image_shift
    tree = ET.parse(xmlpath)
    root = tree.getroot()
    filename = xmlpath[-10:-4] + '.jpg'
    root.find('filename').text = filename
    root.find('path').text = './VOC2007/JPEGImages/' + filename
    for box in root.iter('bndbox'):
        for tag, delta in (('xmin', dw), ('ymin', dh), ('xmax', dw), ('ymax', dh)):
            # Clamp at 0 so a box never extends past the top or left border.
            box.find(tag).text = str(max(int(box.find(tag).text) - delta, 0))
    # Write once, after every box has been updated.
    tree.write(xmlpath, 'UTF-8')
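Clamping alone can leave a zero-width or zero-height box when a defect is shifted completely out of the frame, and such degenerate boxes can cause trouble during training. The sketch below shows one way to detect that case; `shift_box` is a hypothetical helper, and the 100/200 pixel shift is an assumption taken from the dw and dh values in Image_shift.

```python
def shift_box(box, dw, dh, width, height):
    """Shift (xmin, ymin, xmax, ymax) left by dw and up by dh, clipped to the image.

    Returns None when the shifted box no longer overlaps the image.
    """
    xmin, ymin, xmax, ymax = box
    xmin, xmax = xmin - dw, xmax - dw
    ymin, ymax = ymin - dh, ymax - dh
    # Clip to the image rectangle [0, width] x [0, height].
    xmin, xmax = max(xmin, 0), min(xmax, width)
    ymin, ymax = max(ymin, 0), min(ymax, height)
    if xmin >= xmax or ymin >= ymax:
        return None  # nothing of the defect is left inside the frame
    return xmin, ymin, xmax, ymax

# First 'cashang' box of 000021.jpg in a 2550 x 2488 image.
print(shift_box((608, 1963, 883, 2209), 100, 200, 2550, 2488))  # (508, 1763, 783, 2009)
# A box near the top-left corner is pushed out of the image entirely.
print(shift_box((30, 50, 80, 150), 100, 200, 2550, 2488))       # None
```

An object whose box comes back as None would be removed from the annotation rather than written with all-zero coordinates.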

Finally, put all images in a single JPEGImages folder and all xml files in a single Annotations folder, ready for later use.

Open any xml file and look at its contents:
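After merging, it is worth confirming that every image has an annotation with the same number and vice versa. A minimal sketch, assuming .jpg images and .xml annotations that share a file stem; `unpaired_files` is a hypothetical helper.

```python
import os
import tempfile

def unpaired_files(image_dir, annotation_dir):
    """Return (images without xml, xml without image) as sorted lists of stems."""
    images = {os.path.splitext(f)[0] for f in os.listdir(image_dir)
              if f.lower().endswith(".jpg")}
    xmls = {os.path.splitext(f)[0] for f in os.listdir(annotation_dir)
            if f.lower().endswith(".xml")}
    return sorted(images - xmls), sorted(xmls - images)

if __name__ == "__main__":
    base = tempfile.mkdtemp()
    jpeg_dir = os.path.join(base, "JPEGImages")
    anno_dir = os.path.join(base, "Annotations")
    os.makedirs(jpeg_dir)
    os.makedirs(anno_dir)
    for name in ("000000.jpg", "000001.jpg"):
        open(os.path.join(jpeg_dir, name), "w").close()
    open(os.path.join(anno_dir, "000000.xml"), "w").close()
    # 000001.jpg has no annotation yet, so it is reported in the first list.
    print(unpaired_files(jpeg_dir, anno_dir))
```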

<annotation>
	<folder>VOC2007</folder>
	<filename>000710.jpg</filename>
	<path>./VOC2007/JPEGImages/000710.jpg</path>
	<source>
		<database>pascalvoc</database>
	</source>
	<size>
		<width>2550</width>
		<height>2488</height>
		<depth>3</depth>
	</size>
	<segmented>0</segmented>
	<object>
		<name>huaheng</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox>
			<xmin>904</xmin>
			<ymin>1328</ymin>
			<xmax>1027</xmax>
			<ymax>1520</ymax>
		</bndbox>
	</object>
	<object>
		<name>huaheng</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox>
			<xmin>1897</xmin>
			<ymin>1369</ymin>
			<xmax>2032</xmax>
			<ymax>1469</ymax>
		</bndbox>
	</object>
	<object>
		<name>cashang</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox>
			<xmin>1317</xmin>
			<ymin>1415</ymin>
			<xmax>1784</xmax>
			<ymax>1538</ymax>
		</bndbox>
	</object>
</annotation>
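Inspecting one file by eye does not scale to 2040 annotations, so a scripted check can help. The following is a minimal sketch under the assumption that a valid annotation has a <path> ending in its <filename> and boxes lying inside the image with xmin < xmax and ymin < ymax; `check_annotation` and the shortened `SAMPLE` string are hypothetical.

```python
from xml.etree import ElementTree as ET

def check_annotation(xml_text):
    """Return a list of problems found in one VOC-style annotation string."""
    root = ET.fromstring(xml_text)
    problems = []
    filename = root.findtext("filename", default="")
    path = root.findtext("path", default="")
    if not path.endswith("/" + filename):
        problems.append("path does not end with filename")
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    for obj in root.findall("object"):
        name = obj.findtext("name")
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (int(box.findtext(t))
                                  for t in ("xmin", "ymin", "xmax", "ymax"))
        # A usable box has positive area and lies inside the image.
        if not (0 <= xmin < xmax <= width and 0 <= ymin < ymax <= height):
            problems.append("bad box for {}: {}".format(name, (xmin, ymin, xmax, ymax)))
    return problems

# Shortened version of the annotation shown above (first object only).
SAMPLE = """<annotation>
  <folder>VOC2007</folder>
  <filename>000710.jpg</filename>
  <path>./VOC2007/JPEGImages/000710.jpg</path>
  <size><width>2550</width><height>2488</height><depth>3</depth></size>
  <object>
    <name>huaheng</name>
    <bndbox><xmin>904</xmin><ymin>1328</ymin><xmax>1027</xmax><ymax>1520</ymax></bndbox>
  </object>
</annotation>"""

print(check_annotation(SAMPLE))  # []
```

To check a whole folder, read each file's text and pass it to the function, collecting the file names whose result is not empty.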

That completes the dataset.

Next post: modifying the faster-rcnn source code. To be continued.

Updated May 14

Post: https://blog.csdn.net/ganwenbo2011/article/details/90169980
