Training SSD on Your Own Dataset

The SSD demo describes in detail how to train and validate SSD object detection on the VOC dataset. This post describes how to train and test SSD on your own dataset, covering:
1 Annotating the dataset
2 Converting the dataset
3 Training with SSD
4 Testing with SSD

1 Annotating the dataset

Annotation is done with BBox-Label-Tool, a simple and convenient utility written in Python. The modified version used here supports multi-class labels. The label format the tool generates is:

object_number
className x1min y1min x1max y1max
className x2min y2min x2max y2max
...
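For example, a label file describing one door and one chair would read (the coordinates are illustrative):

2
door 109 3 199 204
chair 22 85 130 220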
1.1 Using the label tool

BBox-Label-Tool is fairly simple; the original git version has a few minor usability issues, so it was lightly modified. The modified version follows:
#-------------------------------------------------------------------------------
# Name:        Object bounding box label tool
# Purpose:     Label object bboxes for ImageNet Detection data
# Author:      Qiushi
# Created:     06/06/2014

#
#-------------------------------------------------------------------------------
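# NOTE: this tool targets Python 2 (Tkinter module name, print statements).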
from __future__ import division
from Tkinter import *
import tkMessageBox
from PIL import Image, ImageTk
import os
import glob
import random

# colors for the bboxes
COLORS = ['red', 'blue', 'yellow', 'pink', 'cyan', 'green', 'black']
# image sizes for the examples
SIZE = 256, 256

classLabels=['mat', 'door', 'sofa', 'chair', 'table', 'bed', 'ashcan', 'shoe']

class LabelTool():
    def __init__(self, master):
        # set up the main frame
        self.parent = master
        self.parent.title("LabelTool")
        self.frame = Frame(self.parent)
        self.frame.pack(fill=BOTH, expand=1)
        self.parent.resizable(width = False, height = False)

        # initialize global state
        self.imageDir = ''
        self.imageList= []
        self.egDir = ''
        self.egList = []
        self.outDir = ''
        self.cur = 0
        self.total = 0
        self.category = 0
        self.imagename = ''
        self.labelfilename = ''
        self.tkimg = None

        # initialize mouse state
        self.STATE = {}
        self.STATE['click'] = 0
        self.STATE['x'], self.STATE['y'] = 0, 0

        # reference to bbox
        self.bboxIdList = []
        self.bboxId = None
        self.bboxList = []
        self.hl = None
        self.vl = None
        self.currentClass = ''

        # ----------------- GUI stuff ---------------------
        # dir entry & load
        self.label = Label(self.frame, text = "Image Dir:")
        self.label.grid(row = 0, column = 0, sticky = E)
        self.entry = Entry(self.frame)
        self.entry.grid(row = 0, column = 1, sticky = W+E)
        self.ldBtn = Button(self.frame, text = "Load", command = self.loadDir)
        self.ldBtn.grid(row = 0, column = 2, sticky = W+E)

        # main panel for labeling
        self.mainPanel = Canvas(self.frame, cursor='tcross')
        self.mainPanel.bind("<Button-1>", self.mouseClick)
        self.mainPanel.bind("<Motion>", self.mouseMove)
        self.parent.bind("<Escape>", self.cancelBBox)  # press <Espace> to cancel current bbox
        self.parent.bind("s", self.cancelBBox)
        self.parent.bind("a", self.prevImage) # press 'a' to go backforward
        self.parent.bind("d", self.nextImage) # press 'd' to go forward
        self.mainPanel.grid(row = 1, column = 1, rowspan = 4, sticky = W+N)

        # showing bbox info & delete bbox
        self.lb1 = Label(self.frame, text = 'Bounding boxes:')
        self.lb1.grid(row = 1, column = 2,  sticky = W+N)
        self.listbox = Listbox(self.frame, width = 22, height = 12)
        self.listbox.grid(row = 2, column = 2, sticky = N)
        self.btnDel = Button(self.frame, text = 'Delete', command = self.delBBox)
        self.btnDel.grid(row = 3, column = 2, sticky = W+E+N)
        self.btnClear = Button(self.frame, text = 'ClearAll', command = self.clearBBox)
        self.btnClear.grid(row = 4, column = 2, sticky = W+E+N)
        
        #select class type
        self.classPanel = Frame(self.frame)
        self.classPanel.grid(row = 5, column = 1, columnspan = 10, sticky = W+E)
        label = Label(self.classPanel, text = 'class:')
        label.grid(row = 5, column = 1,  sticky = W+N)
       
        self.classbox = Listbox(self.classPanel,  width = 4, height = 2)
        self.classbox.grid(row = 5,column = 2)
        for each in range(len(classLabels)):
            function = 'select' + classLabels[each]
            print classLabels[each]
            btnMat = Button(self.classPanel, text = classLabels[each], command = getattr(self, function))
            btnMat.grid(row = 5, column = each + 3)
        
        # control panel for image navigation
        self.ctrPanel = Frame(self.frame)
        self.ctrPanel.grid(row = 6, column = 1, columnspan = 2, sticky = W+E)
        self.prevBtn = Button(self.ctrPanel, text='<< Prev', width = 10, command = self.prevImage)
        self.prevBtn.pack(side = LEFT, padx = 5, pady = 3)
        self.nextBtn = Button(self.ctrPanel, text='Next >>', width = 10, command = self.nextImage)
        self.nextBtn.pack(side = LEFT, padx = 5, pady = 3)
        self.progLabel = Label(self.ctrPanel, text = "Progress:     /    ")
        self.progLabel.pack(side = LEFT, padx = 5)
        self.tmpLabel = Label(self.ctrPanel, text = "Go to Image No.")
        self.tmpLabel.pack(side = LEFT, padx = 5)
        self.idxEntry = Entry(self.ctrPanel, width = 5)
        self.idxEntry.pack(side = LEFT)
        self.goBtn = Button(self.ctrPanel, text = 'Go', command = self.gotoImage)
        self.goBtn.pack(side = LEFT)

        # example panel for illustration
        self.egPanel = Frame(self.frame, border = 10)
        self.egPanel.grid(row = 1, column = 0, rowspan = 5, sticky = N)
        self.tmpLabel2 = Label(self.egPanel, text = "Examples:")
        self.tmpLabel2.pack(side = TOP, pady = 5)
        self.egLabels = []
        for i in range(3):
            self.egLabels.append(Label(self.egPanel))
            self.egLabels[-1].pack(side = TOP)

        # display mouse position
        self.disp = Label(self.ctrPanel, text='')
        self.disp.pack(side = RIGHT)

        self.frame.columnconfigure(1, weight = 1)
        self.frame.rowconfigure(10, weight = 1)

        # for debugging
##        self.setImage()
##        self.loadDir()

    def loadDir(self, dbg = False):
        if not dbg:
            s = self.entry.get()
            self.parent.focus()
            self.category = int(s)
        else:
            s = r'D:\workspace\python\labelGUI'
##        if not os.path.isdir(s):
##            tkMessageBox.showerror("Error!", message = "The specified dir doesn't exist!")
##            return
        # get image list
        self.imageDir = os.path.join(r'./Images', '%d' %(self.category))
        self.imageList = glob.glob(os.path.join(self.imageDir, '*.jpg'))
        if len(self.imageList) == 0:
            print 'No .jpg images found in the specified dir!'
            return   

        # set up output dir
        self.outDir = os.path.join(r'./Labels', '%d' %(self.category))
        if not os.path.exists(self.outDir):
            os.mkdir(self.outDir)
        
        labeledPicList = glob.glob(os.path.join(self.outDir, '*.txt'))
        
        for label in labeledPicList:
            data = open(label, 'r')
            if '0\n' == data.read():
                data.close()
                continue
            data.close()
            picture = label.replace('Labels', 'Images').replace('.txt', '.jpg')
            if picture in self.imageList:
                self.imageList.remove(picture)
        # default to the 1st image in the collection
        self.cur = 1
        self.total = len(self.imageList)
        self.loadImage()
        print '%d images loaded from %s' %(self.total, s)

    def loadImage(self):
        # load image
        imagepath = self.imageList[self.cur - 1]
        self.img = Image.open(imagepath)
        self.imgSize = self.img.size
        self.tkimg = ImageTk.PhotoImage(self.img)
        self.mainPanel.config(width = max(self.tkimg.width(), 400), height = max(self.tkimg.height(), 400))
        self.mainPanel.create_image(0, 0, image = self.tkimg, anchor=NW)
        self.progLabel.config(text = "%04d/%04d" %(self.cur, self.total))

        # load labels
        self.clearBBox()
        self.imagename = os.path.split(imagepath)[-1].split('.')[0]
        labelname = self.imagename + '.txt'
        self.labelfilename = os.path.join(self.outDir, labelname)
        bbox_cnt = 0
        if os.path.exists(self.labelfilename):
            with open(self.labelfilename) as f:
                for (i, line) in enumerate(f):
                    if i == 0:
                        bbox_cnt = int(line.strip())
                        continue
                    # saveImage writes "className x1 y1 x2 y2", so skip the
                    # class name before converting the coordinates to int
                    parts = line.split()
                    cls, tmp = parts[0], [int(t.strip()) for t in parts[1:]]
                    self.bboxList.append(tuple([cls] + tmp))
                    tmpId = self.mainPanel.create_rectangle(tmp[0], tmp[1], \
                                                            tmp[2], tmp[3], \
                                                            width = 2, \
                                                            outline = COLORS[(len(self.bboxList)-1) % len(COLORS)])
                    self.bboxIdList.append(tmpId)
                    self.listbox.insert(END, '%s (%d, %d) -> (%d, %d)' %(cls, tmp[0], tmp[1], tmp[2], tmp[3]))
                    self.listbox.itemconfig(len(self.bboxIdList) - 1, fg = COLORS[(len(self.bboxIdList) - 1) % len(COLORS)])

    def saveImage(self):
        with open(self.labelfilename, 'w') as f:
            f.write('%d\n' %len(self.bboxList))
            for bbox in self.bboxList:
                f.write(' '.join(map(str, bbox)) + '\n')
        print 'Image No. %d saved' %(self.cur)


    def mouseClick(self, event):
        if self.STATE['click'] == 0:
            self.STATE['x'], self.STATE['y'] = event.x, event.y
            #self.STATE['x'], self.STATE['y'] = self.imgSize[0], self.imgSize[1]
        else:
            x1, x2 = min(self.STATE['x'], event.x), max(self.STATE['x'], event.x)
            y1, y2 = min(self.STATE['y'], event.y), max(self.STATE['y'], event.y)
            if x2 > self.imgSize[0]:
                x2 = self.imgSize[0]
            if y2 > self.imgSize[1]:
                y2 = self.imgSize[1]                
            self.bboxList.append((self.currentClass, x1, y1, x2, y2))
            self.bboxIdList.append(self.bboxId)
            self.bboxId = None
            self.listbox.insert(END, '(%d, %d) -> (%d, %d)' %(x1, y1, x2, y2))
            self.listbox.itemconfig(len(self.bboxIdList) - 1, fg = COLORS[(len(self.bboxIdList) - 1) % len(COLORS)])
        self.STATE['click'] = 1 - self.STATE['click']

    def mouseMove(self, event):
        self.disp.config(text = 'x: %d, y: %d' %(event.x, event.y))
        if self.tkimg:
            if self.hl:
                self.mainPanel.delete(self.hl)
            self.hl = self.mainPanel.create_line(0, event.y, self.tkimg.width(), event.y, width = 2)
            if self.vl:
                self.mainPanel.delete(self.vl)
            self.vl = self.mainPanel.create_line(event.x, 0, event.x, self.tkimg.height(), width = 2)
        if 1 == self.STATE['click']:
            if self.bboxId:
                self.mainPanel.delete(self.bboxId)
            self.bboxId = self.mainPanel.create_rectangle(self.STATE['x'], self.STATE['y'], \
                                                            event.x, event.y, \
                                                            width = 2, \
                                                            outline = COLORS[len(self.bboxList) % len(COLORS)])

    def cancelBBox(self, event):
        if 1 == self.STATE['click']:
            if self.bboxId:
                self.mainPanel.delete(self.bboxId)
                self.bboxId = None
                self.STATE['click'] = 0

    def delBBox(self):
        sel = self.listbox.curselection()
        if len(sel) != 1 :
            return
        idx = int(sel[0])
        self.mainPanel.delete(self.bboxIdList[idx])
        self.bboxIdList.pop(idx)
        self.bboxList.pop(idx)
        self.listbox.delete(idx)

    def clearBBox(self):
        for idx in range(len(self.bboxIdList)):
            self.mainPanel.delete(self.bboxIdList[idx])
        self.listbox.delete(0, len(self.bboxList))
        self.bboxIdList = []
        self.bboxList = []
        
    def selectmat(self):
        self.currentClass = 'mat'
        self.classbox.delete(0,END)
        self.classbox.insert(0, 'mat')
        self.classbox.itemconfig(0,fg = COLORS[0])
    
    def selectdoor(self):
        self.currentClass = 'door'    
        self.classbox.delete(0,END)    
        self.classbox.insert(0, 'door')
        self.classbox.itemconfig(0,fg = COLORS[0])
    
    def selectsofa(self):
        self.currentClass = 'sofa'    
        self.classbox.delete(0,END)    
        self.classbox.insert(0, 'sofa')
        self.classbox.itemconfig(0,fg = COLORS[0])
        
    def selectchair(self):
        self.currentClass = 'chair'    
        self.classbox.delete(0,END)    
        self.classbox.insert(0, 'chair')
        self.classbox.itemconfig(0,fg = COLORS[0])
        
    def selecttable(self):
        self.currentClass = 'table'    
        self.classbox.delete(0,END)    
        self.classbox.insert(0, 'table')
        self.classbox.itemconfig(0,fg = COLORS[0])
        
    def selectbed(self):
        self.currentClass = 'bed'
        self.classbox.delete(0,END)    
        self.classbox.insert(0, 'bed')
        self.classbox.itemconfig(0,fg = COLORS[0])
        
    def selectashcan(self):
        self.currentClass = 'ashcan'    
        self.classbox.delete(0,END)    
        self.classbox.insert(0, 'ashcan')
        self.classbox.itemconfig(0,fg = COLORS[0])
        
    def selectshoe(self):
        self.currentClass = 'shoe'    
        self.classbox.delete(0,END)    
        self.classbox.insert(0, 'shoe')
        self.classbox.itemconfig(0,fg = COLORS[0])    

    def prevImage(self, event = None):
        self.saveImage()
        if self.cur > 1:
            self.cur -= 1
            self.loadImage()

    def nextImage(self, event = None):
        self.saveImage()
        if self.cur < self.total:
            self.cur += 1
            self.loadImage()

    def gotoImage(self):
        idx = int(self.idxEntry.get())
        if 1 <= idx and idx <= self.total:
            self.saveImage()
            self.cur = idx
            self.loadImage()

##    def setImage(self, imagepath = r'test2.png'):
##        self.img = Image.open(imagepath)
##        self.tkimg = ImageTk.PhotoImage(self.img)
##        self.mainPanel.config(width = self.tkimg.width())
##        self.mainPanel.config(height = self.tkimg.height())
##        self.mainPanel.create_image(0, 0, image = self.tkimg, anchor=NW)

if __name__ == '__main__':
    root = Tk()
    tool = LabelTool(root)
    root.mainloop()

Usage:

(1) Under BBox-Label-Tool/Images, create a directory to hold the images, named with a number (e.g. BBox-Label-Tool/Images/1), then copy the images to be annotated into that directory.

(2) In the BBox-Label-Tool directory, run: python main.py

(3) In the tool's UI, type the directory name to label (e.g. 1) into the Image Dir box and click Load; the tool automatically loads the images under Images/1. Note that images that have already been annotated are not reloaded when you click Load.

(4) The tool supports multi-class annotation: select the class first, then draw the bounding box.

(5) When an image is finished, click Next >> to move to the next one. Each successfully labeled image produces a label file with the matching name under the corresponding BBox-Label-Tool/Labels directory, as in the layout sketched below.
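A sketch of the expected layout (the directory name 1 is just an example):

BBox-Label-Tool/
    main.py
    Images/
        1/
            1.jpg
            2.jpg
    Labels/
        1/
            1.txt
            2.txt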

2 Converting the dataset

Caffe trains on data in LMDB format, and the SSD framework ships with scripts that convert VOC-format data to LMDB. In practice the BBox-Label-Tool annotations are therefore first converted to the VOC format and then to LMDB.

2.1 The VOC data format

(1) Annotations holds the label information as xml files:
<?xml version="1.0" ?>
<annotation>
    <folder>VOC2007</folder>
    <filename>1.jpg</filename>
    <source>
        <database>My Database</database>
        <annotation>VOC2007</annotation>
        <image>flickr</image>
        <flickrid>NULL</flickrid>
    </source>
    <owner>
        <flickrid>NULL</flickrid>
        <name>idaneel</name>
    </owner>
    <size>
        <width>320</width>
        <height>240</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented>
    <object>
        <name>door</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>109</xmin>
            <ymin>3</ymin>
            <xmax>199</xmax>
            <ymax>204</ymax>
        </bndbox>
    </object>
</annotation>

(2) ImageSets/Main holds the text files that define which images are used for training and which for testing.

(3) JPEGImages holds all of the images.

(4) label holds the bounding-box files produced by BBox-Label-Tool; these are the label files to be converted.
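For reference, the files under ImageSets/Main (trainval.txt, test.txt) simply list image IDs, one per line, e.g.:

1
2
5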
2.2 Converting the labels to VOC format

The bounding-box files written by BBox-Label-Tool are converted to the VOC data format in two steps:
(1) convert the txt bounding-box files into the VOC xml representation;
(2) generate the training-set and test-set lists.
Both steps are implemented in Python. createXml.py performs the txt-to-xml conversion; run it with ./createXml.py

#!/usr/bin/env python

import os
import sys
import cv2
from itertools import islice
from xml.dom.minidom import Document
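# NOTE: run from the dataset root, which must already contain the label/,
# JPEGImages/ and Annotations/ directories referenced below.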

labels='label'
imgpath='JPEGImages/'
xmlpath_new='Annotations/'
foldername='VOC2007'


def insertObject(doc, datas):
    obj = doc.createElement('object')
    name = doc.createElement('name')
    name.appendChild(doc.createTextNode(datas[0]))
    obj.appendChild(name)
    pose = doc.createElement('pose')
    pose.appendChild(doc.createTextNode('Unspecified'))
    obj.appendChild(pose)
    truncated = doc.createElement('truncated')
    truncated.appendChild(doc.createTextNode(str(0)))
    obj.appendChild(truncated)
    difficult = doc.createElement('difficult')
    difficult.appendChild(doc.createTextNode(str(0)))
    obj.appendChild(difficult)
    bndbox = doc.createElement('bndbox')
    
    xmin = doc.createElement('xmin')
    xmin.appendChild(doc.createTextNode(str(datas[1])))
    bndbox.appendChild(xmin)
    
    ymin = doc.createElement('ymin')                
    ymin.appendChild(doc.createTextNode(str(datas[2])))
    bndbox.appendChild(ymin)                
    xmax = doc.createElement('xmax')                
    xmax.appendChild(doc.createTextNode(str(datas[3])))
    bndbox.appendChild(xmax)                
    ymax = doc.createElement('ymax')    
    if  '\r' == str(datas[4])[-1] or '\n' == str(datas[4])[-1]:
        data = str(datas[4])[0:-1]
    else:
        data = str(datas[4])
    ymax.appendChild(doc.createTextNode(data))
    bndbox.appendChild(ymax)
    obj.appendChild(bndbox)                
    return obj

def create():
    for walk in os.walk(labels):
        for each in walk[2]:
            fidin=open(walk[0] + '/'+ each,'r')
            objIndex = 0
            for data in islice(fidin, 1, None):        
                objIndex += 1
                data=data.strip('\n')
                datas = data.split(' ')
                if 5 != len(datas):
                    print 'bounding box information error'
                    continue
                pictureName = each.replace('.txt', '.jpg')
                imageFile = imgpath + pictureName
                img = cv2.imread(imageFile)
                imgSize = img.shape
                if 1 == objIndex:
                    xmlName = each.replace('.txt', '.xml')
                    f = open(xmlpath_new + xmlName, "w")
                    doc = Document()
                    annotation = doc.createElement('annotation')
                    doc.appendChild(annotation)
                    
                    folder = doc.createElement('folder')
                    folder.appendChild(doc.createTextNode(foldername))
                    annotation.appendChild(folder)
                    
                    filename = doc.createElement('filename')
                    filename.appendChild(doc.createTextNode(pictureName))
                    annotation.appendChild(filename)
                    
                    source = doc.createElement('source')                
                    database = doc.createElement('database')
                    database.appendChild(doc.createTextNode('My Database'))
                    source.appendChild(database)
                    source_annotation = doc.createElement('annotation')
                    source_annotation.appendChild(doc.createTextNode(foldername))
                    source.appendChild(source_annotation)
                    image = doc.createElement('image')
                    image.appendChild(doc.createTextNode('flickr'))
                    source.appendChild(image)
                    flickrid = doc.createElement('flickrid')
                    flickrid.appendChild(doc.createTextNode('NULL'))
                    source.appendChild(flickrid)
                    annotation.appendChild(source)
                    
                    owner = doc.createElement('owner')
                    flickrid = doc.createElement('flickrid')
                    flickrid.appendChild(doc.createTextNode('NULL'))
                    owner.appendChild(flickrid)
                    name = doc.createElement('name')
                    name.appendChild(doc.createTextNode('idaneel'))
                    owner.appendChild(name)
                    annotation.appendChild(owner)
                    
                    size = doc.createElement('size')
                    width = doc.createElement('width')
                    width.appendChild(doc.createTextNode(str(imgSize[1])))
                    size.appendChild(width)
                    height = doc.createElement('height')
                    height.appendChild(doc.createTextNode(str(imgSize[0])))
                    size.appendChild(height)
                    depth = doc.createElement('depth')
                    depth.appendChild(doc.createTextNode(str(imgSize[2])))
                    size.appendChild(depth)
                    annotation.appendChild(size)
                    
                    segmented = doc.createElement('segmented')
                    segmented.appendChild(doc.createTextNode(str(0)))
                    annotation.appendChild(segmented)            
                    annotation.appendChild(insertObject(doc, datas))
                else:
                    annotation.appendChild(insertObject(doc, datas))
            try:
                f.write(doc.toprettyxml(indent = '    '))
                f.close()
                fidin.close()
            except:
                # the label file contained no objects, so f/doc were never created
                pass
   
          
if __name__ == '__main__':
    create()

createTest.py generates the training-set and test-set list files; run it with:

./createTest.py %startID% %endID% %testNumber%

#!/usr/bin/env python
import os
import sys
import random
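# NOTE: run from the dataset root; the ImageSets/Main directory written
# below must already exist.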
try:
    start = int(sys.argv[1])
    end = int(sys.argv[2])
    test = int(sys.argv[3])
    allNum = end-start+1
except:
    print 'Please input picture range'
    print './createTest.py 1 1500 500'
    os._exit(0)
b_list = range(start, end + 1)   # include endID as a candidate
blist_webId = random.sample(b_list, test)
blist_webId = sorted(blist_webId) 
allFile = []
testFile = open('ImageSets/Main/test.txt', 'w')
trainFile = open('ImageSets/Main/trainval.txt', 'w')
for i in range(start, end + 1):
    allFile.append(i)
for testId in blist_webId:   # renamed so it doesn't shadow the 'test' count above
    allFile.remove(testId)
    testFile.write(str(testId) + '\n')
for train in allFile:
    trainFile.write(str(train) + '\n')
testFile.close()
trainFile.close()

Note: because BBox-Label-Tool is relatively simple, the original tool can only label one class at a time, so the conversion script likewise converts one class per run; this still needs improvement.

The optimized BBox-Label-Tool supports multi-class annotation and adds the class name to the generated label files.

To use it, edit classLabels to list your own classes; the modified tool is the main.py shown in section 1.1.

2.3 Converting VOC data to LMDB

SSD provides scripts that convert VOC data to LMDB: data/VOC0712/create_list.sh and data/VOC0712/create_data.sh. Both are written specifically for the data under the VOC0712 directory, so to avoid disturbing that data they were copied and adapted to our own dataset, replacing the VOC0712-specific information with our own directory name; here the dataset is called indoor. The steps:
(1) create an indoor directory under $HOME/data/VOCdevkit to hold your converted VOC dataset;
(2) create an indoor directory under $CAFFE_ROOT/examples;
(3) create an indoor directory under $CAFFE_ROOT/data, copy create_list.sh, create_data.sh and labelmap_voc.prototxt from data/VOC0712 into it, and rename them create_list_indoor.sh, create_data_indoor.sh and labelmap_indoor.prototxt (see the command sketch after this list);
(4) edit the two new create scripts, replacing the VOC0712-specific information with indoor.
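As a sketch, steps (1)-(3) as shell commands (assuming $CAFFE_ROOT points at your SSD caffe checkout):

mkdir -p $HOME/data/VOCdevkit/indoor
mkdir -p $CAFFE_ROOT/examples/indoor
mkdir -p $CAFFE_ROOT/data/indoor
cd $CAFFE_ROOT/data
cp VOC0712/create_list.sh indoor/create_list_indoor.sh
cp VOC0712/create_data.sh indoor/create_data_indoor.sh
cp VOC0712/labelmap_voc.prototxt indoor/labelmap_indoor.prototxt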
The two modified scripts follow. First, create_list_indoor.sh:
#!/bin/bash

root_dir=$HOME/data/VOCdevkit/
sub_dir=ImageSets/Main
bash_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"

for dataset in trainval test    
do
  dst_file=$bash_dir/$dataset.txt
  if [ -f $dst_file ]
  then
    rm -f $dst_file
  fi
  for name in indoor
  do
    if [[ $dataset == "test" && $name == "VOC2012" ]]
    then
      continue
    fi
    echo "Create list for $name $dataset..."
    dataset_file=$root_dir/$name/$sub_dir/$dataset.txt

    img_file=$bash_dir/$dataset"_img.txt"
    cp $dataset_file $img_file
    sed -i "s/^/$name\/JPEGImages\//g" $img_file
    sed -i "s/$/.jpg/g" $img_file

    label_file=$bash_dir/$dataset"_label.txt"
    cp $dataset_file $label_file
    sed -i "s/^/$name\/Annotations\//g" $label_file
    sed -i "s/$/.xml/g" $label_file

    paste -d' ' $img_file $label_file >> $dst_file

    rm -f $label_file
    rm -f $img_file
  done
  # Generate image name and size information.
  if [ $dataset == "test" ]
  then
    $bash_dir/../../build/tools/get_image_size $root_dir $dst_file $bash_dir/$dataset"_name_size.txt"
  fi

  # Shuffle trainval file.
  if [ $dataset == "trainval" ]
  then
    rand_file=$dst_file.random
    cat $dst_file | perl -MList::Util=shuffle -e 'print shuffle(<STDIN>);' > $rand_file
    mv $rand_file $dst_file
  fi
done
And create_data_indoor.sh:
cur_dir=$(cd $( dirname ${BASH_SOURCE[0]} ) && pwd )
root_dir=$cur_dir/../..

cd $root_dir

redo=1
data_root_dir="$HOME/data/VOCdevkit"
dataset_name="indoor"
mapfile="$root_dir/data/$dataset_name/labelmap_indoor.prototxt"
anno_type="detection"
db="lmdb"
min_dim=0
max_dim=0
width=0
height=0

extra_cmd="--encode-type=jpg --encoded"
if [ $redo ]
then
  extra_cmd="$extra_cmd --redo"
fi
for subset in test trainval
do
  python $root_dir/scripts/create_annoset.py --anno-type=$anno_type --label-map-file=$mapfile --min-dim=$min_dim --max-dim=$max_dim --resize-width=$width --resize-height=$height --check-label $extra_cmd $data_root_dir $root_dir/data/$dataset_name/$subset.txt $data_root_dir/$dataset_name/$db/$dataset_name"_"$subset"_"$db examples/$dataset_name
done
(5) Edit labelmap_indoor.prototxt so that the classes match your own dataset; note that label 0 must remain the background class:
item {
  name: "none_of_the_above"
  label: 0
  display_name: "background"
}
item {
  name: "door"
  label: 1
  display_name: "door"
}

With the modifications above complete, the LMDB data can be built by running, from $CAFFE_ROOT:

./data/indoor/create_list_indoor.sh

./data/indoor/create_data_indoor.sh

When the commands finish, the converted LMDB databases are created under $HOME/data/VOCdevkit/indoor/lmdb, with symlinks placed under $CAFFE_ROOT/examples/indoor.
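The expected output (the paths follow from the variables in create_data_indoor.sh above):

$HOME/data/VOCdevkit/indoor/lmdb/indoor_trainval_lmdb
$HOME/data/VOCdevkit/indoor/lmdb/indoor_test_lmdb
$CAFFE_ROOT/examples/indoor/indoor_trainval_lmdb   (symlink to the above)
$CAFFE_ROOT/examples/indoor/indoor_test_lmdb       (symlink to the above)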

3 Training SSD on your own dataset

Training starts from the pretrained VGG model provided with the SSD demo, VGG_ILSVRC_16_layers_fc_reduced.caffemodel; save it under $CAFFE_ROOT/models/VGGNet.
Copy ssd_pascal.py to a new file ssd_pascal_indoor.py and adapt it to your dataset. The main changes:
(1) point train_data and test_data at your own LMDB:
    train_data = "examples/indoor/indoor_trainval_lmdb"
    test_data = "examples/indoor/indoor_test_lmdb"
(2) set num_test_image to the number of test images in your dataset;
(3) set num_classes to the number of label classes + 1 (the extra one is the background class); for the door-only labelmap above this gives num_classes = 2.

For my dataset, ssd_pascal_indoor.py reads:
from __future__ import print_function
import caffe
from caffe.model_libs import *
from google.protobuf import text_format

import math
import os
import shutil
import stat
import subprocess
import sys

# Add extra layers on top of a "base" network (e.g. VGGNet or Inception).
def AddExtraLayers(net, use_batchnorm=True):
    use_relu = True

    # Add additional convolutional layers.
    from_layer = net.keys()[-1]
    # TODO(weiliu89): Construct the name using the last layer to avoid duplication.
    out_layer = "conv6_1"
    ConvBNLayer(net, from_layer, out_layer, use_batchnorm, use_relu, 256, 1, 0, 1)

    from_layer = out_layer
    out_layer = "conv6_2"
    ConvBNLayer(net, from_layer, out_layer, use_batchnorm, use_relu, 512, 3, 1, 2)

    for i in xrange(7, 9):
      from_layer = out_layer
      out_layer = "conv{}_1".format(i)
      ConvBNLayer(net, from_layer, out_layer, use_batchnorm, use_relu, 128, 1, 0, 1)

      from_layer = out_layer
      out_layer = "conv{}_2".format(i)
      ConvBNLayer(net, from_layer, out_layer, use_batchnorm, use_relu, 256, 3, 1, 2)

    # Add global pooling layer.
    name = net.keys()[-1]
    net.pool6 = L.Pooling(net[name], pool=P.Pooling.AVE, global_pooling=True)

    return net


### Modify the following parameters accordingly ###
# The directory which contains the caffe code.
# We assume you are running the script at the CAFFE_ROOT.
caffe_root = os.getcwd()

# Set true if you want to start training right after generating all files.
run_soon = True
# Set true if you want to load from most recently saved snapshot.
# Otherwise, we will load from the pretrain_model defined below.
resume_training = True
# If true, Remove old model files.
remove_old_models = False

# The database file for training data. Created by data/VOC0712/create_data.sh
train_data = "examples/indoor/indoor_trainval_lmdb"
# The database file for testing data. Created by data/VOC0712/create_data.sh
test_data = "examples/indoor/indoor_test_lmdb"
# Specify the batch sampler.
resize_width = 300
resize_height = 300
resize = "{}x{}".format(resize_width, resize_height)
batch_sampler = [
        {
                'sampler': {
                        },
                'max_trials': 1,
                'max_sample': 1,
        },
        {
                'sampler': {
                        'min_scale': 0.3,
                        'max_scale': 1.0,
                        'min_aspect_ratio': 0.5,
                        'max_aspect_ratio': 2.0,
                        },
                'sample_constraint': {
                        'min_jaccard_overlap': 0.1,
                        },
                'max_trials': 50,
                'max_sample': 1,
        },
        {
                'sampler': {
                        'min_scale': 0.3,
                        'max_scale': 1.0,
                        'min_aspect_ratio': 0.5,
                        'max_aspect_ratio': 2.0,
                        },
                'sample_constraint': {
                        'min_jaccard_overlap': 0.3,
                        },
                'max_trials': 50,
                'max_sample': 1,
        },
        {
                'sampler': {
                        'min_scale': 0.3,
                        'max_scale': 1.0,
                        'min_aspect_ratio': 0.5,
                        'max_aspect_ratio': 2.0,
                        },
                'sample_constraint': {
                        'min_jaccard_overlap': 0.5,
                        },
                'max_trials': 50,
                'max_sample': 1,
        },
        {
                'sampler': {
                        'min_scale': 0.3,
                        'max_scale': 1.0,
                        'min_aspect_ratio': 0.5,
                        'max_aspect_ratio': 2.0,
                        },
                'sample_constraint': {
                        'min_jaccard_overlap': 0.7,
                        },
                'max_trials': 50,
                'max_sample': 1,
        },
        {
                'sampler': {
                        'min_scale': 0.3,
                        'max_scale': 1.0,
                        'min_aspect_ratio': 0.5,
                        'max_aspect_ratio': 2.0,
                        },
                'sample_constraint': {
                        'min_jaccard_overlap': 0.9,
                        },
                'max_trials': 50,
                'max_sample': 1,
        },
        {
                'sampler': {
                        'min_scale': 0.3,
                        'max_scale': 1.0,
                        'min_aspect_ratio': 0.5,
                        'max_aspect_ratio': 2.0,
                        },
                'sample_constraint': {
                        'max_jaccard_overlap': 1.0,
                        },
                'max_trials': 50,
                'max_sample': 1,
        },
        ]
train_transform_param = {
        'mirror': True,
        'mean_value': [104, 117, 123],
        'resize_param': {
                'prob': 1,
                'resize_mode': P.Resize.WARP,
                'height': resize_height,
                'width': resize_width,
                'interp_mode': [
                        P.Resize.LINEAR,
                        P.Resize.AREA,
                        P.Resize.NEAREST,
                        P.Resize.CUBIC,
                        P.Resize.LANCZOS4,
                        ],
                },
        'emit_constraint': {
            'emit_type': caffe_pb2.EmitConstraint.CENTER,
            }
        }
test_transform_param = {
        'mean_value': [104, 117, 123],
        'resize_param': {
                'prob': 1,
                'resize_mode': P.Resize.WARP,
                'height': resize_height,
                'width': resize_width,
                'interp_mode': [P.Resize.LINEAR],
                },
        }

# If true, use batch norm for all newly added layers.
# Currently only the non batch norm version has been tested.
use_batchnorm = False
# Use different initial learning rate.
if use_batchnorm:
    base_lr = 0.0004
else:
    # A learning rate for batch_size = 1, num_gpus = 1.
    base_lr = 0.00004

# Modify the job name if you want.
job_name = "SSD_{}".format(resize)
# The name of the model. Modify it if you want.
model_name = "VGG_VOC0712_{}".format(job_name)

# Directory which stores the model .prototxt file.
save_dir = "models/VGGNet/VOC0712/{}".format(job_name)
# Directory which stores the snapshot of models.
snapshot_dir = "models/VGGNet/VOC0712/{}".format(job_name)
# Directory which stores the job script and log file.
job_dir = "jobs/VGGNet/VOC0712/{}".format(job_name)
# Directory which stores the detection results.
output_result_dir = "{}/data/VOCdevkit/results/VOC2007/{}/Main".format(os.environ['HOME'], job_name)

# model definition files.
train_net_file = "{}/train.prototxt".format(save_dir)
test_net_file = "{}/test.prototxt".format(save_dir)
deploy_net_file = "{}/deploy.prototxt".format(save_dir)
solver_file = "{}/solver.prototxt".format(save_dir)
# snapshot prefix.
snapshot_prefix = "{}/{}".format(snapshot_dir, model_name)
# job script path.
job_file = "{}/{}.sh".format(job_dir, model_name)

# Stores the test image names and sizes. Created by data/VOC0712/create_list.sh
name_size_file = "data/indoor/test_name_size.txt"
# The pretrained model. We use the Fully convolutional reduced (atrous) VGGNet.
pretrain_model = "models/VGGNet/VGG_ILSVRC_16_layers_fc_reduced.caffemodel"
# Stores LabelMapItem.
label_map_file = "data/indoor/labelmap_indoor.prototxt"

# MultiBoxLoss parameters.
num_classes = 2
share_location = True
background_label_id=0
train_on_diff_gt = True
normalization_mode = P.Loss.VALID
code_type = P.PriorBox.CENTER_SIZE
neg_pos_ratio = 3.
loc_weight = (neg_pos_ratio + 1.) / 4.
multibox_loss_param = {
    'loc_loss_type': P.MultiBoxLoss.SMOOTH_L1,
    'conf_loss_type': P.MultiBoxLoss.SOFTMAX,
    'loc_weight': loc_weight,
    'num_classes': num_classes,
    'share_location': share_location,
    'match_type': P.MultiBoxLoss.PER_PREDICTION,
    'overlap_threshold': 0.5,
    'use_prior_for_matching': True,
    'background_label_id': background_label_id,
    'use_difficult_gt': train_on_diff_gt,
    'do_neg_mining': True,
    'neg_pos_ratio': neg_pos_ratio,
    'neg_overlap': 0.5,
    'code_type': code_type,
    }
loss_param = {
    'normalization': normalization_mode,
    }

# parameters for generating priors.
# minimum dimension of input image
min_dim = 300
# conv4_3 ==> 38 x 38
# fc7 ==> 19 x 19
# conv6_2 ==> 10 x 10
# conv7_2 ==> 5 x 5
# conv8_2 ==> 3 x 3
# pool6 ==> 1 x 1
mbox_source_layers = ['conv4_3', 'fc7', 'conv6_2', 'conv7_2', 'conv8_2', 'pool6']
# in percent %
min_ratio = 20
max_ratio = 95
step = int(math.floor((max_ratio - min_ratio) / (len(mbox_source_layers) - 2)))
min_sizes = []
max_sizes = []
for ratio in xrange(min_ratio, max_ratio + 1, step):
  min_sizes.append(min_dim * ratio / 100.)
  max_sizes.append(min_dim * (ratio + step) / 100.)
min_sizes = [min_dim * 10 / 100.] + min_sizes
max_sizes = [[]] + max_sizes
aspect_ratios = [[2], [2, 3], [2, 3], [2, 3], [2, 3], [2, 3]]
# L2 normalize conv4_3.
normalizations = [20, -1, -1, -1, -1, -1]
# variance used to encode/decode prior bboxes.
if code_type == P.PriorBox.CENTER_SIZE:
  prior_variance = [0.1, 0.1, 0.2, 0.2]
else:
  prior_variance = [0.1]
flip = True
clip = True

# Solver parameters.
# Defining which GPUs to use.
gpus = "0"
gpulist = gpus.split(",")
num_gpus = len(gpulist)

# Divide the mini-batch to different GPUs.
batch_size = 4
accum_batch_size = 32
iter_size = accum_batch_size / batch_size
solver_mode = P.Solver.CPU
device_id = 0
batch_size_per_device = batch_size
if num_gpus > 0:
  batch_size_per_device = int(math.ceil(float(batch_size) / num_gpus))
  iter_size = int(math.ceil(float(accum_batch_size) / (batch_size_per_device * num_gpus)))
  solver_mode = P.Solver.GPU
  device_id = int(gpulist[0])

if normalization_mode == P.Loss.NONE:
  base_lr /= batch_size_per_device
elif normalization_mode == P.Loss.VALID:
  base_lr *= 25. / loc_weight
elif normalization_mode == P.Loss.FULL:
  # Roughly there are 2000 prior bboxes per image.
  # TODO(weiliu89): Estimate the exact # of priors.
  base_lr *= 2000.

# Which layers to freeze (no backward) during training.
freeze_layers = ['conv1_1', 'conv1_2', 'conv2_1', 'conv2_2']

# Evaluate on whole test set.
num_test_image = 800
test_batch_size = 1
test_iter = num_test_image / test_batch_size

solver_param = {
    # Train parameters
    'base_lr': base_lr,
    'weight_decay': 0.0005,
    'lr_policy': "step",
    'stepsize': 40000,
    'gamma': 0.1,
    'momentum': 0.9,
    'iter_size': iter_size,
    'max_iter': 60000,
    'snapshot': 40000,
    'display': 10,
    'average_loss': 10,
    'type': "SGD",
    'solver_mode': solver_mode,
    'device_id': device_id,
    'debug_info': False,
    'snapshot_after_train': True,
    # Test parameters
    'test_iter': [test_iter],
    'test_interval': 10000,
    'eval_type': "detection",
    'ap_version': "11point",
    'test_initialization': False,
    }

# parameters for generating detection output.
det_out_param = {
    'num_classes': num_classes,
    'share_location': share_location,
    'background_label_id': background_label_id,
    'nms_param': {'nms_threshold': 0.45, 'top_k': 400},
    'save_output_param': {
        'output_directory': output_result_dir,
        'output_name_prefix': "comp4_det_test_",
        'output_format': "VOC",
        'label_map_file': label_map_file,
        'name_size_file': name_size_file,
        'num_test_image': num_test_image,
        },
    'keep_top_k': 200,
    'confidence_threshold': 0.01,
    'code_type': code_type,
    }

# parameters for evaluating detection results.
det_eval_param = {
    'num_classes': num_classes,
    'background_label_id': background_label_id,
    'overlap_threshold': 0.5,
    'evaluate_difficult_gt': False,
    'name_size_file': name_size_file,
    }

### Hopefully you don't need to change the following ###
# Check file.
check_if_exist(train_data)
check_if_exist(test_data)
check_if_exist(label_map_file)
check_if_exist(pretrain_model)
make_if_not_exist(save_dir)
make_if_not_exist(job_dir)
make_if_not_exist(snapshot_dir)

# Create train net.
net = caffe.NetSpec()
net.data, net.label = CreateAnnotatedDataLayer(train_data, batch_size=batch_size_per_device,
        train=True, output_label=True, label_map_file=label_map_file,
        transform_param=train_transform_param, batch_sampler=batch_sampler)

VGGNetBody(net, from_layer='data', fully_conv=True, reduced=True, dilated=True,
    dropout=False, freeze_layers=freeze_layers)

AddExtraLayers(net, use_batchnorm)

mbox_layers = CreateMultiBoxHead(net, data_layer='data', from_layers=mbox_source_layers,
        use_batchnorm=use_batchnorm, min_sizes=min_sizes, max_sizes=max_sizes,
        aspect_ratios=aspect_ratios, normalizations=normalizations,
        num_classes=num_classes, share_location=share_location, flip=flip, clip=clip,
        prior_variance=prior_variance, kernel_size=3, pad=1)

# Create the MultiBoxLossLayer.
name = "mbox_loss"
mbox_layers.append(net.label)
net[name] = L.MultiBoxLoss(*mbox_layers, multibox_loss_param=multibox_loss_param,
        loss_param=loss_param, include=dict(phase=caffe_pb2.Phase.Value('TRAIN')),
        propagate_down=[True, True, False, False])

with open(train_net_file, 'w') as f:
    print('name: "{}_train"'.format(model_name), file=f)
    print(net.to_proto(), file=f)
shutil.copy(train_net_file, job_dir)

# Create test net.
net = caffe.NetSpec()
net.data, net.label = CreateAnnotatedDataLayer(test_data, batch_size=test_batch_size,
        train=False, output_label=True, label_map_file=label_map_file,
        transform_param=test_transform_param)

VGGNetBody(net, from_layer='data', fully_conv=True, reduced=True, dilated=True,
    dropout=False, freeze_layers=freeze_layers)

AddExtraLayers(net, use_batchnorm)

mbox_layers = CreateMultiBoxHead(net, data_layer='data', from_layers=mbox_source_layers,
        use_batchnorm=use_batchnorm, min_sizes=min_sizes, max_sizes=max_sizes,
        aspect_ratios=aspect_ratios, normalizations=normalizations,
        num_classes=num_classes, share_location=share_location, flip=flip, clip=clip,
        prior_variance=prior_variance, kernel_size=3, pad=1)

conf_name = "mbox_conf"
if multibox_loss_param["conf_loss_type"] == P.MultiBoxLoss.SOFTMAX:
  reshape_name = "{}_reshape".format(conf_name)
  net[reshape_name] = L.Reshape(net[conf_name], shape=dict(dim=[0, -1, num_classes]))
  softmax_name = "{}_softmax".format(conf_name)
  net[softmax_name] = L.Softmax(net[reshape_name], axis=2)
  flatten_name = "{}_flatten".format(conf_name)
  net[flatten_name] = L.Flatten(net[softmax_name], axis=1)
  mbox_layers[1] = net[flatten_name]
elif multibox_loss_param["conf_loss_type"] == P.MultiBoxLoss.LOGISTIC:
  sigmoid_name = "{}_sigmoid".format(conf_name)
  net[sigmoid_name] = L.Sigmoid(net[conf_name])
  mbox_layers[1] = net[sigmoid_name]

net.detection_out = L.DetectionOutput(*mbox_layers,
    detection_output_param=det_out_param,
    include=dict(phase=caffe_pb2.Phase.Value('TEST')))
net.detection_eval = L.DetectionEvaluate(net.detection_out, net.label,
    detection_evaluate_param=det_eval_param,
    include=dict(phase=caffe_pb2.Phase.Value('TEST')))

with open(test_net_file, 'w') as f:
    print('name: "{}_test"'.format(model_name), file=f)
    print(net.to_proto(), file=f)
shutil.copy(test_net_file, job_dir)

# Create deploy net.
# Remove the first and last layer from test net.
deploy_net = net
with open(deploy_net_file, 'w') as f:
    net_param = deploy_net.to_proto()
    # Remove the first (AnnotatedData) and last (DetectionEvaluate) layer from test net.
    del net_param.layer[0]
    del net_param.layer[-1]
    net_param.name = '{}_deploy'.format(model_name)
    net_param.input.extend(['data'])
    net_param.input_shape.extend([
        caffe_pb2.BlobShape(dim=[1, 3, resize_height, resize_width])])
    print(net_param, file=f)
shutil.copy(deploy_net_file, job_dir)

# Create solver.
solver = caffe_pb2.SolverParameter(
        train_net=train_net_file,
        test_net=[test_net_file],
        snapshot_prefix=snapshot_prefix,
        **solver_param)

with open(solver_file, 'w') as f:
    print(solver, file=f)
shutil.copy(solver_file, job_dir)

max_iter = 0
# Find most recent snapshot.
for file in os.listdir(snapshot_dir):
  if file.endswith(".solverstate"):
    basename = os.path.splitext(file)[0]
    iter = int(basename.split("{}_iter_".format(model_name))[1])
    if iter > max_iter:
      max_iter = iter

train_src_param = '--weights="{}" \\\n'.format(pretrain_model)
if resume_training:
  if max_iter > 0:
    train_src_param = '--snapshot="{}_iter_{}.solverstate" \\\n'.format(snapshot_prefix, max_iter)

if remove_old_models:
  # Remove any snapshots smaller than max_iter.
  for file in os.listdir(snapshot_dir):
    if file.endswith(".solverstate"):
      basename = os.path.splitext(file)[0]
      iter = int(basename.split("{}_iter_".format(model_name))[1])
      if max_iter > iter:
        os.remove("{}/{}".format(snapshot_dir, file))
    if file.endswith(".caffemodel"):
      basename = os.path.splitext(file)[0]
      iter = int(basename.split("{}_iter_".format(model_name))[1])
      if max_iter > iter:
        os.remove("{}/{}".format(snapshot_dir, file))

# Create job file.
with open(job_file, 'w') as f:
  f.write('cd {}\n'.format(caffe_root))
  f.write('./build/tools/caffe train \\\n')
  f.write('--solver="{}" \\\n'.format(solver_file))
  f.write(train_src_param)
  if solver_param['solver_mode'] == P.Solver.GPU:
    f.write('--gpu {} 2>&1 | tee {}/{}.log\n'.format(gpus, job_dir, model_name))
  else:
    f.write('2>&1 | tee {}/{}.log\n'.format(job_dir, model_name))

# Copy the python script to job_dir.
py_file = os.path.abspath(__file__)
shutil.copy(py_file, job_dir)

# Run the job.
os.chmod(job_file, stat.S_IRWXU)
if run_soon:
  subprocess.call(job_file, shell=True)
Training command:
python examples/ssd/ssd_pascal_indoor.py
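The generated job script tees its output into the job directory, so with the settings above (job_name SSD_300x300, model_name VGG_VOC0712_SSD_300x300) the training log can be followed with:

tail -f jobs/VGGNet/VOC0712/SSD_300x300/VGG_VOC0712_SSD_300x300.log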

4 Testing

The SSD framework provides test code in both C++ and Python versions.

4.1 C++ version

After compiling SSD, the C++ executable is located at .build_release/examples/ssd/ssd_detect.bin. Test command:

./.build_release/examples/ssd/ssd_detect.bin models/VGGNet/indoor/deploy.prototxt models/VGGNet/indoor/VGG_VOC0712_SSD_300x300_iter_60000.caffemodel pictures.txt

where pictures.txt holds the list of images to test.
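For example, a hypothetical pictures.txt (one image path per line):

examples/images/room1.jpg
examples/images/room2.jpg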

4.2 Python version

For the Python test flow, see examples/detection.ipynb.
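As a complement to the notebook, here is a minimal single-image inference sketch using the Python interface (the model paths reuse section 4.1; the image path and the 0.5 confidence threshold are assumptions to adjust):

# Minimal SSD inference sketch; image path and threshold are examples.
import numpy as np
import caffe

caffe.set_mode_gpu()
net = caffe.Net('models/VGGNet/indoor/deploy.prototxt',
                'models/VGGNet/indoor/VGG_VOC0712_SSD_300x300_iter_60000.caffemodel',
                caffe.TEST)

# Preprocessing must match training: 300x300 warp, BGR, mean [104, 117, 123].
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2, 0, 1))             # HWC -> CHW
transformer.set_mean('data', np.array([104, 117, 123]))  # mean_value from section 3
transformer.set_raw_scale('data', 255)                   # [0,1] -> [0,255]
transformer.set_channel_swap('data', (2, 1, 0))          # RGB -> BGR

image = caffe.io.load_image('examples/images/room1.jpg')  # example image path
net.blobs['data'].reshape(1, 3, 300, 300)
net.blobs['data'].data[...] = transformer.preprocess('data', image)

# detection_out rows: [image_id, label, confidence, xmin, ymin, xmax, ymax],
# with the box coordinates normalized to [0, 1].
detections = net.forward()['detection_out'][0, 0]
h, w = image.shape[0], image.shape[1]
for det in detections:
    if det[2] >= 0.5:
        xmin, ymin, xmax, ymax = det[3:7] * np.array([w, h, w, h])
        print('label %d conf %.2f box (%d, %d, %d, %d)'
              % (int(det[1]), det[2], xmin, ymin, xmax, ymax))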

