Making an LMDB Dataset for Caffe Object Detection


Table of Contents

1. Raw Materials

2. Directory Structure

3. Generated Files

4. Scripts That Generate the Files

4.1 Generating test.txt and trainval.txt

4.2 Generating test_name_size.txt

4.3 Generating trainval_lmdb and test_lmdb

0.1 The caffe ssd GitHub page

0.2 Building the caffe object-detection LMDB from the Pascal VOC dataset


Making the lmdb database files used for caffe SSD object-detection training and testing.

1. Raw Materials

The data is split into a training/validation set (trainval) and a test set (test):

1. All image files (.jpg);

2. All label files, in Pascal VOC-style XML format: one XML file per image, with the label file sharing the image's base name.
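As a quick sanity check on the raw materials, a short script can verify that every image has a matching XML label (a hypothetical helper, not from the original post; the directory arguments are whatever trainval or test folders you are checking):

```python
import glob
import os

def check_pairs(img_dir, xml_dir):
    """Return image base names in img_dir that have no matching .xml in xml_dir."""
    img_names = {os.path.splitext(os.path.basename(p))[0]
                 for p in glob.glob(os.path.join(img_dir, '*.jpg'))}
    xml_names = {os.path.splitext(os.path.basename(p))[0]
                 for p in glob.glob(os.path.join(xml_dir, '*.xml'))}
    return sorted(img_names - xml_names)
```

An empty return value means every image is paired with a label file.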

 

2. Directory Structure

Under caffe's data directory, create a new folder named insulator_detect (this is also the name of the dataset).

Under insulator_detect, create three folders: Annotations, ImageSets, JPEGImages.

Annotations contains trainval and test subfolders: all training/validation label XML files go into trainval, and all test label XML files go into test. In addition, copies of all the trainval and test XML files are placed directly in Annotations itself (this makes the later scripts easier to write).

JPEGImages likewise contains trainval and test subfolders: all training/validation images go into trainval, and all test images go into test.

ImageSets holds some script files;

ImageSets also contains: test.txt, trainval.txt, test_name_size.txt, and labelmap_insulator.prototxt.
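The layout above can be created in one go; a minimal sketch, assuming a `CAFFE_ROOT` environment variable for the caffe checkout (the default path is an assumption, substitute your own):

```shell
#!/bin/sh
# Create the insulator_detect directory layout under caffe's data directory.
# CAFFE_ROOT is an assumed location; point it at your own caffe checkout.
CAFFE_ROOT="${CAFFE_ROOT:-$HOME/caffe}"
DATASET="$CAFFE_ROOT/data/insulator_detect"

mkdir -p "$DATASET/Annotations/trainval" "$DATASET/Annotations/test" \
         "$DATASET/JPEGImages/trainval" "$DATASET/JPEGImages/test" \
         "$DATASET/ImageSets"
```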

 

3. Generated Files

The labelmap_insulator.prototxt file is as follows:

item{
	name:"none_of_the_above"
	label:0
	display_name:"background"
}
item{
	name:"insulator1"
	label:1
	display_name:"insulator1"
}
item{
	name:"insulator2"
	label:2
	display_name:"insulator2"
}

Label 0 is the background class.

In this example there are two target classes: insulator1 and insulator2.

 

Now three files need to be generated by script: test.txt, trainval.txt, and test_name_size.txt.

A few example lines of test.txt:

insulator_detect/JPEGImages/022656248_K1052896_1155_1_22.jpg insulator_detect/Annotations/022656248_K1052896_1155_1_22.xml
insulator_detect/JPEGImages/022421887_K1050044_1003_1_09.jpg insulator_detect/Annotations/022421887_K1050044_1003_1_09.xml
insulator_detect/JPEGImages/022638387_K1052556_1141_1_22.jpg insulator_detect/Annotations/022638387_K1052556_1141_1_22.xml

Each line of test.txt is one test image's relative path, a single space, and the relative path of that image's label XML file.

Each line of trainval.txt likewise is one training/validation image's relative path, a single space, and the relative path of its label XML file.

How to divide the images between the two sets is up to you; for example, a ratio of trainval.txt : test.txt = 8 : 2.

The exact paths stored do not matter much in themselves; what matters is that they match the paths used in the scripts below.
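The 8:2 split itself can be sketched in a few lines (a hypothetical helper, not part of the original scripts):

```python
import random

def split_trainval_test(names, trainval_ratio=0.8, seed=0):
    """Shuffle image base names and split them ~8:2 into (trainval, test)."""
    names = list(names)
    random.Random(seed).shuffle(names)   # fixed seed keeps the split reproducible
    cut = int(len(names) * trainval_ratio)
    return names[:cut], names[cut:]
```

The returned name lists can then be used to move or copy files into the trainval and test subfolders.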

 

test_name_size.txt matches test.txt line for line: each of its lines corresponds to the same line of test.txt.

test_name_size.txt looks like this:

023016690_K1056793_1327_1_09 4400 6600
022938850_K1056079_1295_1_09 4400 6600
022906708_K1055361_1257_1_22 4400 6600

Each line of test_name_size.txt is one test image's name (without extension), the image height, and the image width.
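A small parser makes the layout concrete (`parse_name_size` is a hypothetical helper):

```python
def parse_name_size(line):
    """Parse one test_name_size.txt line into (name, height, width)."""
    name, height, width = line.split()
    return name, int(height), int(width)
```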

 

Correspondence between image files and label XML files: each image pairs with the XML label file that shares its base name.

XML file format, e.g. 012500997_K974833_1_1_05.xml:

<annotation>
	<folder>insulator1</folder>
	<filename>012500997_K974833_1_1_05.jpg</filename>
	<path>D:\CaiShilv_label\insulator1\012500997_K974833_1_1_05.jpg</path>
	<source>
		<database>Unknown</database>
	</source>
	<size>
		<width>6600</width>
		<height>4400</height>
		<depth>1</depth>
	</size>
	<segmented>0</segmented>
	<object>
		<name>insulator2</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox>
			<xmin>1244</xmin>
			<ymin>2784</ymin>
			<xmax>1670</xmax>
			<ymax>3621</ymax>
		</bndbox>
	</object>
	<object>
		<name>insulator1</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox>
			<xmin>3260</xmin>
			<ymin>1537</ymin>
			<xmax>4091</xmax>
			<ymax>2173</ymax>
		</bndbox>
	</object>
	<object>
		<name>insulator1</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox>
			<xmin>4265</xmin>
			<ymin>2594</ymin>
			<xmax>4918</xmax>
			<ymax>2968</ymax>
		</bndbox>
	</object>
</annotation>
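The fields that matter for detection can be pulled out of such an XML file with the standard library (a minimal sketch; `parse_voc_xml` is a hypothetical helper, not part of the pipeline above):

```python
import xml.etree.ElementTree as ET

def parse_voc_xml(xml_path):
    """Extract image size and object boxes from a Pascal VOC-style XML file."""
    root = ET.parse(xml_path).getroot()
    size = root.find('size')
    width = int(size.find('width').text)
    height = int(size.find('height').text)
    objects = []
    for obj in root.findall('object'):
        box = obj.find('bndbox')
        objects.append({
            'name': obj.find('name').text,
            'bbox': tuple(int(box.find(k).text)
                          for k in ('xmin', 'ymin', 'xmax', 'ymax')),
        })
    return width, height, objects
```

For the sample file above this yields a 6600x4400 image with three boxes, one insulator2 and two insulator1.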

 

 

4. Scripts That Generate the Files

4.1 Generating test.txt and trainval.txt

Use the following Python script to generate test.txt and trainval.txt.

generate_trainval_text_txt.py:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Tue Jan  8 10:43:19 2019

@author: yang
"""

# Each line of trainval.txt and test.txt is one image's file path followed by
# the path of that image's XML label file.

import os
import glob

# Paths to the training/validation and test image sets
trainval_dir = "/home/yang/caffe_ssd/caffe/data/insulator_detect/JPEGImages/trainval"
test_dir = "/home/yang/caffe_ssd/caffe/data/insulator_detect/JPEGImages/test"

trainval_img_lists = glob.glob(trainval_dir + '/*.jpg')    # all .jpg files in trainval
trainval_img_names = []                                    # collect base names
for item in trainval_img_lists:
    temp1, temp2 = os.path.splitext(os.path.basename(item))
    trainval_img_names.append(temp1)

test_img_lists = glob.glob(test_dir + '/*.jpg')            # all .jpg files in test
test_img_names = []
for item in test_img_lists:
    temp1, temp2 = os.path.splitext(os.path.basename(item))
    test_img_names.append(temp1)

# Image and XML paths as they are written into the txt files. Besides the
# trainval and test subfolders, JPEGImages also holds copies of all the images
# directly, so the path only needs to go down to JPEGImages (and likewise
# Annotations holds copies of all the XML files). These paths are relative to
# caffe's data directory, the root passed to create_annoset.py later.
dist_img_dir = "insulator_detect/JPEGImages"
dist_anno_dir = "insulator_detect/Annotations"

# Where the output files are stored, and under what names
trainval_fd = open("/home/yang/caffe_ssd/caffe/data/insulator_detect/ImageSets/trainval.txt", 'w')
test_fd = open("/home/yang/caffe_ssd/caffe/data/insulator_detect/ImageSets/test.txt", 'w')

for item in trainval_img_names:
    trainval_fd.write(dist_img_dir + '/' + str(item) + '.jpg' + ' ' + dist_anno_dir + '/' + str(item) + '.xml\n')

for item in test_img_names:
    test_fd.write(dist_img_dir + '/' + str(item) + '.jpg' + ' ' + dist_anno_dir + '/' + str(item) + '.xml\n')

trainval_fd.close()
test_fd.close()

4.2 Generating test_name_size.txt

Use the following Python script to generate test_name_size.txt.

generate_test_size.py:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Tue Jan  8 16:01:17 2019

@author: yang
"""

import os
import glob
from PIL import Image  # for reading image sizes

# Path to the test images
img_dir = "/home/yang/caffe_ssd/caffe/data/insulator_detect/JPEGImages/test"

# Collect the names of all .jpg files under the given path
img_lists = glob.glob(img_dir + '/*.jpg')

# Create the output file in the ImageSets directory
test_name_size = open('/home/yang/caffe_ssd/caffe/data/insulator_detect/ImageSets/test_name_size.txt', 'w')

for item in img_lists:
    img = Image.open(item)
    width, height = img.size
    # os.path.basename() returns the last component of a path;
    # os.path.splitext() splits a file name into a (name, extension) tuple.
    temp1, temp2 = os.path.splitext(os.path.basename(item))
    test_name_size.write(temp1 + ' ' + str(height) + ' ' + str(width) + '\n')

test_name_size.close()

At this point test.txt, trainval.txt, and test_name_size.txt have been generated; together with the hand-written labelmap_insulator.prototxt, all four files are ready.
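Before building the LMDB, it can be worth checking that every line of the list files points at files that actually exist under the data root, similar to the check create_annoset.py performs on the first line (`check_list_file` is a hypothetical helper):

```python
import os

def check_list_file(list_file, root_dir):
    """Return the relative paths from list_file that are missing under root_dir."""
    missing = []
    with open(list_file) as lf:
        for line in lf:
            if not line.strip():
                continue
            img_file, anno_file = line.strip().split(' ')
            for rel in (img_file, anno_file):
                if not os.path.exists(os.path.join(root_dir, rel)):
                    missing.append(rel)
    return missing
```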

4.3 Generating trainval_lmdb and test_lmdb

Next, use these four files to generate the corresponding lmdb databases, trainval_lmdb and test_lmdb.

Use this script, create_data.sh:

#!/bin/bash

cur_dir=$(cd $( dirname ${BASH_SOURCE[0]} ) && pwd )
root_dir='/home/yang/caffe_ssd/caffe'   # in the original VOC0712 layout this script sits in <caffe_root>/data/VOC0712, so two levels up is the caffe root

cd $root_dir

redo=1

data_root_dir="/home/yang/caffe_ssd/caffe/data"

txtFileDir="/home/yang/caffe_ssd/caffe/data/insulator_detect/ImageSets"

lmdbFile='/home/yang/caffe_ssd/caffe/data/insulator_detect/lmdb'

lmdbLinkDir='/home/yang/caffe_ssd/caffe/data/insulator_detect/lmdbLinkDir'

dataset_name="insulator_detect"
mapfile="/home/yang/caffe_ssd/caffe/data/insulator_detect/ImageSets/labelmap_insulator.prototxt" # this file defines background label 0 and the object classes; point this directly at your labelmap.prototxt
anno_type="detection"
db="lmdb"
min_dim=0
max_dim=0
width=0
height=0

extra_cmd="--encode-type=jpg --encoded"
if [ $redo ]
then
  extra_cmd="$extra_cmd --redo"
fi
for subset in test trainval
do   # adjust the paths below to your own setup
  python $root_dir/scripts/create_annoset.py --anno-type=$anno_type --label-map-file=$mapfile --min-dim=$min_dim --max-dim=$max_dim --resize-width=$width --resize-height=$height --check-label $extra_cmd $data_root_dir $txtFileDir/$subset.txt $lmdbFile/$subset"_"$db $lmdbLinkDir
done

 

A note on a mistake in an earlier version of this post:

Pay special attention to these lines:

width=0
height=0

I had originally set them to the network's input size:

width=300
height=300

and the resulting lmdb files came out wrong (they were very small, out of proportion to the total size of the images), so the conclusion at the time was to use:

width=0
height=0

Correction:

The

width=0
height=0

lines in create_data.sh can in fact be set to the input image size used for the final network training.

For example, SSD trains on 300*300 input images, so the images can all be resized to 300*300 while building the dataset.

 

So the final create_data.sh is as follows:

#!/bin/bash

cur_dir=$(cd $( dirname ${BASH_SOURCE[0]} ) && pwd )
root_dir='/home/yang/caffe_ssd/caffe'   # in the original VOC0712 layout this script sits in <caffe_root>/data/VOC0712, so two levels up is the caffe root

cd $root_dir

redo=1

data_root_dir="/home/yang/caffe_ssd/caffe/data"

txtFileDir="/home/yang/caffe_ssd/caffe/data/insulator_detect/ImageSets"

lmdbFile='/home/yang/caffe_ssd/caffe/data/insulator_detect/lmdb'

lmdbLinkDir='/home/yang/caffe_ssd/caffe/data/insulator_detect/lmdbLinkDir'

dataset_name="insulator_detect"
mapfile="/home/yang/caffe_ssd/caffe/data/insulator_detect/ImageSets/labelmap_insulator.prototxt" # this file defines background label 0 and the object classes; point this directly at your labelmap.prototxt
anno_type="detection"
db="lmdb"
min_dim=0
max_dim=0
width=300
height=300

extra_cmd="--encode-type=jpg --encoded"
if [ $redo ]
then
  extra_cmd="$extra_cmd --redo"
fi
for subset in test trainval
do   # adjust the paths below to your own setup
  python $root_dir/scripts/create_annoset.py --anno-type=$anno_type --label-map-file=$mapfile --min-dim=$min_dim --max-dim=$max_dim --resize-width=$width --resize-height=$height --check-label $extra_cmd $data_root_dir $txtFileDir/$subset.txt $lmdbFile/$subset"_"$db $lmdbLinkDir
done

After resizing to 300*300, if the original images are large, the resulting lmdb files will be smaller than the combined size of the originals;

this is also thanks to the image encoding scheme: extra_cmd="--encode-type=jpg --encoded"

A basic back-of-the-envelope calculation:

A 1024*1024 image with 3 channels, loaded into memory as a raw matrix, occupies:

1024*1024*3 B = 3 MB;

after JPEG encoding, with a compression ratio of up to about 20x, the encoded image takes only roughly 3 MB / 20 ≈ 153 KB, which is much smaller.
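The arithmetic above can be written out directly (the 20x ratio is the rough JPEG compression figure assumed in the text):

```python
# Raw size of a 1024*1024, 3-channel, 8-bit image loaded as a matrix
raw_bytes = 1024 * 1024 * 3           # = 3 MB raw

# Assuming a JPEG compression ratio of roughly 20x
compressed_kb = raw_bytes / 20 / 1024

print(raw_bytes // (1024 * 1024), "MB raw")   # 3 MB raw
print(round(compressed_kb, 1), "KB encoded")  # ~153.6 KB encoded
```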

 

 

The shell script above mainly calls the ssd-caffe helper script: python $root_dir/scripts/create_annoset.py

create_annoset.py is as follows.

The front of the script is just argument parsing, so the options in the shell script above only need to line up one to one with the arguments here:

import argparse
import os
import shutil
import subprocess
import sys

from caffe.proto import caffe_pb2
from google.protobuf import text_format

if __name__ == "__main__":
  parser = argparse.ArgumentParser(description="Create AnnotatedDatum database")
  parser.add_argument("root",
      help="The root directory which contains the images and annotations.")
  parser.add_argument("listfile",
      help="The file which contains image paths and annotation info.")
  parser.add_argument("outdir",
      help="The output directory which stores the database file.")
  parser.add_argument("exampledir",
      help="The directory to store the link of the database files.")
  parser.add_argument("--redo", default = False, action = "store_true",
      help="Recreate the database.")
  parser.add_argument("--anno-type", default = "classification",
      help="The type of annotation {classification, detection}.")
  parser.add_argument("--label-type", default = "xml",
      help="The type of label file format for detection {xml, json, txt}.")
  parser.add_argument("--backend", default = "lmdb",
      help="The backend {lmdb, leveldb} for storing the result")
  parser.add_argument("--check-size", default = False, action = "store_true",
      help="Check that all the datum have the same size.")
  parser.add_argument("--encode-type", default = "",
      help="What type should we encode the image as ('png','jpg',...).")
  parser.add_argument("--encoded", default = False, action = "store_true",
      help="The encoded image will be save in datum.")
  parser.add_argument("--gray", default = False, action = "store_true",
      help="Treat images as grayscale ones.")
  parser.add_argument("--label-map-file", default = "",
      help="A file with LabelMap protobuf message.")
  parser.add_argument("--min-dim", default = 0, type = int,
      help="Minimum dimension images are resized to.")
  parser.add_argument("--max-dim", default = 0, type = int,
      help="Maximum dimension images are resized to.")
  parser.add_argument("--resize-height", default = 0, type = int,
      help="Height images are resized to.")
  parser.add_argument("--resize-width", default = 0, type = int,
      help="Width images are resized to.")
  parser.add_argument("--shuffle", default = False, action = "store_true",
      help="Randomly shuffle the order of images and their labels.")
  parser.add_argument("--check-label", default = False, action = "store_true",
      help="Check that there is no duplicated name/label.")

  args = parser.parse_args()
  root_dir = args.root
  list_file = args.listfile
  out_dir = args.outdir
  example_dir = args.exampledir

  redo = args.redo
  anno_type = args.anno_type
  label_type = args.label_type
  backend = args.backend
  check_size = args.check_size
  encode_type = args.encode_type
  encoded = args.encoded
  gray = args.gray
  label_map_file = args.label_map_file
  min_dim = args.min_dim
  max_dim = args.max_dim
  resize_height = args.resize_height
  resize_width = args.resize_width
  shuffle = args.shuffle
  check_label = args.check_label

  # check if root directory exists
  if not os.path.exists(root_dir):
    print("root directory: {} does not exist".format(root_dir))
    sys.exit()
  # add "/" to root directory if needed
  if root_dir[-1] != "/":
    root_dir += "/"
  # check if list file exists
  if not os.path.exists(list_file):
    print("list file: {} does not exist".format(list_file))
    sys.exit()
  # check list file format is correct
  with open(list_file, "r") as lf:
    for line in lf.readlines():
      img_file, anno = line.strip("\n").split(" ")
      if not os.path.exists(root_dir + img_file):
        print("image file: {} does not exist".format(root_dir + img_file))
      if anno_type == "classification":
        if not anno.isdigit():
          print("annotation: {} is not an integer".format(anno))
      elif anno_type == "detection":
        if not os.path.exists(root_dir + anno):
          print("annotation file: {} does not exist".format(root_dir + anno))
          sys.exit()
      break
  # check if label map file exist
  if anno_type == "detection":
    if not os.path.exists(label_map_file):
      print("label map file: {} does not exist".format(label_map_file))
      sys.exit()
    label_map = caffe_pb2.LabelMap()
    lmf = open(label_map_file, "r")
    try:
      text_format.Merge(str(lmf.read()), label_map)
    except:
      print("Cannot parse label map file: {}".format(label_map_file))
      sys.exit()
  out_parent_dir = os.path.dirname(out_dir)
  if not os.path.exists(out_parent_dir):
    os.makedirs(out_parent_dir)
  if os.path.exists(out_dir) and not redo:
    print("{} already exists and I do not hear redo".format(out_dir))
    sys.exit()
  if os.path.exists(out_dir):
    shutil.rmtree(out_dir)

  # get caffe root directory
  caffe_root = os.path.dirname(os.path.dirname(os.path.realpath(__file__)))

  if anno_type == "detection":
    cmd = "{}/build/tools/convert_annoset" \
        " --anno_type={}" \
        " --label_type={}" \
        " --label_map_file={}" \
        " --check_label={}" \
        " --min_dim={}" \
        " --max_dim={}" \
        " --resize_height={}" \
        " --resize_width={}" \
        " --backend={}" \
        " --shuffle={}" \
        " --check_size={}" \
        " --encode_type={}" \
        " --encoded={}" \
        " --gray={}" \
        " {} {} {}" \
        .format(caffe_root, anno_type, label_type, label_map_file, check_label,
            min_dim, max_dim, resize_height, resize_width, backend, shuffle,
            check_size, encode_type, encoded, gray, root_dir, list_file, out_dir)
  elif anno_type == "classification":
    cmd = "{}/build/tools/convert_annoset" \
        " --anno_type={}" \
        " --min_dim={}" \
        " --max_dim={}" \
        " --resize_height={}" \
        " --resize_width={}" \
        " --backend={}" \
        " --shuffle={}" \
        " --check_size={}" \
        " --encode_type={}" \
        " --encoded={}" \
        " --gray={}" \
        " {} {} {}" \
        .format(caffe_root, anno_type, min_dim, max_dim, resize_height,
            resize_width, backend, shuffle, check_size, encode_type, encoded,
            gray, root_dir, list_file, out_dir)
  print(cmd)
  process = subprocess.Popen(cmd.split(), stdout=subprocess.PIPE)
  output = process.communicate()[0]

  if not os.path.exists(example_dir):
    os.makedirs(example_dir)
  link_dir = os.path.join(example_dir, os.path.basename(out_dir))
  if os.path.exists(link_dir):
    os.unlink(link_dir)
  os.symlink(out_dir, link_dir)

 

0.1 The caffe ssd GitHub page

The caffe ssd GitHub page: https://github.com/weiliu89/caffe/tree/ssd

 

0.2 Building the caffe object-detection LMDB from the Pascal VOC dataset

Download the VOC2007 and VOC2012 datasets. By default, the data is assumed to be stored in $HOME/data/.

# Download the data.
cd $HOME/data
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
# Extract the data.
tar -xvf VOCtrainval_11-May-2012.tar
tar -xvf VOCtrainval_06-Nov-2007.tar
tar -xvf VOCtest_06-Nov-2007.tar

Download and extract the archives, then prepare the LMDB database.

Create the LMDB files:

cd $CAFFE_ROOT
# Create the trainval.txt, test.txt, and test_name_size.txt in data/VOC0712/
./data/VOC0712/create_list.sh
# You can modify the parameters in create_data.sh if needed.
# It will create lmdb files for trainval and test with encoded original image:
#   - $HOME/data/VOCdevkit/VOC0712/lmdb/VOC0712_trainval_lmdb
#   - $HOME/data/VOCdevkit/VOC0712/lmdb/VOC0712_test_lmdb
# and make soft links at examples/VOC0712/
./data/VOC0712/create_data.sh

 

Object-detection datasets are generally converted by following the Pascal VOC format.
