Caffe-YOLOv3-Windows （一）VOC2007转化为lmdb

qq_46261928

于 2022-08-19 16:11:30 发布

阅读量389

点赞数

文章标签： caffe 深度学习人工智能

本文链接：https://blog.csdn.net/qq_46261928/article/details/126423788

版权

模型网址：GitHub - eric612/Caffe-YOLOv3-Windows: A windows caffe implementation of YOLO detection networkj

数据集制作：

先考虑使用的是官方VOC2007数据集，需要把数据集转化为适用于Caffe的lmdb格式。

我们需要的VOC数据集中的文件夹包含：

其中，一个文件夹是原始图片，另一个文件夹是原始图片对应的包含标注信息的xml文件

接下来进入数据集的制作步骤，准备py文件：

import os
import random

trainval_percent = 0.8
train_percent = 0.8
xmlfilepath = '.VOC2007/Annotations'
txtsavepath = '.VOC2007/ImageSets/Main'
total_xml = os.listdir(xmlfilepath)

num = len(total_xml)
list = range(num)
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = random.sample(list, tv)
train = random.sample(trainval, tr)

ftrainval = open('./VOC2007/ImageSets/Main/trainval.txt', 'w')
ftest = open('./VOC2007/ImageSets/Main/test.txt', 'w')
ftrain = open('./VOCdevkit/VOC2007/ImageSets/Main/train.txt', 'w')
fval = open('./VOC2007/ImageSets/Main/val.txt', 'w')

for i in list:
    name = total_xml[i][:-4] + '\n'
    if i in trainval:
        ftrainval.write(name)
        if i in train:
            ftrain.write(name)
        else:
            fval.write(name)
    else:
        ftest.write(name)

ftrainval.close()
ftrain.close()
fval.close()
ftest.close()

使用时，把路径改成自己的，运行后在相应文件夹生成：

txt文件中是原图不带后缀的名称：

在这个文件路径下创建create_list.sh文件：

#!/bin/bash

bash_dir="$(pwd)"
root_dir=.改路径/
echo $root_dir
sub_dir=ImageSets/Main
for dataset in trainval test
do
  dst_file=$bash_dir/$dataset.txt
  if [ -f $dst_file ]
  then
    rm -f $dst_file
  fi
  for name in VOC2007
  do
    echo "Create list for $name $dataset..."
    dataset_file=$root_dir/$name/$sub_dir/$dataset.txt

    img_file=$bash_dir/$dataset"_img.txt"
    cp $dataset_file $img_file
    sed -i "s/^/$name\/JPEGImages\//g" $img_file
    sed -i "s/$/.jpg/g" $img_file

    label_file=$bash_dir/$dataset"_label.txt"
    cp $dataset_file $label_file
    sed -i "s/^/$name\/Annotations\//g" $label_file
    sed -i "s/$/.xml/g" $label_file

    paste -d' ' $img_file $label_file >> $dst_file

    rm -f $label_file
    rm -f $img_file
  done

  # Generate image name and size infomation.
  if [ $dataset == "test" ]
  then
    改相应的caffe下路径/scripts/build/tools/Release/get_image_size $root_dir $dst_file $bash_dir/$dataset"_name_size.txt"
  fi

  # Shuffle trainval file.
  if [ $dataset == "trainval" ]
  then
    rand_file=$dst_file.random
    cat $dst_file | perl -MList::Util=shuffle -e 'print shuffle(<STDIN>);' > $rand_file
    mv $rand_file $dst_file
  fi
done

运行该create_list.sh文件，在文件夹中生成

其中，labelmap_voc.prototxt文件中对应的是VOC数据集类别，网上有，不赘述

txt文件中包含原图路径及对应的xml文件

然后，在VOC2007文件夹的上一级，生成两个.sh文件，分别是train_convert_lmdb.sh和val_convert_lmdb.sh

内容如下：

caffe路径/tools/Release/convert_annoset.exe --anno_type=detection --encode_type=jpg --encoded=true  --shuffle=true --label_map_file=改路径/labelmap_voc.prototxt VOC2007路径的上一级/ 改路径/VOC2007/trainval.txt train_voc_lmdb

caffe路径/tools/Release/convert_annoset.exe --anno_type=detection --encode_type=jpg --encoded=true  --shuffle=true --label_map_file=改路径/labelmap_voc.prototxt VOC2007路径的上一级/ 改路径/VOC2007/test.txt val_voc_lmdb

运行这两个脚本文件，生成：