Caffe: assorted bugs when creating your own dataset

Latest recommended article published on 2021-02-11 02:49:44

Author: 萌面女xia

Categories: caffe学习笔记 (Caffe study notes), 深度学习 (deep learning). Tags: data, cnn, linux

Article link: https://blog.csdn.net/dcxhun3/article/details/51966921


So far, these bugs mainly show up when create_imagenet.sh (taken from examples/imagenet) generates the LMDB data.

bug 1 mkdir *_val_lmdb failed

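This usually happens because the LMDB directory already exists at the target path from a previous run, which causes a conflict. At first I deleted it by hand every time. Eventually I realized how silly that was: the deletion can simply be added to create_imagenet.sh: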

rm -rf $EXAMPLE/mytask_train_lmdb
rm -rf $EXAMPLE/mytask_val_lmdb
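
If the conversion is driven from Python rather than from the shell script, the same cleanup can be sketched as follows. The helper name `remove_stale_lmdb` and the throwaway directory are assumptions for illustration only; they are not part of Caffe or of the original script.

```python
import os
import shutil
import tempfile


def remove_stale_lmdb(db_path):
    """Delete db_path if it is an existing directory; return True if removed."""
    if os.path.isdir(db_path):
        shutil.rmtree(db_path)
        return True
    return False


# Demo on a scratch directory that stands in for mytask_train_lmdb.
scratch = tempfile.mkdtemp()
fake_db = os.path.join(scratch, 'mytask_train_lmdb')
os.mkdir(fake_db)
print(remove_stale_lmdb(fake_db))  # True
print(remove_stale_lmdb(fake_db))  # False
shutil.rmtree(scratch)
```

Like `rm -rf`, this deletes without asking, so it is only appropriate when the old database is disposable.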

bug 2 could not open or find file (images not found at the specified path)

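The first case: I generated the txt label file from the Windows cmd prompt, so the paths in it used backslashes, which I had not noticed. The simplest fix is to open the txt file and replace every backslash with a forward slash. Alternatively, run make_list.py on Linux and the problem never appears.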
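
For a large list file, the backslash replacement can be scripted instead of done in an editor. This is a minimal sketch; `to_posix_path` and `fix_list_file` are hypothetical helper names, and it assumes no backslash in the file is meant literally.

```python
def to_posix_path(line):
    """Replace Windows backslashes with forward slashes in one list line."""
    return line.replace('\\', '/')


def fix_list_file(src, dst):
    """Copy the list file src to dst, converting every path separator."""
    with open(src) as fin, open(dst, 'w') as fout:
        for line in fin:
            fout.write(to_posix_path(line))


print(to_posix_path('cat\\001.bmp 0'))  # cat/001.bmp 0
```

Writing to a separate output file keeps the original list around in case something goes wrong.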

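The second case puzzled me for a long time: the paths were clearly correct, so why could the files not be found? The culprit turned out to be the \t escape in the Python script. The separator between the image name and the label was written as a tab (\t). Writing a plain space ' ' instead fixed it: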

        # fout.write('%s\t%d\n' % (image_list[i][0], image_list[i][1]))
        fout.write('%s %d\n' % (image_list[i][0], image_list[i][1]))  # space, not \t
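
An existing list file can also be repaired and checked before running the conversion. The sketch below assumes each line is a relative path, some whitespace, and an integer label; `normalize_separator` and `missing_images` are hypothetical helpers, not Caffe functions.

```python
import os


def normalize_separator(line):
    """Rewrite 'path<whitespace>label' so one space precedes the label."""
    path, label = line.rstrip('\r\n').rsplit(None, 1)
    return '%s %d\n' % (path, int(label))


def missing_images(root, lines):
    """Return the listed relative paths that do not exist under root."""
    missing = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        path = line.rstrip('\r\n').rsplit(None, 1)[0]
        if not os.path.isfile(os.path.join(root, path)):
            missing.append(path)
    return missing


print(repr(normalize_separator('cat/001.bmp\t3\n')))  # 'cat/001.bmp 3\n'
```

Running `missing_images` over the list first shows whether a "could not open or find file" error comes from the paths themselves or from the separator.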

With everything correct, LMDB generation starts. The dataset is fairly large (378,430 images), so it takes a while.

Code 1

make_list.py

import argparse
import os
import random

def list_image(root, recursive, exts):
    """Collect (relative_path, label) pairs for the images under root."""
    image_list = []
    if recursive:
        cat = {}  # maps each directory to an integer label, in walk order
        for path, subdirs, files in os.walk(root, True):
            print(path)
            for fname in files:
                fpath = os.path.join(path, fname)
                suffix = os.path.splitext(fname)[1].lower()
                if os.path.isfile(fpath) and (suffix in exts):
                    if path not in cat:
                        cat[path] = len(cat)
                    image_list.append((os.path.relpath(fpath, root), cat[path]))
    else:
        # Non-recursive mode: only images directly in root, all with label 0.
        for fname in os.listdir(root):
            fpath = os.path.join(root, fname)
            suffix = os.path.splitext(fname)[1].lower()
            if os.path.isfile(fpath) and (suffix in exts):
                image_list.append((os.path.relpath(fpath, root), 0))
    return image_list

def write_list(path_out, image_list):
    """Write one 'relative_path label' line per image."""
    with open(path_out, 'w') as fout:
        for fpath, label in image_list:
            # A single space, not \t, separates the path from the label.
            fout.write('%s %d\n' % (fpath, label))

def make_list(prefix_out, root, recursive, exts, num_chunks, train_ratio):
    """Shuffle the images, split them into chunks, and write the list files."""
    image_list = list_image(root, recursive, exts)
    random.shuffle(image_list)
    N = len(image_list)
    chunk_size = (N + num_chunks - 1) // num_chunks  # ceiling division
    for i in range(num_chunks):
        chunk = image_list[i*chunk_size:(i+1)*chunk_size]
        if num_chunks > 1:
            str_chunk = '_%d' % i
        else:
            str_chunk = ''
        if train_ratio < 1:
            # The first part of each chunk is for training, the rest for validation.
            sep = int(chunk_size*train_ratio)
            write_list(prefix_out+str_chunk+'_train.txt', chunk[:sep])
            write_list(prefix_out+str_chunk+'_val.txt', chunk[sep:])
        else:
            write_list(prefix_out+str_chunk+'.txt', chunk)

def str2bool(value):
    """Parse a true/false flag; type=bool would treat any non-empty string as True."""
    return value.lower() in ('1', 'true', 'yes', 'y')

def main():
    parser = argparse.ArgumentParser(
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
        description='Make image list files that are required by im2rec')
    parser.add_argument('root', help='path to folder that contain images.')
    parser.add_argument('prefix', help='prefix of output list files.')
    # nargs='+' takes extensions as separate words, e.g. --exts .bmp .jpg
    # (type=list would split a single string into characters).
    parser.add_argument('--exts', nargs='+', default=['.bmp'],
        help='list of acceptable image extensions.')
    parser.add_argument('--chunks', type=int, default=1, help='number of chunks.')
    parser.add_argument('--train_ratio', type=float, default=1.0,
        help='Fraction of images to use for training.')
    parser.add_argument('--recursive', type=str2bool, default=True,
        help='If true recursively walk through subdirs and assign an unique label '
             'to images in each folder. Otherwise only include images in the root '
             'folder and give them label 0.')
    args = parser.parse_args()

    make_list(args.prefix, args.root, args.recursive,
        args.exts, args.chunks, args.train_ratio)

if __name__ == '__main__':
    main()
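
To see how `--chunks` and `--train_ratio` interact, the split arithmetic from make_list can be isolated. This is a sketch on plain integers instead of image tuples; `split_chunks` is a hypothetical name, and the shuffle is left out so the result is deterministic.

```python
def split_chunks(items, num_chunks, train_ratio):
    """Ceil-divide items into chunks, then cut each chunk into (train, val)."""
    chunk_size = (len(items) + num_chunks - 1) // num_chunks
    splits = []
    for i in range(num_chunks):
        chunk = items[i * chunk_size:(i + 1) * chunk_size]
        sep = int(chunk_size * train_ratio)  # cut point, as in make_list
        splits.append((chunk[:sep], chunk[sep:]))
    return splits


train, val = split_chunks(list(range(10)), 1, 0.8)[0]
print(len(train), len(val))  # 8 2
```

Because the cut point comes from `chunk_size` rather than from the actual chunk length, a shorter final chunk gets a smaller validation share than the others when the image count is not divisible by the number of chunks.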

Code 2

create_imagenet.sh

#!/usr/bin/env sh
# Create the imagenet lmdb inputs
# N.B. set the path to the imagenet train + val data dirs
EXAMPLE=examples/mytask
DATA=/mnt/hgfs/caffe
TOOLS=build/tools
TRAIN_DATA_ROOT=/mnt/hgfs/caffe/train/
VAL_DATA_ROOT=/mnt/hgfs/caffe/val/
# Set RESIZE=true to resize the images to 256x256. Leave as false if images have
# already been resized using another tool.
RESIZE=true
if $RESIZE; then
  RESIZE_HEIGHT=256
  RESIZE_WIDTH=256
else
  RESIZE_HEIGHT=0
  RESIZE_WIDTH=0
fi
if [ ! -d "$TRAIN_DATA_ROOT" ]; then
  echo "Error: TRAIN_DATA_ROOT is not a path to a directory: $TRAIN_DATA_ROOT"
  echo "Set the TRAIN_DATA_ROOT variable in create_imagenet.sh to the path" \
       "where the ImageNet training data is stored."
  exit 1
fi
if [ ! -d "$VAL_DATA_ROOT" ]; then
  echo "Error: VAL_DATA_ROOT is not a path to a directory: $VAL_DATA_ROOT"
  echo "Set the VAL_DATA_ROOT variable in create_imagenet.sh to the path" \
       "where the ImageNet validation data is stored."
  exit 1
fi
echo "Creating train lmdb..."
rm -rf $EXAMPLE/mytask_train_lmdb
rm -rf $EXAMPLE/mytask_val_lmdb
GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $TRAIN_DATA_ROOT \
    $DATA/train.txt \
    $EXAMPLE/mytask_train_lmdb
echo "Train lmdb done!"
echo "Creating val lmdb..."
GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $VAL_DATA_ROOT \
    $DATA/val.txt \
    $EXAMPLE/mytask_val_lmdb
echo "val lmdb done!"
echo "Done."
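
For anyone who prefers to launch the conversion from Python, the command line used in the script can be assembled as a list. This is only a sketch: `convert_imageset_cmd` is a hypothetical helper, and the flags and paths are simply the ones that appear in the script above.

```python
def convert_imageset_cmd(tools, data_root, list_file, out_db,
                         resize=256, shuffle=True):
    """Build the convert_imageset argument list used in create_imagenet.sh."""
    cmd = [tools + '/convert_imageset',
           '--resize_height=%d' % resize,
           '--resize_width=%d' % resize]
    if shuffle:
        cmd.append('--shuffle')
    cmd += [data_root, list_file, out_db]
    return cmd


cmd = convert_imageset_cmd('build/tools',
                           '/mnt/hgfs/caffe/train/',
                           '/mnt/hgfs/caffe/train.txt',
                           'examples/mytask/mytask_train_lmdb')
print(' '.join(cmd))
```

The resulting list could then be passed to `subprocess.check_call`, with `GLOG_logtostderr=1` added to the environment to match the script.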