caffe: bugs encountered when creating your own dataset

So far the bugs have mainly come up when generating LMDB data with create_imagenet.sh (taken from examples/imagenet).

bug 1  mkdir  *_val_lmdb failed 

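This usually happens because the target already exists at the specified path, which causes a conflict. At first I deleted it by hand every time. Eventually I realized that was silly: the deletion can simply be added to create_imagenet.sh: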

rm -rf $EXAMPLE/mytask_train_lmdb
rm -rf $EXAMPLE/mytask_val_lmdb
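
If the conversion is driven from a Python script instead, the same cleanup could look roughly like the sketch below. It assumes the two LMDB directory names used in this post; `remove_stale_lmdb` is a hypothetical helper name, not part of Caffe.

```python
import os
import shutil


def remove_stale_lmdb(example_dir,
                      names=('mytask_train_lmdb', 'mytask_val_lmdb')):
    """Delete leftover LMDB directories so they can be created again.

    Returns the list of paths that were actually removed.
    """
    removed = []
    for name in names:
        path = os.path.join(example_dir, name)
        if os.path.isdir(path):
            shutil.rmtree(path)  # same effect as `rm -rf` on that directory
            removed.append(path)
    return removed
```

Calling `remove_stale_lmdb('examples/mytask')` before the conversion step mirrors the two `rm -rf` lines above.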

 

bug 2  Images not found at the specified path: could not open or find file

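The first case: I generated the txt label file under Windows cmd, so the paths in it used backslashes, which I had not noticed. The simplest fix is to open the txt file and replace the backslashes with forward slashes. Alternatively, run make_list.py under Linux and the problem does not occur at all.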
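
For a large list file, the replacement can also be scripted rather than done in an editor. A minimal sketch, assuming one `path label` entry per line; the function names are made up for illustration:

```python
def fix_separators(line):
    """Replace Windows backslashes with forward slashes in one list line."""
    return line.replace('\\', '/')


def fix_list_file(path_in, path_out):
    """Copy a list file, converting every backslash to a forward slash."""
    with open(path_in) as fin, open(path_out, 'w') as fout:
        for line in fin:
            fout.write(fix_separators(line))
```

For example, `fix_list_file('train_win.txt', 'train.txt')` would write a cleaned copy (both file names here are placeholders).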

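The second case puzzled me for a long time. The paths were clearly correct, so why could the files not be found? It finally turned out that the Python escape character \t was to blame: the gap between the image name and the label was written as \t (a tab). Replacing it with a plain space ' ' fixed the problem: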

    #fout.write('%s\t%d\n'%(image_list[i][0], image_list[i][1]))
    fout.write('%s%s%d\n'%(image_list[i][0], ' ',image_list[i][1]))#space not \t
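
An already generated list file can be repaired without rerunning make_list.py. The sketch below assumes each line is a path followed by an integer label, separated by tabs and/or spaces; `normalize_line` and `normalize_list_file` are hypothetical names.

```python
def normalize_line(line):
    """Rewrite 'path<whitespace>label' as 'path label' with a single space."""
    # Split once from the right so that spaces inside the path survive.
    path, label = line.rstrip('\r\n').rsplit(None, 1)
    return '%s %d\n' % (path, int(label))


def normalize_list_file(path_in, path_out):
    """Copy a list file, normalizing the separator on every non-empty line."""
    with open(path_in) as fin, open(path_out, 'w') as fout:
        for line in fin:
            if line.strip():
                fout.write(normalize_line(line))
```

Splitting from the right keeps a path such as `my dir/a.bmp` intact, and `int(label)` raises an error early if a line does not end in a numeric label.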

With both problems fixed, LMDB generation starts. The dataset is fairly large (378430 images), so it takes a while.


Code 1

make_list.py

import os
import random
import argparse

def list_image(root, recursive, exts):
    image_list = []
    if recursive:
        cat = {}
        for path, subdirs, files in os.walk(root, topdown=True):
            print(path)  # progress: directory currently being scanned
            for fname in files:
                fpath = os.path.join(path,fname)
                suffix = os.path.splitext(fname)[1].lower()
                if os.path.isfile(fpath) and (suffix in exts):
                    if path not in cat:
                        cat[path] = len(cat)
                    image_list.append((os.path.relpath(fpath, root), cat[path]))
                    # print(fpath, cat[path])
    else:
        for fname in os.listdir(root):
            fpath = os.path.join(root, fname)
            suffix = os.path.splitext(fname)[1].lower()
            if os.path.isfile(fpath) and (suffix in exts):
                image_list.append((os.path.relpath(fpath, root), 0))
    return image_list

def write_list(path_out, image_list):
    with open(path_out, 'w') as fout:
        for fpath, label in image_list:
            # Separate path and label with a single space, not \t
            # (the tab caused "could not open or find file", see bug 2).
            fout.write('%s %d\n' % (fpath, label))

def make_list(prefix_out, root, recursive, exts, num_chunks, train_ratio):
    image_list = list_image(root, recursive, exts)
    random.shuffle(image_list)
    N = len(image_list)
    chunk_size = (N+num_chunks-1)//num_chunks  # ceiling division, integer result
    for i in range(num_chunks):
        chunk = image_list[i*chunk_size:(i+1)*chunk_size]
        if num_chunks > 1:
            str_chunk = '_%d'%i
        else:
            str_chunk = ''
        if train_ratio < 1:
            sep = int(len(chunk)*train_ratio)  # the last chunk may be shorter than chunk_size
            write_list(prefix_out+str_chunk+'_train.txt', chunk[:sep])
            write_list(prefix_out+str_chunk+'_val.txt', chunk[sep:])
        else:
            write_list(prefix_out+str_chunk+'.txt', chunk)

def main():
    parser = argparse.ArgumentParser(
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
        description='Make image list files that are\
        required by im2rec')
    parser.add_argument('root', help='path to folder that contain images.')
    parser.add_argument('prefix', help='prefix of output list files.')
    # nargs='+' accepts e.g. "--exts .bmp .jpg"; type=list would split one string into characters
    parser.add_argument('--exts', nargs='+', default=['.bmp'],
        help='list of acceptable image extensions.')
    parser.add_argument('--chunks', type=int, default=1, help='number of chunks.')
    parser.add_argument('--train_ratio', type=float, default=1.0,
        help='Percent of images to use for training.')
    # type=bool would treat any non-empty string (even "False") as True
    parser.add_argument('--recursive',
        type=lambda s: s.lower() in ('1', 'true', 'yes'), default=True,
        help='If true recursively walk through subdirs and assign an unique label\
        to images in each folder. Otherwise only include images in the root folder\
        and give them label 0.')
    args = parser.parse_args()
    
    make_list(args.prefix, args.root, args.recursive,
        args.exts, args.chunks, args.train_ratio)

if __name__ == '__main__':
    main()

Code 2

create_imagenet.sh 

#!/usr/bin/env sh
# Create the imagenet lmdb inputs
# N.B. set the path to the imagenet train + val data dirs
EXAMPLE=examples/mytask
DATA=/mnt/hgfs/caffe
TOOLS=build/tools
TRAIN_DATA_ROOT=/mnt/hgfs/caffe/train/
VAL_DATA_ROOT=/mnt/hgfs/caffe/val/
# Set RESIZE=true to resize the images to 256x256. Leave as false if images have
# already been resized using another tool.
RESIZE=true
if $RESIZE; then
  RESIZE_HEIGHT=256
  RESIZE_WIDTH=256
else
  RESIZE_HEIGHT=0
  RESIZE_WIDTH=0
fi
if [ ! -d "$TRAIN_DATA_ROOT" ]; then
  echo "Error: TRAIN_DATA_ROOT is not a path to a directory: $TRAIN_DATA_ROOT"
  echo "Set the TRAIN_DATA_ROOT variable in create_imagenet.sh to the path" \
       "where the ImageNet training data is stored."
  exit 1
fi
if [ ! -d "$VAL_DATA_ROOT" ]; then
  echo "Error: VAL_DATA_ROOT is not a path to a directory: $VAL_DATA_ROOT"
  echo "Set the VAL_DATA_ROOT variable in create_imagenet.sh to the path" \
       "where the ImageNet validation data is stored."
  exit 1
fi
echo "Creating train lmdb..."
rm -rf $EXAMPLE/mytask_train_lmdb
rm -rf $EXAMPLE/mytask_val_lmdb
GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $TRAIN_DATA_ROOT \
    $DATA/train.txt \
    $EXAMPLE/mytask_train_lmdb
echo "Train lmdb done!"
echo "Creating val lmdb..."
GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $VAL_DATA_ROOT \
    $DATA/val.txt \
    $EXAMPLE/mytask_val_lmdb
echo "val lmdb done!"
echo "Done."

