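Suppose our data has only two classes, stored in two folders, train0 and train1. Each folder contains several subdirectories, and each of those subdirectories contains further subdirectories, which makes building the image list file considerably more awkward.
We create the following shell script, named create_list.sh: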
#!/usr/bin/env sh
# Build train0.txt: one line per image, "<absolute path> 0".
DATA=/media/hjxu/luhaoda/from_xhj/metastasis_data/train0
TxtPath=/media/hjxu/luhaoda/from_xhj/metastasis_data/txt_list
rm -f "$TxtPath/train0.txt"
echo "Create train0.txt..."
# Loop over the first two directory levels; find recurses into anything deeper.
for file_C in "$DATA"/*; do
    temp_file=`basename "$file_C"`
    path1=$DATA/${temp_file}
    echo "$path1"
    echo "===================================================================="
    for file_b in "$path1"/*; do
        path2=${file_b}
        echo "$path2"
        # Quote the pattern so the shell does not expand *.jpg before find sees it.
        find "$path2" -name '*.jpg' | sed 's/$/ 0/' >> "$TxtPath/train0.txt"
    done
done
echo "train0 Done.."
# Build train1.txt the same way, labelling every image 1.
DATA=/media/hjxu/luhaoda/from_xhj/metastasis_data/train1
TxtPath=/media/hjxu/luhaoda/from_xhj/metastasis_data/txt_list
rm -f "$TxtPath/train1.txt"
echo "Create train1.txt..."
for file_C in "$DATA"/*; do
    temp_file=`basename "$file_C"`
    path1=$DATA/${temp_file}
    echo "$path1"
    echo "===================================================================="
    for file_b in "$path1"/*; do
        path2=${file_b}
        echo "$path2"
        # Quote the pattern so the shell does not expand *.jpg before find sees it.
        find "$path2" -name '*.jpg' | sed 's/$/ 1/' >> "$TxtPath/train1.txt"
    done
done
echo "train1 Done.."
# Merge both classes into train0.txt, then delete the temporary train1.txt.
cat "$TxtPath/train1.txt" >> "$TxtPath/train0.txt"
rm -f "$TxtPath/train1.txt"
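This gives us the list file train0.txt, which now holds the entries of both classes (train1.txt is appended to it and then deleted). The list for the val data can be produced in the same way.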
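Since find already descends into every level of subdirectories, the nested loops are not strictly required. Here is a minimal sketch of the same idea, assuming a hypothetical helper called make_list (it is not part of the script above) and assuming every image ends in .jpg:

```shell
#!/usr/bin/env sh
# make_list is a hypothetical helper, not part of the original script.
# Usage: make_list <image_dir> <label> <list_file>
# Appends "<path> <label>" for every .jpg found at any depth under <image_dir>.
make_list() {
    find "$1" -type f -name '*.jpg' | sort | sed "s/\$/ $2/" >> "$3"
}

# Example call, using the same paths as the script above:
# BASE=/media/hjxu/luhaoda/from_xhj/metastasis_data
# rm -f "$BASE/txt_list/train0.txt"
# make_list "$BASE/train0" 0 "$BASE/txt_list/train0.txt"
# make_list "$BASE/train1" 1 "$BASE/txt_list/train0.txt"
```

Writing both classes straight into one file also removes the need for the temporary train1.txt. The sort is only there to make the output order reproducible.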
The paths in this txt list are absolute image paths, that is, they start from the root directory of the file system.
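Before moving on, it can be worth checking the list quickly. The sketch below assumes a hypothetical helper called check_list (not part of the original post) and assumes each line has the form "path label" with the label after the last space. It reports how many entries each label has and flags paths that are not absolute or do not exist:

```shell
#!/usr/bin/env sh
# check_list is a hypothetical helper, not part of the original post.
# Usage: check_list <list_file>
# Prints the number of entries per label and returns non-zero if any
# listed path is not absolute or does not point to an existing file.
check_list() {
    status=0
    while IFS= read -r line; do
        path=${line% *}    # everything before the last space
        case $path in
            /*) ;;
            *) echo "not an absolute path: $path" >&2; status=1 ;;
        esac
        [ -f "$path" ] || { echo "missing file: $path" >&2; status=1; }
    done < "$1"
    # The label is the last whitespace-separated field on each line.
    awk '{ count[$NF]++ } END { for (l in count) print "label " l ": " count[l] }' "$1"
    return $status
}

# Example call:
# check_list /media/hjxu/luhaoda/from_xhj/metastasis_data/txt_list/train0.txt
```

A heavily skewed count per label, or a label that is missing entirely, usually means one of the class folders was not picked up.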
Next, we use Caffe's convert_imageset tool to build the LMDB database. Its usage is described here: http://blog.csdn.net/hjxu2016/article/details/68064831
Here is the code directly. Create a shell script named create_lmdb.sh:
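#!/usr/bin/env sh
# EXAMPLE: where the LMDB is written; DATA: where the list file lives;
# TOOLS: directory containing the Caffe command-line tools.
EXAMPLE=/media/hjxu/LENOVO/metastatic_ndpi_data
DATA=/media/hjxu/luhaoda/from_xhj/metastasis_data/txt_list
TOOLS=/home/hjxu/caffe-master/caffe/build/tools
# The list already holds absolute paths, so the root folder is just "/".
TRAIN_DATA_ROOT=/
# Set RESIZE=true to resize every image to 256x256 while converting.
RESIZE=false
if $RESIZE; then
    RESIZE_HEIGHT=256
    RESIZE_WIDTH=256
else
    RESIZE_HEIGHT=0
    RESIZE_WIDTH=0
fi
echo "Creating train lmdb..."
GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $TRAIN_DATA_ROOT \
    $DATA/train0.txt \
    $EXAMPLE/train_lmdb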
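As far as I know, convert_imageset aborts when the output database directory already exists, so rerunning the script tends to fail until the old train_lmdb is removed. A small pre-flight check could look like the sketch below. The helper name prepare_lmdb_dir is hypothetical and not part of Caffe or of the script above:

```shell
#!/usr/bin/env sh
# prepare_lmdb_dir is a hypothetical helper, not part of Caffe.
# Usage: prepare_lmdb_dir <list_file> <lmdb_dir>
# Fails if the list file is missing or empty; otherwise removes a stale
# LMDB directory so the database can be created from scratch.
prepare_lmdb_dir() {
    if [ ! -s "$1" ]; then
        echo "list file is missing or empty: $1" >&2
        return 1
    fi
    if [ -d "$2" ]; then
        echo "removing existing $2"
        rm -rf "$2"
    fi
}

# Example call, placed before the convert_imageset command:
# prepare_lmdb_dir "$DATA/train0.txt" "$EXAMPLE/train_lmdb" || exit 1
```

Note that this deletes the previous database without asking, so it only suits the case where the LMDB can always be rebuilt from the list file.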
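Once we have the LMDB, we can compute the image mean. Caffe also provides the compute_image_mean tool for this; its usage is described here: http://blog.csdn.net/hjxu2016/article/details/68065094
Again, straight to the code, plain and simple. Create a shell script named create_mean.sh: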
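#!/usr/bin/env sh
# EXAMPLE: directory holding train_lmdb; TOOLS: Caffe command-line tools.
EXAMPLE=/media/hjxu/LENOVO/metastatic_ndpi_data
TOOLS=/home/hjxu/caffe-master/caffe/build/tools
# Compute the mean image of the training LMDB and save it as a binaryproto file.
$TOOLS/compute_image_mean $EXAMPLE/train_lmdb \
    $EXAMPLE/train_ndpi_mean.binaryproto
# Left disabled: the same call on test_lmdb. As written it would overwrite
# the training mean file above, since the output name is identical.
#$TOOLS/compute_image_mean $EXAMPLE/test_lmdb \
#$EXAMPLE/train_ndpi_mean.binaryproto
echo "Done."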
And with that, we happily have the image mean file.