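I. A few words up front
First, why is this post numbered (2)? I had already worked through training Faster R-CNN, so this R-FCN post is (2).
After cloning the caffe projects for rfcn and pva-faster-rcnn, I noticed that their training procedures follow the same pattern:
1. Clone caffe and the caffe code of the R-CNN family
2. Build caffe as needed, i.e. build caffe and pycaffe
3. Compile setup.py under the lib folder, i.e. build Cython
Most of the blog posts I found run make under Linux, and I never figured out how make works, probably because I have no Linux experience. So I followed my own approach: I first verified the rfcn demo, it worked, and that convinced me the approach is sound. It is my own method; experts may find it cumbersome, but for me it is the easier route.
II. Clone caffe and the caffe code of the R-CNN family
1. Download caffe:
Everywhere I need caffe I use the BVLC version. Extra layers have to be added by hand, which is a bit of a hassle, but the workflow stays clear.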
C:\Projects> git clone https://github.com/BVLC/caffe.git
C:\Projects> cd caffe
C:\Projects\caffe> git checkout windows
:: Edit any of the options inside build_win.cmd to suit your needs
C:\Projects is the location where you want to install caffe.
2. Download py-R-FCN:
cd ..\r-fcn
git clone https://github.com/Orpine/py-R-FCN.git
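III. Build caffe as needed, i.e. build caffe and pycaffe
1. Build caffe and pycaffe:
a. Download the required files:
First download the dependency package libraries_v140_x64_py27_1.1.0.tar.bz2.
Once it is downloaded, create a folder named "build" inside the caffe directory we installed (C:\Projects), then extract the dependency package into that build folder.
Extraction produces a libraries folder. Add the absolute paths of \libraries\bin, \libraries\lib and \libraries\x64\vc14\bin to the environment variables (remember to restart afterwards).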
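After editing the environment variables, it can be worth checking that all three directories really ended up on PATH. The following is a minimal Python sketch, not part of the original workflow. It assumes the package was extracted to C:\Projects\caffe\build\libraries (adjust this to your own location), and the helper name missing_from_path is made up for illustration.

```python
import os


def missing_from_path(required_dirs, path_value, sep=";"):
    """Return the entries of required_dirs that are absent from a PATH string."""
    def norm(p):
        # Windows paths are case-insensitive; also ignore trailing slashes.
        return p.strip().rstrip("\\/").lower()

    present = {norm(p) for p in path_value.split(sep) if p.strip()}
    return [d for d in required_dirs if norm(d) not in present]


# Assumed extraction location; change it to match your machine.
LIBRARIES = r"C:\Projects\caffe\build\libraries"
REQUIRED = [
    LIBRARIES + r"\bin",
    LIBRARIES + r"\lib",
    LIBRARIES + r"\x64\vc14\bin",
]

if __name__ == "__main__":
    for d in missing_from_path(REQUIRED, os.environ.get("PATH", "")):
        print("not on PATH:", d)
```

Run it in a freshly opened command prompt; a window that was opened before the change may still see the old PATH.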
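b. Go to the build script in the caffe folder: C:\Projects\caffe\scripts\build_win.cmd
Open build_win.cmd with Notepad++.
c-1: Delete the following [the part that creates the build folder automatically, since we already created it by hand]: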
if NOT EXIST build mkdir build
pushd build
:: Setup the environement for VS x64
set batch_file=!VS%MSVC_VERSION%0COMNTOOLS!..\..\VC\vcvarsall.bat
call "%batch_file%" amd64
c-2: If you need the cuDNN library, edit the block around lines 143-155 (numbered as before the deletion) and add the -DCUDNN_ROOT line. Here C:\Projects is the directory that contains the cuda folder:
:: Configure using cmake and using the caffe-builder dependencies
:: Add -DCUDNN_ROOT=C:/Projects/caffe/cudnn-8.0-windows10-x64-v5.1/cuda ^
:: below to use cuDNN
cmake -G"!CMAKE_GENERATOR!" ^
-DBLAS=Open ^
-DCMAKE_BUILD_TYPE:STRING=%CMAKE_CONFIG% ^
-DBUILD_SHARED_LIBS:BOOL=%CMAKE_BUILD_SHARED_LIBS% ^
-DBUILD_python:BOOL=%BUILD_PYTHON% ^
-DBUILD_python_layer:BOOL=%BUILD_PYTHON_LAYER% ^
-DBUILD_matlab:BOOL=%BUILD_MATLAB% ^
-DCPU_ONLY:BOOL=%CPU_ONLY% ^
-DCUDNN_ROOT=C:\Projects\cuda ^
-C %cd%\libraries\caffe-builder-config.cmake ^
%~dp0\..
c-3: Then change the following settings to match your configuration:
) else (
:: Change the settings here to match your setup
:: Change MSVC_VERSION to 12 to use VS 2013
:: VS version: 14 for VS 2015, 12 for VS 2013
if NOT DEFINED MSVC_VERSION set MSVC_VERSION=14
:: Change to 1 to use Ninja generator (builds much faster)
:: This must be set to 0 here
if NOT DEFINED WITH_NINJA set WITH_NINJA=0
:: Change to 1 to build caffe without CUDA support
if NOT DEFINED CPU_ONLY set CPU_ONLY=0
:: Change to generate CUDA code for one of the following GPU architectures
:: [Fermi Kepler Maxwell Pascal All]
if NOT DEFINED CUDA_ARCH_NAME set CUDA_ARCH_NAME=Auto
:: Change to Debug to build Debug. This is only relevant for the Ninja generator the Visual Studio generator will generate both Debug and Release configs
if NOT DEFINED CMAKE_CONFIG set CMAKE_CONFIG=Release
:: Set to 1 to use NCCL
if NOT DEFINED USE_NCCL set USE_NCCL=0
:: Change to 1 to build a caffe.dll
if NOT DEFINED CMAKE_BUILD_SHARED_LIBS set CMAKE_BUILD_SHARED_LIBS=0
:: Change to 3 if using python 3.5 (only 2.7 and 3.5 are supported)
:: Python version: mine is Python 3.5
if NOT DEFINED PYTHON_VERSION set PYTHON_VERSION=3
:: Change these options for your needs.
if NOT DEFINED BUILD_PYTHON set BUILD_PYTHON=1
if NOT DEFINED BUILD_PYTHON_LAYER set BUILD_PYTHON_LAYER=1
if NOT DEFINED BUILD_MATLAB set BUILD_MATLAB=0
:: If python is on your path leave this alone
if NOT DEFINED PYTHON_EXE set PYTHON_EXE=python
:: Run the tests
if NOT DEFINED RUN_TESTS set RUN_TESTS=0
:: Run lint
if NOT DEFINED RUN_LINT set RUN_LINT=0
:: Build the install target
if NOT DEFINED RUN_INSTALL set RUN_INSTALL=0
)
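c-4: Open a command prompt in the caffe directory and run:
C:\Projects\caffe> .\scripts\build_win.cmd
The build finishes after a while. To be able to import caffe in Python, copy the entire C:\Projects\caffe\python\caffe folder into C:\Users\Admin\Anaconda3\Lib\site-packages (my Anaconda is installed on the C: drive, in the default location).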
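The copy step can also be scripted, which helps when caffe is rebuilt several times. This is a minimal sketch, assuming the two folder locations mentioned above; install_pycaffe is a hypothetical helper name, not something shipped with caffe.

```python
import shutil
from pathlib import Path


def install_pycaffe(caffe_python_dir, site_packages_dir):
    """Copy <caffe_python_dir>/caffe into <site_packages_dir>/caffe."""
    src = Path(caffe_python_dir) / "caffe"
    dst = Path(site_packages_dir) / "caffe"
    if not src.is_dir():
        raise FileNotFoundError("pycaffe folder not found: {}".format(src))
    if dst.exists():
        shutil.rmtree(dst)  # drop a stale copy left by an earlier build
    shutil.copytree(src, dst)
    return dst


# Example call with the paths from this post (adjust to your machine):
# install_pycaffe(r"C:\Projects\caffe\python",
#                 r"C:\Users\Admin\Anaconda3\Lib\site-packages")
```

Removing the old copy first keeps files from a previous build from lingering next to the new ones.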
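d. Add the layers R-FCN needs to caffe: [comparing against the layer directory of R-FCN's own caffe shows which layers are missing]
1) Six layers are needed:
Copy these files to C:\Projects\caffe\src\caffe\layers [C:\Projects\ is the directory where we downloaded caffe].
2) The corresponding header files: copy them to C:\Projects\caffe\include\caffe [again, C:\Projects\ is the directory where we downloaded caffe].
3) Register the new layers by editing caffe.proto: open caffe.proto under C:\Projects\caffe\src\caffe\proto.
Modify it as shown in the figure below:
The added entries are as follows: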
optional BoxAnnotatorOHEMParameter box_annotator_ohem_param = 150;
optional PSROIPoolingParameter psroi_pooling_param = 149;
optional ROIPoolingParameter roi_pooling_param = 147;
optional SmoothL1LossParameter smooth_l1_loss_param = 148;
}
message BoxAnnotatorOHEMParameter {
required uint32 roi_per_img = 1; // number of rois for training
optional int32 ignore_label = 2 [default = -1]; // ignore_label in scoring
}
message PSROIPoolingParameter {
required float spatial_scale = 1;
required int32 output_dim = 2; // output channel number
required int32 group_size = 3; // number of groups to encode position-sensitive score maps
}
message ROIPoolingParameter {
// Pad, kernel size, and stride are all given as a single value for equal
// dimensions in height and width or as Y, X pairs.
optional uint32 pooled_h = 1 [default = 0]; // The pooled output height
optional uint32 pooled_w = 2 [default = 0]; // The pooled output width
// Multiplicative spatial scale factor to translate ROI coords from their
// input scale to the scale used when pooling
optional float spatial_scale = 3 [default = 1];
}
message SmoothL1LossParameter {
// SmoothL1Loss(x) =
// 0.5 * (sigma * x) ** 2 -- if x < 1.0 / sigma / sigma
// |x| - 0.5 / sigma / sigma -- otherwise
optional float sigma = 1 [default = 1];
}
// Message that stores parameters used to apply transformation
// to the data layer's data
// Message that stores parameters shared by loss layers
message LossParameter {
// If specified, ignore instances with the given label.
optional int32 ignore_label = 1;
// How to normalize the loss for loss layers that aggregate across batches,
// spatial dimensions, or other dimensions. Currently only implemented in
// SoftmaxWithLoss and SigmoidCrossEntropyLoss layers.
enum NormalizationMode {
// Divide by the number of examples in the batch times spatial dimensions.
// Outputs that receive the ignore label will NOT be ignored in computing
// the normalization factor.
FULL = 0;
// Divide by the total number of output locations that do not take the
// ignore_label. If ignore_label is not set, this behaves like FULL.
VALID = 1;
// Divide by the batch size.
BATCH_SIZE = 2;
// Divide by pre-fixed normalizer
PRE_FIXED = 3;
// Do not normalize the loss.
NONE = 4;
}
// For historical reasons, the default normalization for
// SigmoidCrossEntropyLoss is BATCH_SIZE and *not* VALID.
optional NormalizationMode normalization = 3 [default = VALID];
// Deprecated. Ignored if normalization is specified. If normalization
// is not specified, then setting this to false will be equivalent to
// normalization = BATCH_SIZE to be consistent with previous behavior.
optional bool normalize = 2;
//pre-fixed normalizer
optional float pre_fixed_normalizer = 4 [default = 1];
}
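The comment inside SmoothL1LossParameter above gives the loss as a piecewise formula. As a plain-Python illustration of that formula only (not the layer's actual C++/CUDA code), the sketch below evaluates it for a single value. It assumes the threshold is applied to |x|, which the proto comment writes simply as x.

```python
def smooth_l1(x, sigma=1.0):
    """Smooth L1 loss for one residual x, following the caffe.proto comment."""
    sigma2 = sigma * sigma
    ax = abs(x)  # assumption: the threshold test uses the absolute value
    if ax < 1.0 / sigma2:
        # quadratic region near zero
        return 0.5 * (sigma * x) ** 2
    # linear region for large residuals
    return ax - 0.5 / sigma2


if __name__ == "__main__":
    for value in (0.5, 2.0, -2.0):
        print(value, smooth_l1(value))
```

The two branches meet at |x| = 1 / sigma^2, so the loss stays continuous whatever sigma is chosen.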
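4) Rebuild caffe: open a command prompt in the caffe directory and run:
C:\Projects\caffe> .\scripts\build_win.cmd
After a short wait the build succeeds.
5) Replace the caffe folder inside r-fcn with the freshly built caffe. If there is no such folder, create a new caffe folder and copy the contents into it.
That completes the caffe part.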
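IV. Compile setup.py under the lib folder, i.e. build Cython
For compiling Cython, this blog post explains it well [https://blog.csdn.net/chenzhi1992/article/details/53374265].
The steps are as follows:
The bundled lib cannot be used on Windows. Delete it and download a replacement lib; the git address is https://github.com/MrGF/py-faster-rcnn-windows.
1) In cmd, change into the lib directory and run python setup.py install to complete the installation.
2) In cmd, change into the lib directory and run python setup_cuda.py install to complete the installation.
Problems that come up when running setup_cuda.py are all covered in this blog post:
https://blog.csdn.net/jacke121/article/details/79284083
3) Copy cython_bbox.pyd from lib->build->utils into lib->utils. This completes the Cython build.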
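V. Running demo_rfcn.py
Naturally, a few very common problems show up at run time as well.
Q1: param_str_ has not been changed to param_str.
Q2: As shown in the figure below:
Fix: [open lib->rpn->proposal_layer.py]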
#cfg_key = str(self.phase) # either 'TRAIN' or 'TEST'
cfg_key = str('TRAIN' if self.phase == 0 else 'TEST')
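The replacement line implies that, on this pycaffe build, self.phase arrives as a number (0 for the training phase) rather than as the string 'TRAIN' or 'TEST'. That reading is an assumption drawn from the fix itself. A standalone sketch of the same mapping, with the hypothetical helper name phase_to_cfg_key:

```python
def phase_to_cfg_key(phase):
    """Map a numeric caffe phase to the cfg section name ('TRAIN' or 'TEST')."""
    # Same expression as the fix above: 0 is treated as TRAIN, anything else as TEST.
    return str('TRAIN' if phase == 0 else 'TEST')


if __name__ == "__main__":
    print(phase_to_cfg_key(0), phase_to_cfg_key(1))
```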
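Finally, run demo_rfcn.py [E:\caffe_rcnn\rfcn\py-R-FCN>.\tools\demo_rfcn.py]. The result is as follows:
Oddly, the results from demo.py do not seem to be as good as those from Faster R-CNN.
VI. Training on your own dataset
1) Prepare your own dataset
Most people keep the default names from the source code; I am used to VOC2007, so I stick with the VOC2007 data layout.
This requires a code change: go to $RFCN_ROOT->experiments->scripts-> the training method you want, taking rfcn_end2end_ohem as the example:
Open rfcn_end2end_ohem.sh and modify the part below; ITERS sets the maximum number of iterations.
2) Download a pretrained model: ResNet-50-model.caffemodel or ResNet-101-model.caffemodel
Then put the caffemodel in $RFCN_ROOT/data/imagenet_models (create the folder if it does not exist under data).
3) Modify the network definitions [taking end2end as the example; to be done with it once and for all, change both the ohem and the non-ohem files]
In what follows, cls_num is the number of classes in your own dataset + 1 (background). For example, with 1 class plus 1 background class, cls_num=2.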
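Every number that has to be edited in the class-aware prototxt files of this step follows from cls_num and the 7x7 score-map grid, using the formulas given in the prototxt comments. A small sketch of that arithmetic; the function name and dictionary keys are made up for illustration.

```python
def rfcn_output_sizes(num_object_classes, group_size=7):
    """Values to write into the prototxt files for a given number of object classes."""
    cls_num = num_object_classes + 1      # +1 for the background class
    bins = group_size * group_size        # score_maps_size^2, 49 for a 7x7 grid
    return {
        "cls_num": cls_num,
        "rfcn_cls_num_output": cls_num * bins,        # cls_num*(score_maps_size^2)
        "rfcn_bbox_num_output": 4 * cls_num * bins,   # 4*cls_num*(score_maps_size^2)
        "psroi_cls_output_dim": cls_num,              # cls_num
        "psroi_loc_output_dim": 4 * cls_num,          # 4*cls_num
    }


if __name__ == "__main__":
    for key, value in rfcn_output_sizes(1).items():
        print(key, value)
```

For one object class this yields 98, 392, 2 and 8, the values used in the class-aware listings. The agnostic files listed later only have the classification-side values changed.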
a. Modify class-aware/train_ohem.prototxt
layer {
name: 'input-data'
type: 'Python'
top: 'data'
top: 'im_info'
top: 'gt_boxes'
python_param {
module: 'roi_data_layer.layer'
layer: 'RoIDataLayer'
param_str: "'num_classes': 2" #cls_num:
}
}
layer {
name: 'roi-data'
type: 'Python'
bottom: 'rpn_rois'
bottom: 'gt_boxes'
top: 'rois'
top: 'labels'
top: 'bbox_targets'
top: 'bbox_inside_weights'
top: 'bbox_outside_weights'
python_param {
module: 'rpn.proposal_target_layer'
layer: 'ProposalTargetLayer'
param_str: "'num_classes': 2" #cls_num
}
}
layer {
bottom: "conv_new_1"
top: "rfcn_cls"
name: "rfcn_cls"
type: "Convolution"
convolution_param {
num_output: 98 #cls_num*(score_maps_size^2)#cls_num*(7^2)
kernel_size: 1
pad: 0
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
param {
lr_mult: 1.0
}
param {
lr_mult: 2.0
}
}
layer {
bottom: "conv_new_1"
top: "rfcn_bbox"
name: "rfcn_bbox"
type: "Convolution"
convolution_param {
num_output: 392 #4*cls_num*(score_maps_size^2), with score_maps_size = 7
kernel_size: 1
pad: 0
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
param {
lr_mult: 1.0
}
param {
lr_mult: 2.0
}
}
layer {
bottom: "rfcn_cls"
bottom: "rois"
top: "psroipooled_cls_rois"
name: "psroipooled_cls_rois"
type: "PSROIPooling"
psroi_pooling_param {
spatial_scale: 0.0625
output_dim: 2 #cls_num
group_size: 7
}
}
layer {
bottom: "rfcn_bbox"
bottom: "rois"
top: "psroipooled_loc_rois"
name: "psroipooled_loc_rois"
type: "PSROIPooling"
psroi_pooling_param {
spatial_scale: 0.0625
output_dim: 8 #4*cls_num
group_size: 7
}
}
b. Modify class-aware/test.prototxt
layer {
bottom: "conv_new_1"
top: "rfcn_cls"
name: "rfcn_cls"
type: "Convolution"
convolution_param {
num_output: 98 #cls_num*(score_maps_size^2)
kernel_size: 1
pad: 0
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
param {
lr_mult: 1.0
}
param {
lr_mult: 2.0
}
}
layer {
bottom: "conv_new_1"
top: "rfcn_bbox"
name: "rfcn_bbox"
type: "Convolution"
convolution_param {
num_output: 392 #4*cls_num*(score_maps_size^2)
kernel_size: 1
pad: 0
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
param {
lr_mult: 1.0
}
param {
lr_mult: 2.0
}
}
layer {
bottom: "rfcn_cls"
bottom: "rois"
top: "psroipooled_cls_rois"
name: "psroipooled_cls_rois"
type: "PSROIPooling"
psroi_pooling_param {
spatial_scale: 0.0625
output_dim: 2 #cls_num
group_size: 7
}
}
layer {
bottom: "rfcn_bbox"
bottom: "rois"
top: "psroipooled_loc_rois"
name: "psroipooled_loc_rois"
type: "PSROIPooling"
psroi_pooling_param {
spatial_scale: 0.0625
output_dim: 8 #4*cls_num
group_size: 7
}
}
layer {
name: "cls_prob_reshape"
type: "Reshape"
bottom: "cls_prob_pre"
top: "cls_prob"
reshape_param {
shape {
dim: -1
dim: 2 #cls_num
}
}
}
layer {
name: "bbox_pred_reshape"
type: "Reshape"
bottom: "bbox_pred_pre"
top: "bbox_pred"
reshape_param {
shape {
dim: -1
dim: 8 #4*cls_num
}
}
}
c. Modify train_agnostic.prototxt
layer {
name: 'input-data'
type: 'Python'
top: 'data'
top: 'im_info'
top: 'gt_boxes'
python_param {
module: 'roi_data_layer.layer'
layer: 'RoIDataLayer'
param_str: "'num_classes': 2" #cls_num
}
}
layer {
bottom: "conv_new_1"
top: "rfcn_cls"
name: "rfcn_cls"
type: "Convolution"
convolution_param {
num_output: 98 #cls_num*(score_maps_size^2) ###
kernel_size: 1
pad: 0
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
param {
lr_mult: 1.0
}
param {
lr_mult: 2.0
}
}
layer {
bottom: "rfcn_cls"
bottom: "rois"
top: "psroipooled_cls_rois"
name: "psroipooled_cls_rois"
type: "PSROIPooling"
psroi_pooling_param {
spatial_scale: 0.0625
output_dim: 2 #cls_num ###
group_size: 7
}
}
d. Modify train_agnostic_ohem.prototxt
layer {
name: 'input-data'
type: 'Python'
top: 'data'
top: 'im_info'
top: 'gt_boxes'
python_param {
module: 'roi_data_layer.layer'
layer: 'RoIDataLayer'
param_str: "'num_classes': 2" #cls_num ###
}
}
layer {
bottom: "conv_new_1"
top: "rfcn_cls"
name: "rfcn_cls"
type: "Convolution"
convolution_param {
num_output: 98 #cls_num*(score_maps_size^2) ###
kernel_size: 1
pad: 0
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
param {
lr_mult: 1.0
}
param {
lr_mult: 2.0
}
}
layer {
bottom: "rfcn_cls"
bottom: "rois"
top: "psroipooled_cls_rois"
name: "psroipooled_cls_rois"
type: "PSROIPooling"
psroi_pooling_param {
spatial_scale: 0.0625
output_dim: 2 #cls_num ###
group_size: 7
}
}
e. Modify test_agnostic.prototxt
layer {
bottom: "conv_new_1"
top: "rfcn_cls"
name: "rfcn_cls"
type: "Convolution"
convolution_param {
num_output: 98 #cls_num*(score_maps_size^2) ###
kernel_size: 1
pad: 0
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
param {
lr_mult: 1.0
}
param {
lr_mult: 2.0
}
}
layer {
bottom: "rfcn_cls"
bottom: "rois"
top: "psroipooled_cls_rois"
name: "psroipooled_cls_rois"
type: "PSROIPooling"
psroi_pooling_param {
spatial_scale: 0.0625
output_dim: 2 #cls_num ###
group_size: 7
}
}
layer {
name: "cls_prob_reshape"
type: "Reshape"
bottom: "cls_prob_pre"
top: "cls_prob"
reshape_param {
shape {
dim: -1
dim: 2 #cls_num ###
}
}
}
4) Modify the code:
a. Change the labels to those of your own dataset.
class pascal_voc(imdb):
    def __init__(self, image_set, year, devkit_path=None):
        imdb.__init__(self, 'voc_' + year + '_' + image_set)
        self._year = year
        self._image_set = image_set
        self._devkit_path = self._get_default_path() if devkit_path is None \
                            else devkit_path
        self._data_path = os.path.join(self._devkit_path, 'VOC' + self._year)
        self._classes = ('__background__',  # always index 0
                         'plane', 'your_label_2', 'your_label_3', 'your_label_4'  # lowercase letters
                         )
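The order of the tuple matters: the position of each label is its class index, with '__background__' at index 0, and the tuple length is the cls_num used in the prototxt files. A minimal sketch of that relationship, assuming the usual name-to-index mapping built with zip; build_class_index is a hypothetical helper, not a function of the repository.

```python
def build_class_index(classes):
    """Map each class name to its index; background must come first."""
    if not classes or classes[0] != '__background__':
        raise ValueError("'__background__' must be the first entry")
    return dict(zip(classes, range(len(classes))))


# Example with a single object class, as in the cls_num=2 case.
CLASSES = ('__background__', 'plane')
CLASS_TO_IND = build_class_index(CLASSES)

if __name__ == "__main__":
    print(CLASS_TO_IND, "cls_num =", len(CLASSES))
```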
b. Modify the following two pieces, otherwise the test stage is bound to fail.
    def _get_voc_results_file_template(self):
        # VOCdevkit/results/VOC2007/Main/<comp_id>_det_test_aeroplane.txt
        filename = self._get_comp_id() + '_det_' + self._image_set + '_{:s}.txt'
        path = os.path.join(
            self._devkit_path,
            'VOC' + self._year,
            'Main',
            '{}' + '_test.txt')
        return path

    def _write_voc_results_file(self, all_boxes):
        for cls_ind, cls in enumerate(self.classes):
            if cls == '__background__':
                continue
            print 'Writing {} VOC results file'.format(cls)
            filename = self._get_voc_results_file_template().format(cls)
            with open(filename, 'w+') as f:
                for im_ind, index in enumerate(self.image_index):
                    dets = all_boxes[cls_ind][im_ind]
                    if dets == []:
                        continue
                    # the VOCdevkit expects 1-based indices
                    for k in xrange(dets.shape[0]):
                        f.write('{:s} {:.3f} {:.1f} {:.1f} {:.1f} {:.1f}\n'.
                                format(index, dets[k, -1],
                                       dets[k, 0] + 1, dets[k, 1] + 1,
                                       dets[k, 2] + 1, dets[k, 3] + 1))
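To make the results-file format concrete, the sketch below formats one detection the same way the f.write call above does: image index, score with three decimals, then the four box coordinates shifted to 1-based values. The standalone function format_detection is only for illustration and is not part of pascal_voc.py.

```python
def format_detection(index, det):
    """Format one detection row; det is [x1, y1, x2, y2, score], 0-based pixels."""
    x1, y1, x2, y2, score = det
    # The VOCdevkit expects 1-based coordinates, hence the + 1 on each value.
    return '{:s} {:.3f} {:.1f} {:.1f} {:.1f} {:.1f}\n'.format(
        index, score, x1 + 1, y1 + 1, x2 + 1, y2 + 1)


if __name__ == "__main__":
    print(format_detection('000001', [10, 20, 30, 40, 0.9]), end='')
```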
c. Modify config.py: [this file also holds other settings you can change, such as how often the model is saved]
Set the training and test proposals to gt:
# Train using these proposals
__C.TRAIN.PROPOSAL_METHOD = 'gt'
# Test using these proposals
__C.TEST.PROPOSAL_METHOD = 'gt'
5) Start training
1. Delete the cache files: before every training run, delete the files in data\cache and data\VOCdevkit2007\annotations_cache.
[I have never actually seen annotations_cache.]
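Since this cleanup has to be repeated before every run, a small script saves some clicking. A minimal sketch, assuming the two cache folders named above sit under the project root; clear_cache is a hypothetical helper, and a folder that does not exist is simply skipped.

```python
from pathlib import Path

# Cache folders named in this post, relative to the project root.
DEFAULT_CACHE_DIRS = ("data/cache", "data/VOCdevkit2007/annotations_cache")


def clear_cache(root, cache_dirs=DEFAULT_CACHE_DIRS):
    """Delete the files directly inside each cache folder and return their paths."""
    removed = []
    for rel in cache_dirs:
        folder = Path(root) / rel
        if not folder.is_dir():
            continue  # e.g. annotations_cache may never have been created
        for entry in folder.iterdir():
            if entry.is_file():
                entry.unlink()
                removed.append(entry)
    return removed


# Example call from the project root:
# clear_cache(".")
```

Only files are removed; the folders themselves stay in place so the next run can write into them again.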
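2. Start training: open git bash in the py-R-FCN root directory [this requires git to be installed] and enter:
./experiments/scripts/rfcn_end2end_ohem.sh 0 ResNet-50 pascal_voc
Output like the following means training is running successfully: [I only remembered to take the screenshot after many iterations]
6) Testing with your own demo.py
1. Create your own demo.py, list the images you want to test in im_names, and put those images in the data\demo folder.
Modify demo_rfcn.py: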
CLASSES = ('__background__',
'plane')
NETS = {'ResNet-101': ('ResNet-101',
'resnet101_rfcn_final.caffemodel'),
'ResNet-50': ('ResNet-50',
'resnet50_rfcn_final.caffemodel'),
'myres': ('ResNet-50',
'resnet50_rfcn_ohem_iter_2000.caffemodel')}
......
parser.add_argument('--net', dest='demo_net', help='Network to use [myres]',
choices=NETS.keys(), default='myres')
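2. Test your own model:
Copy the caffemodel you just trained from output\ into data\faster_rcnn_models, then run your own demo.py.
3. Results: in progress... will post them shortly
The results are acceptable.
After 3000 iterations:
After 8000 iterations: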