I0301 12:36:30.536449 25955 solver.cpp:245] Train net output #0: loss_bbox = 0.00237033 (* 1 = 0.00237033 loss)
I0301 12:36:30.536458 25955 solver.cpp:245] Train net output #1: loss_cls = 0.0697297 (* 1 = 0.0697297 loss)
I0301 12:36:30.536464 25955 solver.cpp:245] Train net output #2: rpn_cls_loss = 0.0507008 (* 1 = 0.0507008 loss)
I0301 12:36:30.536470 25955 solver.cpp:245] Train net output #3: rpn_loss_bbox = 0.10025 (* 1 = 0.10025 loss)
I0301 12:36:30.536478 25955 sgd_solver.cpp:106] Iteration 260, lr = 0.001
I0301 12:36:34.376888 25955 solver.cpp:229] Iteration 280, loss = nan
I0301 12:36:34.376953 25955 solver.cpp:245] Train net output #0: loss_bbox = 0 (* 1 = 0 loss)
I0301 12:36:34.376965 25955 solver.cpp:245] Train net output #1: loss_cls = 0.116323 (* 1 = 0.116323 loss)
I0301 12:36:34.376971 25955 solver.cpp:245] Train net output #2: rpn_cls_loss = 5.43175 (* 1 = 5.43175 loss)
I0301 12:36:34.376977 25955 solver.cpp:245] Train net output #3: rpn_loss_bbox = nan (* 1 = nan loss)
I0301 12:36:34.376984 25955 sgd_solver.cpp:106] Iteration 280, lr = 0.001
./experiments/scripts/faster_rcnn_end2end_12.sh: line 53: 25955 Floating point exception(core dumped) ./tools/train_net.py --gpu ${GPU_ID} --solver models/${PT_DIR}/${NET}/faster_rcnn_end2end/solver.prototxt --weights data/imagenet_models/${NET}.v2.caffemodel --imdb ${TRAIN_IMDB} --iters ${ITERS} --cfg experiments/cfgs/faster_rcnn_end2end_2012.yml ${EXTRA_ARGS}
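Reference: https://www.cnblogs.com/bile/p/9110727.html
After analysis and debugging, the cause turned out to be out-of-bounds annotations in my own dataset. A box can be out of bounds in six ways: x1<0; x2>width; x2<x1; y1<0; y2>height; y2<y1. Unfortunately, the original code was written for pascal_voc data and does not consider the possibility of annotation errors at all. The released code contains only one check, in the append_flipped_images function: assert (boxes[:, 2] >= boxes[:, 0]).all(), which asserts only that x2>=x1 holds after the horizontal flip. A failure at that assert may point to a bad x annotation (see error 2 above). Bad y annotations, however, are not checked anywhere.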
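A minimal sketch of a pre-training sanity check for the six cases listed above. The function name `find_bad_boxes` and the `(x1, y1, x2, y2)` row layout are assumptions made for illustration and are not part of py-faster-rcnn. The comparisons follow the list in the text literally, so tighten the bounds if your annotation convention differs.

```python
import numpy as np

def find_bad_boxes(boxes, width, height):
    """Return a boolean mask marking boxes that fail any of the six checks.

    boxes: array-like of shape (N, 4), one (x1, y1, x2, y2) row per box.
    """
    boxes = np.asarray(boxes, dtype=np.float64).reshape(-1, 4)
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    bad = (
        (x1 < 0) | (x2 > width) | (x2 < x1) |
        (y1 < 0) | (y2 > height) | (y2 < y1)
    )
    return bad

boxes = [
    [10, 20, 100, 200],   # valid
    [-1, 20, 100, 200],   # x1 < 0
    [10, 20, 700, 200],   # x2 > width
    [10, 300, 100, 200],  # y2 < y1
]
mask = find_bad_boxes(boxes, width=640, height=480)
print(mask)
```

Running a check like this over every image before training makes a bad annotation fail early with a clear message instead of surfacing later as a nan loss.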
Caffe-Faster RCNN error: TypeError: 'numpy.float64' object cannot be interpreted as an index
Reference: https://blog.csdn.net/CAU_Ayao/article/details/84679340
The following error appeared while training Faster RCNN:
Solving...
Process Process-3:
Traceback (most recent call last):
  File "/usr/lib/python2.7/multiprocessing/process.py", line 267, in _bootstrap
    self.run()
  File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "./tools/train_faster_rcnn_alt_opt.py", line 197, in train_fast_rcnn
    max_iters=max_iters)
  File "/home/liguangyao/Programming/caffe-faster-rcnn/tools/../lib/fast_rcnn/train.py", line 161, in train_net
    model_paths = sw.train_model(max_iters)
  File "/home/liguangyao/Programming/caffe-faster-rcnn/tools/../lib/fast_rcnn/train.py", line 102, in train_model
    self.solver.step(1)
  File "/home/liguangyao/Programming/caffe-faster-rcnn/tools/../lib/roi_data_layer/layer.py", line 144, in forward
    blobs = self._get_next_minibatch()
  File "/home/liguangyao/Programming/caffe-faster-rcnn/tools/../lib/roi_data_layer/layer.py", line 63, in _get_next_minibatch
    return get_minibatch(minibatch_db, self._num_classes)
  File "/home/liguangyao/Programming/caffe-faster-rcnn/tools/../lib/roi_data_layer/minibatch.py", line 55, in get_minibatch
    num_classes)
  File "/home/liguangyao/Programming/caffe-faster-rcnn/tools/../lib/roi_data_layer/minibatch.py", line 100, in _sample_rois
    fg_inds, size=fg_rois_per_this_image, replace=False)
  File "mtrand.pyx", line 1192, in mtrand.RandomState.choice
TypeError: 'numpy.float64' object cannot be interpreted as an index
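While training the stage 1 RPN, the error 'numpy.float64' object cannot be interpreted as an index appeared. Almost every blog post says to switch to a different numpy version, but after doing so I got ImportError: numpy.core.multiarray failed to import, which is again caused by a numpy mismatch. This becomes a vicious cycle, so it is better to fix the 'numpy.float64' object cannot be interpreted as an index error at its root.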
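A small reproduction of the root cause, independent of py-faster-rcnn: `np.round` returns a NumPy float even when the value is whole, and Python will not treat a float as an index or a count. The variable names and the values 0.25 and 128 are illustrative assumptions. `operator.index` is the standard-library function that applies the same `__index__` protocol used by indexing and slicing.

```python
import operator
import numpy as np

fg_fraction = 0.25       # illustrative value
rois_per_image = 128     # illustrative value

n_float = np.round(fg_fraction * rois_per_image)
print(type(n_float).__name__)  # a NumPy floating-point type, not an int

try:
    operator.index(n_float)    # same protocol that indexing and slicing use
    float_ok = True
except TypeError:
    float_ok = False           # floats do not implement __index__

n_int = int(n_float)           # an explicit cast restores index semantics
items = list(range(100))
first = items[:n_int]          # slicing works once the bound is an int
```

Casting at the point where the count is computed, rather than at every use, keeps the fix in one place.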
The following files need to be modified:
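1. /home/xxx/py-faster-rcnn/lib/roi_data_layer/minibatch.py
Change line 26 from:
    fg_rois_per_image = np.round(cfg.TRAIN.FG_FRACTION * rois_per_image)
to:
    fg_rois_per_image = np.round(cfg.TRAIN.FG_FRACTION * rois_per_image).astype(np.int)
(np.int is an alias for the built-in int and was removed in NumPy 1.24; on newer versions write .astype(int), which behaves the same. This applies to every .astype(np.int) change in this post.)
Change lines 174 and 175 as follows:
    for ind in inds:
        cls = clss[ind]
        start = int(4 * cls)  # changed
        end = int(start + 4)  # changed
        bbox_targets[ind, start:end] = bbox_target_data[ind, 1:]
        bbox_inside_weights[ind, start:end] = cfg.TRAIN.BBOX_INSIDE_WEIGHTS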
2. /home/xxx/py-faster-rcnn/lib/datasets/ds_utils.py
Change line 12 from:
    hashes = np.round(boxes * scale).dot(v)
to:
    hashes = np.round(boxes * scale).dot(v).astype(np.int)
3. /home/xxx/py-faster-rcnn/lib/fast_rcnn/test.py
Change line 129 from:
    hashes = np.round(blobs['rois'] * cfg.DEDUP_BOXES).dot(v)
to:
    hashes = np.round(blobs['rois'] * cfg.DEDUP_BOXES).dot(v).astype(np.int)
4. /home/xxx/py-faster-rcnn/lib/rpn/proposal_target_layer.py
Change line 60 from:
    fg_rois_per_image = np.round(cfg.TRAIN.FG_FRACTION * rois_per_image)
to:
    fg_rois_per_image = np.round(cfg.TRAIN.FG_FRACTION * rois_per_image).astype(np.int)
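The traceback earlier in this post ends in a `choice` call whose `size` argument was a float. Below is a minimal sketch of that sampling step with the cast applied. `sample_foreground` and its arguments are hypothetical names used for illustration, and the real `_sample_rois` does considerably more than this.

```python
import numpy as np

def sample_foreground(fg_inds, fg_fraction, rois_per_image, rng=None):
    """Pick at most round(fg_fraction * rois_per_image) foreground indices."""
    rng = np.random.RandomState(0) if rng is None else rng
    fg_inds = np.asarray(fg_inds)
    # Cast once, where the count is computed, so every later use sees an int.
    fg_rois_per_image = int(np.round(fg_fraction * rois_per_image))
    # With replace=False we cannot draw more indices than exist.
    n = min(fg_rois_per_image, fg_inds.size)
    if n == 0:
        return fg_inds[:0]
    return rng.choice(fg_inds, size=n, replace=False)

picked = sample_foreground(np.arange(50), fg_fraction=0.25, rois_per_image=128)
print(len(picked))
```

Using the built-in `int` here avoids any dependence on the removed `np.int` alias.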
After fixing the previous problem, if the numpy version has not been changed, another error appears: TypeError: slice indices must be integers or None or have an __index__ method.
5. Modify /home/XXX/py-faster-rcnn/lib/rpn/proposal_target_layer.py and go to line 123:
    for ind in inds:
        cls = clss[ind]
        start = 4 * cls
        end = start + 4
        bbox_targets[ind, start:end] = bbox_target_data[ind, 1:]
        bbox_inside_weights[ind, start:end] = cfg.TRAIN.BBOX_INSIDE_WEIGHTS
    return bbox_targets, bbox_inside_weights
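Here ind, start, and end are NumPy scalars rather than plain Python ints. Because start and end are computed from cls, which appears to be read from an array of float labels, they are not accepted as slice indices and must be cast explicitly. After the cast the code looks like this: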
    for ind in inds:
        ind = int(ind)
        cls = clss[ind]
        start = int(4 * cls)
        end = int(start + 4)
        bbox_targets[ind, start:end] = bbox_target_data[ind, 1:]
        bbox_inside_weights[ind, start:end] = cfg.TRAIN.BBOX_INSIDE_WEIGHTS
    return bbox_targets, bbox_inside_weights
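The corrected loop can be exercised on its own. The sketch below is a standalone version under stated assumptions: the function name `expand_bbox_targets`, the `(N, 5)` layout of `bbox_target_data` (class label in column 0, four regression targets after it), and the constant `BBOX_INSIDE_WEIGHTS` stand in for the surrounding py-faster-rcnn code and `cfg.TRAIN.BBOX_INSIDE_WEIGHTS`.

```python
import numpy as np

# Stand-in for cfg.TRAIN.BBOX_INSIDE_WEIGHTS (assumed value for illustration).
BBOX_INSIDE_WEIGHTS = (1.0, 1.0, 1.0, 1.0)

def expand_bbox_targets(bbox_target_data, num_classes):
    """Spread each row's 4 targets into the slot belonging to its class."""
    clss = bbox_target_data[:, 0]  # labels come out as floats from a float array
    bbox_targets = np.zeros((clss.size, 4 * num_classes), dtype=np.float32)
    bbox_inside_weights = np.zeros(bbox_targets.shape, dtype=np.float32)
    inds = np.where(clss > 0)[0]   # class 0 is treated as background
    for ind in inds:
        cls = int(clss[ind])       # cast before doing index arithmetic
        start = 4 * cls
        end = start + 4
        bbox_targets[ind, start:end] = bbox_target_data[ind, 1:]
        bbox_inside_weights[ind, start:end] = BBOX_INSIDE_WEIGHTS
    return bbox_targets, bbox_inside_weights

data = np.array([[0, 0.0, 0.0, 0.0, 0.0],
                 [2, 0.1, 0.2, 0.3, 0.4]], dtype=np.float32)
targets, weights = expand_bbox_targets(data, num_classes=3)
print(targets.shape)
```

Casting `cls` once makes `start` and `end` plain ints automatically, so the two later casts in the patched loop become unnecessary in this variant.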
Rebuild and run again, and training goes through without errors.
Reference: https://www.cnblogs.com/mengmengmiaomiao/p/9185272.html