py-faster-rcnn: fixing numpy incompatibility problems

Article link: https://blog.csdn.net/zhaoluruoyan89/article/details/79756288

I. numpy incompatibility

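When I installed py-faster-rcnn, the installed numpy version did not match what the code expects, which caused all kinds of incompatibilities. I am using anaconda2, and changing the numpy version there would break other packages; cv2, for example, requires a numpy version newer than 1.11.0. So the better option is to modify the source code directly.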

The places that need to be changed are:

1. /home/xxxxx/py-faster-rcnn/lib/roi_data_layer/minibatch.py

Change line 26 from: fg_rois_per_image = np.round(cfg.TRAIN.FG_FRACTION * rois_per_image)
to: fg_rois_per_image = np.round(cfg.TRAIN.FG_FRACTION * rois_per_image).astype(np.int)

Note: .astype(np.int) was added.

2. /home/xxxxx/py-faster-rcnn/lib/datasets/ds_utils.py

Change line 12 from: hashes = np.round(boxes * scale).dot(v)
to: hashes = np.round(boxes * scale).dot(v).astype(np.int)

Note: .astype(np.int) was added.

3. /home/xxxxx/py-faster-rcnn/lib/fast_rcnn/test.py

Change line 129 from: hashes = np.round(blobs['rois'] * cfg.DEDUP_BOXES).dot(v)
to: hashes = np.round(blobs['rois'] * cfg.DEDUP_BOXES).dot(v).astype(np.int)

Note: .astype(np.int) was added.

4. /home/xxx/py-faster-rcnn/lib/rpn/proposal_target_layer.py

Change line 60 from: fg_rois_per_image = np.round(cfg.TRAIN.FG_FRACTION * rois_per_image)
to: fg_rois_per_image = np.round(cfg.TRAIN.FG_FRACTION * rois_per_image).astype(np.int)

Note: .astype(np.int) was added.
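
The four edits above share one cause: np.round returns a floating-point value even when the result is a whole number, and recent numpy versions reject floats where an integer count or index is expected. The snippet below is a minimal sketch of that behavior, not code from py-faster-rcnn; FG_FRACTION and rois_per_image are stand-in values chosen for illustration. It passes the builtin int as the dtype because the np.int alias was removed in NumPy 1.24, where .astype(int) is the equivalent spelling.

```python
import numpy as np

# Stand-in values for illustration only (assumptions, not read from any config).
FG_FRACTION = 0.25
rois_per_image = 128

# np.round keeps the floating-point type: this is 32.0, not 32.
fg_float = np.round(FG_FRACTION * rois_per_image)

# The fix from the steps above, written with the builtin int as dtype.
fg_int = np.round(FG_FRACTION * rois_per_image).astype(int)

candidates = np.arange(100)

# Using the float as a slice bound is rejected by recent numpy versions.
try:
    candidates[:fg_float]
    float_slice_ok = True
except TypeError:
    float_slice_ok = False

# The integer version works as a slice bound.
picked = candidates[:fg_int]
print(type(fg_float).__name__, type(fg_int).__name__, len(picked), float_slice_ok)
```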

5. /home/lzx/py-faster-rcnn/lib/rpn/proposal_target_layer.py, line 123:

    for ind in inds:
        cls = clss[ind]
        start = 4 * cls
        end = start + 4
        bbox_targets[ind, start:end] = bbox_target_data[ind, 1:]
        bbox_inside_weights[ind, start:end] = cfg.TRAIN.BBOX_INSIDE_WEIGHTS
    return bbox_targets, bbox_inside_weights

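Here cls is read from a floating-point array, so start and end are numpy floating-point values, and numpy refuses to use such values as indices. They have to be converted explicitly, which gives: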

    for ind in inds:
        ind = int(ind)
        cls = clss[ind]
        start = int(4 * cls)
        end = int(start + 4)
        bbox_targets[ind, start:end] = bbox_target_data[ind, 1:]
        bbox_inside_weights[ind, start:end] = cfg.TRAIN.BBOX_INSIDE_WEIGHTS
    return bbox_targets, bbox_inside_weights

That takes care of it.
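
To try the converted loop from step 5 outside of the training code, here is a self-contained sketch. The function name expand_bbox_targets, the num_classes parameter, and the default inside_weights are hypothetical stand-ins for the surrounding py-faster-rcnn code and cfg.TRAIN.BBOX_INSIDE_WEIGHTS; only the loop body mirrors the fix above.

```python
import numpy as np

def expand_bbox_targets(bbox_target_data, num_classes,
                        inside_weights=(1.0, 1.0, 1.0, 1.0)):
    """Spread (class, tx, ty, tw, th) rows into per-class target columns.

    Hypothetical stand-in for the function that contains the loop above.
    """
    clss = bbox_target_data[:, 0]  # class labels stored as floats
    bbox_targets = np.zeros((clss.size, 4 * num_classes), dtype=np.float32)
    bbox_inside_weights = np.zeros(bbox_targets.shape, dtype=np.float32)
    inds = np.where(clss > 0)[0]   # rows that belong to a foreground class
    for ind in inds:
        cls = int(clss[ind])       # cast once so start and end are plain ints
        start = 4 * cls
        end = start + 4
        bbox_targets[ind, start:end] = bbox_target_data[ind, 1:]
        bbox_inside_weights[ind, start:end] = inside_weights
    return bbox_targets, bbox_inside_weights

# One background row (class 0) and one foreground row (class 2).
data = np.array([[0, 0.1, 0.2, 0.3, 0.4],
                 [2, 0.5, 0.6, 0.7, 0.8]], dtype=np.float32)
targets, weights = expand_bbox_targets(data, num_classes=3)
print(targets[1, 8:12], weights[1, 8:12])
```

Casting cls once is enough here, because 4 * cls and start + 4 are then ordinary Python integers.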

II. Common problems

Error 1: calling append_flipped_images fails with: assert (boxes[:, 2] >= boxes[:, 0]).all()

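According to what I found online, this is mainly caused by annotation errors in your own dataset. With a custom dataset an x coordinate may be 0, whereas pascal_voc annotations count from 1. The faster rcnn code therefore converts to 0-based form by subtracting 1 from Xmin, Xmax, Ymin, and Ymax, which can overflow: if x = 0, subtracting 1 wraps around to 65535. Worse, some annotations have negative coordinates or coordinates outside the image. The main fixes are: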

(1) Modify lib/datasets/imdb.py and insert the following after boxes[:, 2] = widths[i] - oldx1 - 1:

for b in range(len(boxes)):
    if boxes[b][2]< boxes[b][0]:
        boxes[b][0] = 0

This only treats the symptom, and it assumes that only boxes[b][0] can overflow; as I found out later, boxes[b][2] can overflow too. Not recommended.

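(2) Modify the _load_pascal_annotation function in lib/datasets/pascal_voc.py, which reads annotation files in pascal_voc format, and remove every -1 in the lines below. pascal_voc annotations are 1-based, so the -1 converts them to 0-based; if our annotations are already 0-based, subtracting 1 again can overflow, so it has to go. If the 0-based issue is the only problem (no negative or out-of-image coordinates), this should fix it.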

x1 = float(bbox.find('xmin').text)#-1
y1 = float(bbox.find('ymin').text)#-1
x2 = float(bbox.find('xmax').text)#-1
y2 = float(bbox.find('ymax').text)#-1
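
To see why the extra -1 is harmful for 0-based annotations, the sketch below reproduces the wrap-around on a single made-up box. It assumes the boxes are stored as unsigned 16-bit integers, which is consistent with the value 65535 mentioned above; the box coordinates and the image width are illustrative. The flip is computed in a signed 64-bit copy so that the arithmetic is easy to follow.

```python
import numpy as np

# One made-up 0-based box (x1, y1, x2, y2), stored as unsigned 16-bit ints.
boxes = np.array([[0, 10, 100, 50]], dtype=np.uint16)

# Subtracting 1 again wraps x1 = 0 around to 65535.
shifted = boxes - np.uint16(1)

# Horizontal flip as in append_flipped_images, done on a signed copy.
width = 1044  # illustrative image width
signed = shifted.astype(np.int64)
old_x1 = signed[:, 0].copy()
old_x2 = signed[:, 2].copy()
flipped = signed.copy()
flipped[:, 0] = width - old_x2 - 1
flipped[:, 2] = width - old_x1 - 1

# This is the condition checked by the failing assert.
check = bool((flipped[:, 2] >= flipped[:, 0]).all())
print(shifted[0], flipped[0], check)
```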

(3) Annotation rectangles out of bounds

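I applied both steps above and still got the error when running "stage 1 RPN, init from ImageNet Model". So the cause might not be just x = 0; the annotations themselves might be wrong, for example a ground-truth box with x1 < 0 or x2 > imageWidth. I decided to first find out which image was at fault. In lib/datasets/imdb.py, before the line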

assert (boxes[:, 2] >= boxes[:, 0]).all()

add:

print self.image_index[i]

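to print the name of the image currently being processed. After running, the last image name printed before the error is the problematic image; check whether its annotation contains an out-of-bounds rectangle. It turned out that one object did have x1 annotated as -2.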
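
Instead of waiting for the assert to fire, the annotation files can also be checked up front. The following is a rough sketch under the assumption that the annotations follow the usual pascal_voc XML layout (size/width, size/height, object/bndbox); the function name find_bad_boxes and the one_based switch are hypothetical, and the sample XML is made up.

```python
import xml.etree.ElementTree as ET

def find_bad_boxes(xml_text, one_based=True):
    """Return (name, x1, y1, x2, y2) for every box outside the image."""
    root = ET.fromstring(xml_text)
    size = root.find('size')
    width = int(size.find('width').text)
    height = int(size.find('height').text)
    lo = 1 if one_based else 0                 # smallest valid coordinate
    max_x = width if one_based else width - 1  # largest valid x
    max_y = height if one_based else height - 1
    bad = []
    for obj in root.findall('object'):
        bbox = obj.find('bndbox')
        x1, y1, x2, y2 = (float(bbox.find(tag).text)
                          for tag in ('xmin', 'ymin', 'xmax', 'ymax'))
        out_of_image = x1 < lo or y1 < lo or x2 > max_x or y2 > max_y
        inverted = x2 < x1 or y2 < y1
        if out_of_image or inverted:
            bad.append((obj.find('name').text, x1, y1, x2, y2))
    return bad

# Made-up annotation: the second object has xmin = -2.
sample = """
<annotation>
  <size><width>1044</width><height>600</height></size>
  <object><name>car</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>200</xmax><ymax>300</ymax></bndbox>
  </object>
  <object><name>person</name>
    <bndbox><xmin>-2</xmin><ymin>5</ymin><xmax>50</xmax><ymax>90</ymax></bndbox>
  </object>
</annotation>
"""
bad_boxes = find_bad_boxes(sample)
print(bad_boxes)
```

Running such a check over every file in the Annotations directory would list all suspicious boxes in one pass.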

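After correcting that annotation, just when I thought I was finally done, the error appeared again... I gritted my teeth and told myself to be patient. This time the error occurred in the "Stage 1 Fast R-CNN using RPN proposals, init from ImageNet model" stage, which means that append_flipped_images was now processing the proposals produced by the rpn rather than the ground truth from the annotation files. That made no sense: if the ground truth is fine, how can the proposals overflow? The conclusion: I had not deleted the cache! Delete all files in py-faster-rcnn/data/cache and in py-faster-rcnn/data/VOCdevkit2007/annotations_cache. It was a blog post that pointed me to this. Before that I had spent quite some effort hunting for annotation errors. If you only want to solve the problem there is no need to read further, but as a record of how I analyzed it: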

First I decided to find out which proposal was at fault. Again I started with which image was affected. In lib/datasets/imdb.py, before the line

assert (boxes[:, 2] >= boxes[:, 0]).all()

add:

print ("num_image:%d"%(i))

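Then run it to print each image's index in the training set (the image name is not needed this time) and find the last index printed before the error. In my case that index was 320. The next step is to check whether all proposals on that image look normal, so, again before the assert statement, insert: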

if i == 320:
    print self.image_index[i]
    for z in xrange(len(boxes)):
        print ('x2:%d x1:%d' % (boxes[z][2], boxes[z][0]))
        if boxes[z][2] < boxes[z][0]:
            print "here is the bad point!!!"

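Running again and checking the log, "here is the bad point!!!" appeared after the pair "x2=-64491 x1=1011". My image width is 1044, and 1044 - 65535 = -64491, so it is actually x2 that went out of range. Since boxes[:, 2] = widths[i] - oldx1 - 1, this means that oldx1 before flipping was the overflowed value 65534. Why would a proposal produced by the rpn overflow as well? Normally rpn proposals never exceed the image bounds, unless the annotated ground truth itself does! And if the ground truth had a problem, the "stage 1 RPN, init from ImageNet Model" stage should already have failed, so it must be a cache problem.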
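
Since stale cache files were the real culprit, it can help to script the cleanup. The sketch below deletes the files in the two cache directories named earlier, relative to a repository root; the function name clear_cache_dirs is hypothetical, and the demo runs against a temporary directory with a made-up cache file name rather than a real checkout.

```python
import os
import tempfile

def clear_cache_dirs(repo_root,
                     rel_dirs=(os.path.join("data", "cache"),
                               os.path.join("data", "VOCdevkit2007",
                                            "annotations_cache"))):
    """Delete the regular files in each cache directory; return their paths."""
    removed = []
    for rel in rel_dirs:
        cache_dir = os.path.join(repo_root, rel)
        if not os.path.isdir(cache_dir):
            continue  # nothing cached there yet
        for name in os.listdir(cache_dir):
            path = os.path.join(cache_dir, name)
            if os.path.isfile(path):
                os.remove(path)
                removed.append(path)
    return removed

# Demo on a throwaway directory tree (the file name is made up).
demo_root = tempfile.mkdtemp()
demo_cache = os.path.join(demo_root, "data", "cache")
os.makedirs(demo_cache)
open(os.path.join(demo_cache, "example_roidb.pkl"), "w").close()

removed = clear_cache_dirs(demo_root)
print(removed)
```

For a real checkout, pass the path of the py-faster-rcnn directory as repo_root before restarting training.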

Error 3: pb2.text_format(...) fails with 'module' object has no attribute 'text_format'.

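Fix: add import google.protobuf.text_format to ./lib/fast_rcnn/train.py. Some people online suggest rolling protobuf back to version 2.5.0, but that makes the caffe build hit a new problem, "cannot import name symbol database", and you then have to download the missing file from github, so I do not recommend it.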
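
The reason the extra import helps is that a Python submodule only becomes an attribute of its package once it has been imported somewhere. The sketch below illustrates this under the assumption that train.py refers to the package through an alias named pb2; it guards the import so that it also runs on a machine without protobuf installed.

```python
try:
    import google.protobuf as pb2
    # Importing the submodule explicitly binds it as pb2.text_format.
    import google.protobuf.text_format
    has_text_format = hasattr(pb2, "text_format")
except ImportError:
    # protobuf is not installed in this environment.
    pb2 = None
    has_text_format = False

print(has_text_format)
```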