Deploying Mask_RCNN in production on a CPU server: engineering diary (part 1)

1 Technical feasibility analysis:

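      Mask_RCNN is one of the best-performing models available today. In my experience it ranks alongside vid2vid from the GAN family and BERT in NLP as one of the most useful models. Today, however, I ran into a problem: the production environment has no GPU, so the question is how to deploy it. Note: training has already succeeded in the training environment, reaching acceptable segmentation quality with only 640 samples. But production runs on Alibaba Cloud, where GPUs are too expensive. So I want to try the CPU: if inference stays under 1000 ms, that saves the company some money, since this application does not have strict real-time requirements.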
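To make the 1000 ms target checkable, here is a minimal timing sketch. The helper `measure_latency_ms` and the stand-in `fake_detect` are hypothetical names introduced only for illustration; in a real test the stand-in would be replaced by the actual inference call, and the number of runs is an arbitrary choice.

```python
import statistics
import time

BUDGET_MS = 1000.0  # latency target stated above


def measure_latency_ms(fn, runs=5, warmup=1):
    """Call fn repeatedly and return per-call wall-clock latencies in ms."""
    for _ in range(warmup):
        fn()  # warm-up calls are not timed (first call is often slower)
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def fake_detect():
    """Hypothetical stand-in for the real inference call."""
    time.sleep(0.01)


samples = measure_latency_ms(fake_detect)
median_ms = statistics.median(samples)
within_budget = median_ms < BUDGET_MS
print("median latency: %.1f ms, within budget: %s" % (median_ms, within_budget))
```

Using the median rather than a single run reduces the influence of one-off slow calls on a shared cloud machine.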

      1.1 Log of setting up the production environment:

     Downloaded conda and added conda.sh to .bashrc

       conda create -n py3cpu python=3.6.2

       pip install  numpy scipy Pillow cython matplotlib scikit-image keras==2.0.8   h5py  IPython
       pip install opencv-python imgaug

       pip install tensorflow==1.4.0

conda and pip work together reasonably well.

       Ran the model in inference mode. Sure enough, it failed:

Processing 1 images
image                    shape: (512, 512, 3)         min:    0.00000  max:  255.00000  uint8
molded_images            shape: (1, 512, 512, 3)      min: -123.70000  max:  151.10000  float64
image_metas              shape: (1, 15)               min:    0.00000  max:  512.00000  int64
anchors                  shape: (1, 65280, 4)         min:   -0.17712  max:    1.11450  float32
Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/envs/py3cpu/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1323, in _do_call
    return fn(*args)
  File "/home/ubuntu/anaconda3/envs/py3cpu/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1302, in _run_fn
    status, run_metadata)
  File "/home/ubuntu/anaconda3/envs/py3cpu/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 473, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[1] = 65343 is not in [0, 65280)
     [[Node: ROI/Gather_2 = Gather[Tindices=DT_INT32, Tparams=DT_FLOAT, validate_indices=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](ROI/strided_slice_6, ROI/strided_slice_7)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ubuntu/Documents/Mask_RCNN-master/samples/dish_food/test_model.py", line 118, in <module>
    ma.test(start_id=0, stop=632)
  File "/home/ubuntu/Documents/Mask_RCNN-master/samples/dish_food/test_model.py", line 90, in test
    results = model.detect(patch_resized_images, verbose=1)
  File "/home/ubuntu/Documents/Mask_RCNN-master/mrcnn/model.py", line 2479, in detect
    self.keras_model.predict([molded_images, image_metas, anchors], verbose=0)
  File "/home/ubuntu/anaconda3/envs/py3cpu/lib/python3.6/site-packages/keras/engine/training.py", line 1713, in predict
    verbose=verbose, steps=steps)
  File "/home/ubuntu/anaconda3/envs/py3cpu/lib/python3.6/site-packages/keras/engine/training.py", line 1269, in _predict_loop
    batch_outs = f(ins_batch)
  File "/home/ubuntu/anaconda3/envs/py3cpu/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2273, in __call__
    **self.session_kwargs)
  File "/home/ubuntu/anaconda3/envs/py3cpu/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 889, in run
    run_metadata_ptr)
  File "/home/ubuntu/anaconda3/envs/py3cpu/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1120, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/ubuntu/anaconda3/envs/py3cpu/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1317, in _do_run
    options, run_metadata)
  File "/home/ubuntu/anaconda3/envs/py3cpu/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1336, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[1] = 65343 is not in [0, 65280)
     [[Node: ROI/Gather_2 = Gather[Tindices=DT_INT32, Tparams=DT_FLOAT, validate_indices=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](ROI/strided_slice_6, ROI/strided_slice_7)]]

Caused by op 'ROI/Gather_2', defined at:
  File "/home/ubuntu/Documents/Mask_RCNN-master/samples/dish_food/test_model.py", line 29, in <module>
    model_dir=MODEL_DIR)
  File "/home/ubuntu/Documents/Mask_RCNN-master/mrcnn/model.py", line 1824, in __init__
    self.keras_model = self.build(mode=mode, config=config)
  File "/home/ubuntu/Documents/Mask_RCNN-master/mrcnn/model.py", line 1948, in build
    config=config)([rpn_class, rpn_bbox, anchors])
  File "/home/ubuntu/anaconda3/envs/py3cpu/lib/python3.6/site-packages/keras/engine/topology.py", line 602, in __call__
    output = self.call(inputs, **kwargs)
  File "/home/ubuntu/Documents/Mask_RCNN-master/mrcnn/model.py", line 294, in call
    names=["pre_nms_anchors"])
  File "/home/ubuntu/Documents/Mask_RCNN-master/mrcnn/utils.py", line 826, in batch_slice
    output_slice = graph_fn(*inputs_slice)
  File "/home/ubuntu/Documents/Mask_RCNN-master/mrcnn/model.py", line 292, in <lambda>
    pre_nms_anchors = utils.batch_slice([anchors, ix], lambda a, x: tf.gather(a, x),
  File "/home/ubuntu/anaconda3/envs/py3cpu/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 2486, in gather
    params, indices, validate_indices=validate_indices, name=name)
  File "/home/ubuntu/anaconda3/envs/py3cpu/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 1834, in gather
    validate_indices=validate_indices, name=name)
  File "/home/ubuntu/anaconda3/envs/py3cpu/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/ubuntu/anaconda3/envs/py3cpu/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2956, in create_op
    op_def=op_def)
  File "/home/ubuntu/anaconda3/envs/py3cpu/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1470, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): indices[1] = 65343 is not in [0, 65280)
     [[Node: ROI/Gather_2 = Gather[Tindices=DT_INT32, Tparams=DT_FLOAT, validate_indices=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](ROI/strided_slice_6, ROI/strided_slice_7)]]

      

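 The error comes from the tensor built with names=["pre_nms_anchors"]. According to the API docs, tf.gather builds a new tensor by picking the elements of an array at the given indices. But the indices of the top 6000 scores should fall within the same range as the anchor indices, and here index 65343 lies outside [0, 65280).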
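The failure mode can be reproduced in miniature with NumPy, as a rough analogue of the gather step rather than the actual Mask_RCNN code. The sketch assumes the scores array has more entries than the anchors array, which is one way a top-k index can end up out of range; the array sizes and the helper `top_k_indices` are made up for illustration.

```python
import numpy as np


def top_k_indices(scores, k):
    """Return indices of the k highest scores, highest first."""
    k = min(k, scores.shape[0])
    return np.argsort(scores)[::-1][:k]


rng = np.random.default_rng(0)
anchors = rng.random((8, 4))      # 8 anchors, 4 box coordinates each

# Case 1: scores and anchors have the same length, so every index is valid.
scores_ok = rng.random(8)
picked = anchors[top_k_indices(scores_ok, 6)]

# Case 2: scores has 10 entries but there are only 8 anchors.
scores_bad = rng.random(10)
scores_bad[9] = 2.0               # force index 9 into the top k
ix_bad = top_k_indices(scores_bad, 6)
try:
    anchors[ix_bad]
    out_of_range = False
except IndexError:
    out_of_range = True           # same kind of failure as the Gather op

print(picked.shape, ix_bad.max(), out_of_range)
```

If the real tensors behave like case 2, comparing the length of the scores tensor with the number of anchors is a quick way to confirm it.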

  The tensors need further debugging.
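One arithmetic check may help narrow this down. The sketch below assumes the commonly used Matterport Mask_RCNN defaults (feature strides 4, 8, 16, 32, 64, an anchor stride of 1, and 3 anchor ratios per location); the configuration actually used here is not shown in this post, so treat these values as assumptions. The helper `anchor_count` is a hypothetical name.

```python
import math


def anchor_count(image_size, strides, anchors_per_location=3):
    """Count anchors over square feature maps, one map per stride."""
    total = 0
    for stride in strides:
        side = math.ceil(image_size / stride)   # feature map side length
        total += side * side * anchors_per_location
    return total


five_levels = anchor_count(512, [4, 8, 16, 32, 64])  # assumed default strides
four_levels = anchor_count(512, [4, 8, 16, 32])      # one pyramid level fewer

print(five_levels)  # 65472
print(four_levels)  # 65280, the anchors shape reported in the log
print(four_levels <= 65343 < five_levels)  # True
```

Under these assumptions, the 65280 anchors in the log match four pyramid levels, while the failing index 65343 would be valid for five. That might mean the anchors were generated for one level fewer than the RPN scores cover, for example if the configured list of anchor scales has only four entries. This is a guess to verify against the actual config, not a confirmed diagnosis.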

 

 

