The big FCIS + MXNet pitfall (finally working)

1 CUDA

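As far as I could find, MXNet supports CUDA 10 (cu10) at most.
An RTX 4090 needs at least CUDA 11 (compute_89 targets arrived with CUDA 11.8), so the GPU build cannot be used; it fails with an error about compute_89.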
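The mismatch above can be written down as a small lookup. This is only a sketch: the table lists a few well-known compute capabilities with the first CUDA toolkit release that can target them, and the names `MIN_CUDA_FOR_ARCH` and `toolkit_supports` are made up for illustration.

```python
# First CUDA toolkit release able to compile for each compute capability.
# Illustrative subset only; extend it from NVIDIA's release notes as needed.
MIN_CUDA_FOR_ARCH = {
    (6, 1): (8, 0),    # GTX 1080, Titan Xp (Pascal)
    (7, 0): (9, 0),    # V100 (Volta)
    (7, 5): (10, 0),   # RTX 20 series (Turing)
    (8, 6): (11, 1),   # RTX 30 series (Ampere)
    (8, 9): (11, 8),   # RTX 4090 (Ada), the compute_89 from the error
}


def toolkit_supports(cuda_version, compute_capability):
    """Return True if this CUDA toolkit version can target the given GPU."""
    return cuda_version >= MIN_CUDA_FOR_ARCH[compute_capability]


print(toolkit_supports((10, 0), (8, 9)))  # RTX 4090 with CUDA 10 -> False
print(toolkit_supports((8, 0), (6, 1)))   # GTX 1080 with CUDA 8  -> True
```

Versions are compared as tuples, so (10, 0) sorts below (11, 8) without any string parsing.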

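After copying the FCIS supplementary operator files into the source tree,

I built the CPU version (mxnet master). The build completed and I installed it with Python 3: under Python 3.7 the mxnet library installed and could initialize a matrix. It cannot be installed under Python 2 (Python 3 versions below 3.7, and 3.8, did not work either):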

SyntaxError: invalid syntax

  File "/usr/local/lib/python2.7/dist-packages/mxnet-2.0.0-py2.7.egg/mxnet/symbol/symbol.py", line 80
    return f'<{self.__class__.__name__} group [{name}]>'
                                                       ^
.......

Unsupported Python version
==========================
This version of Requests requires at least Python 3.8, but
you're trying to install it on Python 2.7. To resolve this,
consider upgrading to a supported Python version.

If you can't upgrade your Python version, you'll need to
pin to an older version of Requests (<2.32.0).
error: Setup script exited with 1

So the only option was to install MXNet under Python 3
and port the FCIS code from Python 2 to Python 3. But running fcis/demo.py then fails with:

$ python fcis/demo.py
Traceback (most recent call last):
  File "fcis/demo.py", line 30, in <module>
    from core.tester import im_detect, Predictor
  File "/home/FCISpy3/fcis/core/tester.py", line 24, in <module>
    from nms.nms import py_nms_wrapper
  File "/home/FCISpy3/fcis/../lib/nms/nms.py", line 3, in <module>
    from cpu_nms import cpu_nms
ImportError: dynamic module does not define module export function (PyInit_cpu_nms)

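I could not solve the PyInit_cpu_nms problem; these issues have no answer:
https://github.com/msracver/FCIS/issues/66
https://github.com/msracver/FCIS/issues/154
Dead end.
So MXNet and FCIS do not work under Python 3 either; do I still have to use Python 2?

But with old MXNet versions, even cmake does not get through...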
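One plausible cause of the ImportError above (not verified in this setup) is that the cpu_nms extension was compiled by a Python 2 toolchain and then imported from Python 3. The two interpreters look for differently named init functions in a compiled module, which the hypothetical helper below spells out.

```python
def expected_init_symbol(module_name, python_major):
    """Name of the init function CPython looks up in a compiled extension."""
    if python_major >= 3:
        return "PyInit_" + module_name   # Python 3 convention
    return "init" + module_name          # Python 2 convention


print(expected_init_symbol("cpu_nms", 3))  # PyInit_cpu_nms
print(expected_init_symbol("cpu_nms", 2))  # initcpu_nms
```

If that is the cause, rebuilding the extensions with the Python 3 interpreter and its Cython would be the thing to try; whether the rest of FCIS then runs under Python 3 is a separate question.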

2

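FCIS uses Python 2.
MXNet > 1.6 supports only Python 3.
MXNet <= 1.6 supports Python 2.
So an old MXNet version has to be downloaded. The 998378a commit from the demo instructions did not work for me at this point; it seemed to have a bug.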
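The version rule above, as a minimal sketch. The function name `supports_python2` is hypothetical, and the 1.6 cutoff is simply the rule stated in this post.

```python
def supports_python2(mxnet_version):
    """Apply the rule above: MXNet releases up to 1.6 still run on Python 2."""
    # Compare only major.minor so that "1.6.0" counts as 1.6.
    major_minor = tuple(int(part) for part in mxnet_version.split(".")[:2])
    return major_minor <= (1, 6)


for version in ("0.10.1", "1.0.0", "1.6.0", "2.0.0"):
    print("{}: Python 2 {}".format(
        version, "ok" if supports_python2(version) else "not supported"))
```

The snippet avoids f-strings on purpose, since those are exactly what breaks under Python 2.7 in the SyntaxError shown earlier.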

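I installed and ran everything following the official README, but the demo failed with:
AttributeError: 'module' object has no attribute 'ChannelOperator'

I did not know which MXNet version to use.
Tutorial: https://blog.csdn.net/xiangxianghehe/article/details/78971383
It uses MXNet 0.10.1 or 1.0.0.
But with the current tagged versions cmake fails with this error: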

CMake Error at CMakeLists.txt:165 (include):
  include could not find requested file:

    mshadow/cmake/Utils.cmake

The submodules are not populated;
the repository has to be cloned with git clone --recursive.

CMake Error at CMakeLists.txt:589 (add_library):

  No SOURCES given to target: mxnet

Found this: https://github.com/apache/mxnet/issues/10708
A cmake version no newer than 3.10 is needed; 3.11 and above fail with this error.
So I installed 3.10.0.
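A quick way to check the installed cmake against that limit, as a sketch. The helper names are hypothetical, and the 3.10 ceiling is the one reported in this post, not an official MXNet requirement.

```python
import re


def parse_cmake_version(output):
    """Extract (major, minor, patch) from the text of `cmake --version`."""
    match = re.search(r"cmake version (\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError("unrecognized cmake --version output")
    return tuple(int(group) for group in match.groups())


def cmake_ok_for_old_mxnet(output, newest_allowed=(3, 10)):
    """True if the cmake major.minor is at or below the allowed ceiling."""
    return parse_cmake_version(output)[:2] <= newest_allowed


print(cmake_ok_for_old_mxnet("cmake version 3.10.0"))  # True
print(cmake_ok_for_old_mxnet("cmake version 3.22.1"))  # False
```

On a real machine the input string would come from `subprocess.check_output(["cmake", "--version"]).decode()`.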

3

3.1
mxnet version = 1.0.0

$ git clone --recursive https://github.com/dmlc/mxnet.git
$ cd mxnet
$ git checkout 25720d0
$ git submodule init
$ git submodule update

3.2
Copy the FCIS operator files into the MXNet source tree
3.3

$ make -j $(nproc) USE_OPENCV=1 USE_BLAS=openblas

Error and fix (from https://blog.csdn.net/tcjy1000/article/details/134042714):

src/operator/tensor/elemwise_binary_broadcast_op_logic.cc:141:1: internal compiler error: 段错误 (Segmentation fault)
  141 | }  // namespace mxnet
      | ^

$ ulimit -a
$ ulimit -n 65535

3.4 The parallel build is buggy; use fewer CPU cores

$ make -j4 USE_OPENCV=1 USE_BLAS=openblas
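If the right job count is not known in advance, one option is to retry with progressively fewer jobs. The sketch below only produces the sequence of counts to try; `job_counts` is a made-up helper, and halving is an arbitrary choice.

```python
def job_counts(cpu_count, floor=1):
    """Job counts to try for `make -jN`, halving from cpu_count down to floor."""
    counts = []
    jobs = cpu_count
    while True:
        counts.append(jobs)
        if jobs <= floor:
            break
        jobs = max(floor, jobs // 2)
    return counts


print(job_counts(16))  # [16, 8, 4, 2, 1]
```

Each count could then be passed to `subprocess.call(["make", "-j{}".format(n), "USE_OPENCV=1", "USE_BLAS=openblas"])`, stopping at the first run that returns 0.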

OK, the build succeeds.
3.5
Install the Python package:
$ sudo python setup.py install
Error:

==========================
Unsupported Python version
==========================
This version of Requests requires at least Python 3.8, but
you're trying to install it on Python 2.7. To resolve this,
consider upgrading to a supported Python version.

If you can't upgrade your Python version, you'll need to
pin to an older version of Requests (<2.32.0).
error: Setup script exited with 1

Not a big problem: I skipped the installation and used the mxnet/python path directly.
3.6
$ python fcis/demo.py

result:

ImportError: libcudart.so.12: cannot open shared object file: No such file or directory

3.8
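The Makefile cannot be used directly because there is no (matching) CUDA version, so I regenerated the build files with cmake
in a new build directory.
Same problem: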

Traceback (most recent call last):
  File "fcis/demo.py", line 30, in <module>
    from core.tester import im_detect, Predictor
  File "/home/wys/work/farmland/dl/autooutlining/FCIS/fcis/core/tester.py", line 23, in <module>
    from nms.nms import py_nms_wrapper
  File "/home/wys/work/farmland/dl/autooutlining/FCIS/fcis/../lib/nms/nms.py", line 4, in <module>
    from gpu_nms import gpu_nms
ImportError: libcudart.so.12: cannot open shared object file: No such file or directory

MXNet has a CPU version,
but does FCIS really require a GPU?
demo.py needs the gpu_nms library, and without CUDA that is a dead end.
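Before running the demo, it can help to ask the dynamic loader directly whether a given CUDA runtime library is available. This is a sketch with a hypothetical helper `can_load`; the library names in the loop are just the CUDA 12 name from the traceback and, as an assumption, the CUDA 8 runtime name.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can find and open the library."""
    try:
        ctypes.CDLL(library_name)
        return True
    except OSError:
        return False


# libcudart.so.12 is the name from the ImportError above;
# libcudart.so.8.0 is assumed to be the CUDA 8 runtime name.
for name in ("libcudart.so.8.0", "libcudart.so.12"):
    print("{}: {}".format(name, "found" if can_load(name) else "missing"))
```

A "missing" result means the library is either not installed or not on the loader path (for example LD_LIBRARY_PATH), which matches the "cannot open shared object file" message.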

4
In a conda Python 2 environment:
pip install Cython==0.27.3
pip install opencv-python==3.4.0.14   (3.2.0.6 was the other version noted)
pip install easydict==1.6
pip install hickle==3.4.9
git clone https://github.com/msracver/FCIS.git
cd FCIS
sh ./init.sh

Following the official demo instructions,
the MXNet version used is:

git clone --recursive https://github.com/dmlc/mxnet.git
cd mxnet
git checkout 998378a
git submodule init
git submodule update
cd ..
cp -r FCIS/fcis/operator_cxx/* mxnet/src/operator/contrib/
cd mxnet
# CPU-only build:
make -j4 USE_OPENCV=1 USE_BLAS=openblas
# GPU build (needs CUDA and cuDNN), either with all cores or with 4 jobs:
make -j $(nproc) USE_OPENCV=1 USE_BLAS=openblas USE_CUDA=1 USE_CUDA_PATH=/usr/local/cuda USE_CUDNN=1
make -j4 USE_OPENCV=1 USE_BLAS=openblas USE_CUDA=1 USE_CUDA_PATH=/usr/local/cuda USE_CUDNN=1

I did not install the MXNet Python package system-wide.
Instead I modified fcis/demo.py
and added:

import sys
sys.path.append('/home/wys/work/farmland/dl/autooutlining/mxnet/python')

Then, inside the FCIS directory, with the conda Python 2 environment active:

$ python fcis/demo.py

Result:

 (outlining27) wys@wys-PC:~/work/farmland/dl/autooutlining/FCIS$ python fcis/demo.py
/home/wys/work/farmland/dl/autooutlining/FCIS/fcis/config/config.py:173: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  exp_config = edict(yaml.load(f))
use mxnet at /home/wys/work/farmland/dl/autooutlining/mxnet/python/mxnet/__init__.pyc
{'BINARY_THRESH': 0.4,
 'CLASS_AGNOSTIC': True,
 'MASK_SIZE': 21,
 'MXNET_VERSION': 'mxnet',
 'SCALES': [(600, 1000)],
 'TEST': {'BATCH_IMAGES': 1,
          'CXX_PROPOSAL': False,
          'HAS_RPN': True,
          'ITER': 2,
          'MASK_MERGE_THRESH': 0.5,
          'MIN_DROP_SIZE': 2,
          'NMS': 0.3,
          'PROPOSAL_MIN_SIZE': 2,
          'PROPOSAL_NMS_THRESH': 0.7,
          'PROPOSAL_POST_NMS_TOP_N': 2000,
          'PROPOSAL_PRE_NMS_TOP_N': 20000,
          'RPN_MIN_SIZE': 2,
          'RPN_NMS_THRESH': 0.7,
          'RPN_POST_NMS_TOP_N': 300,
          'RPN_PRE_NMS_TOP_N': 6000,
          'USE_GPU_MASK_MERGE': True,
          'USE_MASK_MERGE': True,
          'test_epoch': 8},
 'TRAIN': {'ASPECT_GROUPING': True,
           'BATCH_IMAGES': 1,
           'BATCH_ROIS': -1,
           'BATCH_ROIS_OHEM': 128,
           'BBOX_MEANS': [0.0, 0.0, 0.0, 0.0],
           'BBOX_NORMALIZATION_PRECOMPUTED': True,
           'BBOX_REGRESSION_THRESH': 0.5,
           'BBOX_STDS': [0.2, 0.2, 0.5, 0.5],
           'BBOX_WEIGHTS': array([1., 1., 1., 1.]),
           'BG_THRESH_HI': 0.5,
           'BG_THRESH_LO': 0,
           'BINARY_THRESH': 0.4,
           'CONVNEW3': True,
           'CXX_PROPOSAL': False,
           'ENABLE_OHEM': True,
           'END2END': True,
           'FG_FRACTION': 0.25,
           'FG_THRESH': 0.5,
           'FLIP': True,
           'GAP_SELECT_FROM_ALL': False,
           'IGNORE_GAP': False,
           'LOSS_WEIGHT': [1.0, 10.0, 1.0],
           'RESUME': False,
           'RPN_ALLOWED_BORDER': 0,
           'RPN_BATCH_SIZE': 256,
           'RPN_BBOX_WEIGHTS': [1.0, 1.0, 1.0, 1.0],
           'RPN_CLOBBER_POSITIVES': False,
           'RPN_FG_FRACTION': 0.5,
           'RPN_MIN_SIZE': 2,
           'RPN_NEGATIVE_OVERLAP': 0.3,
           'RPN_NMS_THRESH': 0.7,
           'RPN_POSITIVE_OVERLAP': 0.7,
           'RPN_POSITIVE_WEIGHT': -1.0,
           'RPN_POST_NMS_TOP_N': 300,
           'RPN_PRE_NMS_TOP_N': 6000,
           'SHUFFLE': True,
           'begin_epoch': 0,
           'end_epoch': 8,
           'lr': 0.0005,
           'lr_step': '5.33',
           'model_prefix': 'e2e',
           'momentum': 0.9,
           'warmup': True,
           'warmup_lr': 5e-05,
           'warmup_step': 250,
           'wd': 0.0005},
 'dataset': {'NUM_CLASSES': 81,
             'dataset': 'coco',
             'dataset_path': './data/coco',
             'image_set': 'train2014+valminusminival2014',
             'proposal': 'rpn',
             'root_path': './data',
             'test_image_set': 'test-dev2015'},
 'default': {'frequent': 20, 'kvstore': 'device'},
 'gpus': '0',
 'network': {'ANCHOR_RATIOS': [0.5, 1, 2],
             'ANCHOR_SCALES': [4, 8, 16, 32],
             'FIXED_PARAMS': ['conv1',
                              'bn_conv1',
                              'res2',
                              'bn2',
                              'gamma',
                              'beta'],
             'FIXED_PARAMS_SHARED': ['conv1',
                                     'bn_conv1',
                                     'res2',
                                     'bn2',
                                     'res3',
                                     'bn3',
                                     'res4',
                                     'bn4',
                                     'gamma',
                                     'beta'],
             'IMAGE_STRIDE': 0,
             'NUM_ANCHORS': 12,
             'PIXEL_MEANS': array([103.06, 115.9 , 123.15]),
             'RCNN_FEAT_STRIDE': 16,
             'RPN_FEAT_STRIDE': 16,
             'pretrained': './model/pretrained_model/resnet_v1_101',
             'pretrained_epoch': 0},
 'output_path': '../output/fcis',
 'symbol': 'resnet_v1_101_fcis'}
[18:22:41] src/c_api/c_api_ndarray.cc:133: GPU support is disabled. Compile MXNet with USE_CUDA=1 to enable GPU support.
[18:22:41] /home/wys/work/farmland/dl/autooutlining/mxnet/dmlc-core/include/dmlc/./logging.h:304: [18:22:41] src/c_api/c_api_ndarray.cc:390: Operator _zeros is not implemented for GPU.

Stack trace returned 10 entries:
[bt] (0) /home/wys/work/farmland/dl/autooutlining/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN4dmlc15LogMessageFatalD1Ev+0x3e) [0x7fd5241911de]
[bt] (1) /home/wys/work/farmland/dl/autooutlining/mxnet/python/mxnet/../../lib/libmxnet.so(_Z20ImperativeInvokeImplRKN5mxnet7ContextERKN4nnvm9NodeAttrsEPSt6vectorINS_7NDArrayESaIS8_EESB_+0x32b) [0x7fd525003b0b]
[bt] (2) /home/wys/work/farmland/dl/autooutlining/mxnet/python/mxnet/../../lib/libmxnet.so(MXImperativeInvoke+0x1ff) [0x7fd525004a0f]
[bt] (3) /home/wys/anaconda3/envs/outlining27/lib/python2.7/lib-dynload/../../libffi.so.8(+0xa052) [0x7fd5295a5052]
[bt] (4) /home/wys/anaconda3/envs/outlining27/lib/python2.7/lib-dynload/../../libffi.so.8(+0x8925) [0x7fd5295a3925]
[bt] (5) /home/wys/anaconda3/envs/outlining27/lib/python2.7/lib-dynload/../../libffi.so.8(ffi_call+0xde) [0x7fd5295a406e]
[bt] (6) /home/wys/anaconda3/envs/outlining27/lib/python2.7/lib-dynload/_ctypes.so(_ctypes_callproc+0x4de) [0x7fd5295befae]
[bt] (7) /home/wys/anaconda3/envs/outlining27/lib/python2.7/lib-dynload/_ctypes.so(+0xa253) [0x7fd5295b6253]
[bt] (8) /home/wys/anaconda3/envs/outlining27/bin/../lib/libpython2.7.so.1.0(PyObject_Call+0x52) [0x7fd57e38a822]
[bt] (9) /home/wys/anaconda3/envs/outlining27/bin/../lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x2954) [0x7fd57e423274]

Traceback (most recent call last):
  File "fcis/demo.py", line 152, in <module>
    main()
  File "fcis/demo.py", line 83, in main
    arg_params=arg_params, aux_params=aux_params)
  File "/home/wys/work/farmland/dl/autooutlining/FCIS/fcis/core/tester.py", line 35, in __init__
    self._mod.bind(provide_data, provide_label, for_training=False)
  File "/home/wys/work/farmland/dl/autooutlining/FCIS/fcis/core/module.py", line 845, in bind
    for_training, inputs_need_grad, force_rebind=False, shared_module=None)
  File "/home/wys/work/farmland/dl/autooutlining/FCIS/fcis/core/module.py", line 402, in bind
    state_names=self._state_names)
  File "/home/wys/work/farmland/dl/autooutlining/FCIS/fcis/core/DataParallelExecutorGroup.py", line 184, in __init__
    self.bind_exec(data_shapes, label_shapes, shared_group)
  File "/home/wys/work/farmland/dl/autooutlining/FCIS/fcis/core/DataParallelExecutorGroup.py", line 284, in bind_exec
    shared_group))
  File "/home/wys/work/farmland/dl/autooutlining/FCIS/fcis/core/DataParallelExecutorGroup.py", line 598, in _bind_ith_exec
    context, self.logger)
  File "/home/wys/work/farmland/dl/autooutlining/FCIS/fcis/core/DataParallelExecutorGroup.py", line 576, in _get_or_reshape
    arg_arr = nd.zeros(arg_shape, context, dtype=arg_type)
  File "/home/wys/work/farmland/dl/autooutlining/mxnet/python/mxnet/ndarray.py", line 1047, in zeros
    return _internal._zeros(shape=shape, ctx=ctx, dtype=dtype, **kwargs)
  File "<string>", line 15, in _zeros
  File "/home/wys/work/farmland/dl/autooutlining/mxnet/python/mxnet/_ctypes/ndarray.py", line 72, in _imperative_invoke
    c_array(ctypes.c_char_p, [c_str(str(val)) for val in vals])))
  File "/home/wys/work/farmland/dl/autooutlining/mxnet/python/mxnet/base.py", line 85, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [18:22:41] src/c_api/c_api_ndarray.cc:390: Operator _zeros is not implemented for GPU.

Stack trace returned 10 entries:
[bt] (0) /home/wys/work/farmland/dl/autooutlining/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN4dmlc15LogMessageFatalD1Ev+0x3e) [0x7fd5241911de]
[bt] (1) /home/wys/work/farmland/dl/autooutlining/mxnet/python/mxnet/../../lib/libmxnet.so(_Z20ImperativeInvokeImplRKN5mxnet7ContextERKN4nnvm9NodeAttrsEPSt6vectorINS_7NDArrayESaIS8_EESB_+0x32b) [0x7fd525003b0b]
[bt] (2) /home/wys/work/farmland/dl/autooutlining/mxnet/python/mxnet/../../lib/libmxnet.so(MXImperativeInvoke+0x1ff) [0x7fd525004a0f]
[bt] (3) /home/wys/anaconda3/envs/outlining27/lib/python2.7/lib-dynload/../../libffi.so.8(+0xa052) [0x7fd5295a5052]
[bt] (4) /home/wys/anaconda3/envs/outlining27/lib/python2.7/lib-dynload/../../libffi.so.8(+0x8925) [0x7fd5295a3925]
[bt] (5) /home/wys/anaconda3/envs/outlining27/lib/python2.7/lib-dynload/../../libffi.so.8(ffi_call+0xde) [0x7fd5295a406e]
[bt] (6) /home/wys/anaconda3/envs/outlining27/lib/python2.7/lib-dynload/_ctypes.so(_ctypes_callproc+0x4de) [0x7fd5295befae]
[bt] (7) /home/wys/anaconda3/envs/outlining27/lib/python2.7/lib-dynload/_ctypes.so(+0xa253) [0x7fd5295b6253]
[bt] (8) /home/wys/anaconda3/envs/outlining27/bin/../lib/libpython2.7.so.1.0(PyObject_Call+0x52) [0x7fd57e38a822]
[bt] (9) /home/wys/anaconda3/envs/outlining27/bin/../lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x2954) [0x7fd57e423274]

This shows that FCIS requires an MXNet build with GPU support. Dead end.
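The same condition can be detected in a few lines before loading the whole network. The sketch below takes the imported mxnet module as an argument; `gpu_usable` is a hypothetical helper, while `mx.nd.zeros`, `mx.gpu` and `asnumpy` are the regular MXNet calls.

```python
def gpu_usable(mx, device_id=0):
    """Try to allocate a tiny array on the GPU; return False if MXNet refuses."""
    try:
        # asnumpy() forces the allocation to actually run before we return.
        mx.nd.zeros((1,), ctx=mx.gpu(device_id)).asnumpy()
        return True
    except Exception as error:  # an MXNetError in a CPU-only build
        print("GPU not usable: {}".format(error))
        return False
```

Usage would be `import mxnet as mx` followed by `gpu_usable(mx)`; with a build compiled without USE_CUDA=1 it should report the same "_zeros is not implemented for GPU" message as the traceback above.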

Use directly:


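5 Trying a rented server

I found a server that still has CUDA 8.

Building MXNet works the same way as in section 4, without problems.
To test FCIS, use Python 2 in conda.
In fcis/demo.py, add import sys and sys.path.append("../../mxnet/python"), so MXNet does not need to be installed system-wide.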

The server has no graphical interface:
install xvfb, a virtual display
sudo apt install xvfb
Inside fcis:

(python2)~/FCIS/fcis$: xvfb-run python demo.py

Also,
in the file lib/utils/show_masks.py:
def show_masks(im, detections, masks, class_names, cfg, scale=1.0, show = False):
change the default show=True to False.

OK.
Finally got it working.

After that I added some extra output images and files of my own.

Demo result:

(py2) root@I1c1d239d9105001488:/home/FCIS/fcis# xvfb-run python demo.py 
/home/FCIS/fcis/config/config.py:173: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  exp_config = edict(yaml.load(f))
use mxnet at ../../mxnet/python/mxnet/__init__.pyc
{'BINARY_THRESH': 0.4,
 'CLASS_AGNOSTIC': True,
 'MASK_SIZE': 21,
 'MXNET_VERSION': 'mxnet',
 'SCALES': [(600, 1000)],
 'TEST': {'BATCH_IMAGES': 1,
          'CXX_PROPOSAL': False,
          'HAS_RPN': True,
          'ITER': 2,
          'MASK_MERGE_THRESH': 0.5,
          'MIN_DROP_SIZE': 2,
          'NMS': 0.3,
          'PROPOSAL_MIN_SIZE': 2,
          'PROPOSAL_NMS_THRESH': 0.7,
          'PROPOSAL_POST_NMS_TOP_N': 2000,
          'PROPOSAL_PRE_NMS_TOP_N': 20000,
          'RPN_MIN_SIZE': 2,
          'RPN_NMS_THRESH': 0.7,
          'RPN_POST_NMS_TOP_N': 300,
          'RPN_PRE_NMS_TOP_N': 6000,
          'USE_GPU_MASK_MERGE': True,
          'USE_MASK_MERGE': True,
          'test_epoch': 8},
 'TRAIN': {'ASPECT_GROUPING': True,
           'BATCH_IMAGES': 1,
           'BATCH_ROIS': -1,
           'BATCH_ROIS_OHEM': 128,
           'BBOX_MEANS': [0.0, 0.0, 0.0, 0.0],
           'BBOX_NORMALIZATION_PRECOMPUTED': True,
           'BBOX_REGRESSION_THRESH': 0.5,
           'BBOX_STDS': [0.2, 0.2, 0.5, 0.5],
           'BBOX_WEIGHTS': array([1., 1., 1., 1.]),
           'BG_THRESH_HI': 0.5,
           'BG_THRESH_LO': 0,
           'BINARY_THRESH': 0.4,
           'CONVNEW3': True,
           'CXX_PROPOSAL': False,
           'ENABLE_OHEM': True,
           'END2END': True,
           'FG_FRACTION': 0.25,
           'FG_THRESH': 0.5,
           'FLIP': True,
           'GAP_SELECT_FROM_ALL': False,
           'IGNORE_GAP': False,
           'LOSS_WEIGHT': [1.0, 10.0, 1.0],
           'RESUME': False,
           'RPN_ALLOWED_BORDER': 0,
           'RPN_BATCH_SIZE': 256,
           'RPN_BBOX_WEIGHTS': [1.0, 1.0, 1.0, 1.0],
           'RPN_CLOBBER_POSITIVES': False,
           'RPN_FG_FRACTION': 0.5,
           'RPN_MIN_SIZE': 2,
           'RPN_NEGATIVE_OVERLAP': 0.3,
           'RPN_NMS_THRESH': 0.7,
           'RPN_POSITIVE_OVERLAP': 0.7,
           'RPN_POSITIVE_WEIGHT': -1.0,
           'RPN_POST_NMS_TOP_N': 300,
           'RPN_PRE_NMS_TOP_N': 6000,
           'SHUFFLE': True,
           'begin_epoch': 0,
           'end_epoch': 8,
           'lr': 0.0005,
           'lr_step': '5.33',
           'model_prefix': 'e2e',
           'momentum': 0.9,
           'warmup': True,
           'warmup_lr': 5e-05,
           'warmup_step': 250,
           'wd': 0.0005},
 'dataset': {'NUM_CLASSES': 81,
             'dataset': 'coco',
             'dataset_path': './data/coco',
             'image_set': 'train2014+valminusminival2014',
             'proposal': 'rpn',
             'root_path': './data',
             'test_image_set': 'test-dev2015'},
 'default': {'frequent': 20, 'kvstore': 'device'},
 'gpus': '0',
 'network': {'ANCHOR_RATIOS': [0.5, 1, 2],
             'ANCHOR_SCALES': [4, 8, 16, 32],
             'FIXED_PARAMS': ['conv1',
                              'bn_conv1',
                              'res2',
                              'bn2',
                              'gamma',
                              'beta'],
             'FIXED_PARAMS_SHARED': ['conv1',
                                     'bn_conv1',
                                     'res2',
                                     'bn2',
                                     'res3',
                                     'bn3',
                                     'res4',
                                     'bn4',
                                     'gamma',
                                     'beta'],
             'IMAGE_STRIDE': 0,
             'NUM_ANCHORS': 12,
             'PIXEL_MEANS': array([103.06, 115.9 , 123.15]),
             'RCNN_FEAT_STRIDE': 16,
             'RPN_FEAT_STRIDE': 16,
             'pretrained': './model/pretrained_model/resnet_v1_101',
             'pretrained_epoch': 0},
 'output_path': '../output/fcis',
 'symbol': 'resnet_v1_101_fcis'}
(426, 640)
testing COCO_test2015_000000000275.jpg 0.1835s
(427, 640)
testing COCO_test2015_000000001412.jpg 0.2058s
(427, 640)
testing COCO_test2015_000000073428.jpg 0.1579s
(428, 640)
testing COCO_test2015_000000393281.jpg 0.1787s
done
(py2) root@I1c1d239d9105001488:/home/FCIS/fcis# 

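I am never touching an ancient pit like this again.

The effort needed to reproduce it (pointless difficulty) is out of proportion to what it is good for.

Key points:
1. CUDA 8 is required, with a GPU such as a Titan Xp or GTX 1080.
2. MXNet version = 998378a (the commit from the official demo instructions).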
