Referring Expression Comprehension (REC): Reproducing the MAttNet Paper, and the Pitfalls Along the Way!


1. Paper and Code Links

  1. Paper: "MAttNet: Modular Attention Network for Referring Expression Comprehension", CVPR 2018
  2. Code: https://github.com/lichengunc/MAttNet

2. Prerequisites

  • python2.7
    This one is easy! Our group server defaults to Python 3.5, so just use Anaconda to create a Python 2.7 virtual environment:
conda create --name your_env_name python=2.7
  • pytorch0.2 (may not work with 1.0 or higher)
    I installed PyTorch 0.4.1, the latest version that matches CUDA 8.0. Download link: https://download.pytorch.org/whl/cu80/torch_stable.html
    Pick torch-0.4.1-cp27-cp27mu-linux_x86_64.whl, upload it to your own folder on the server, then run
pip install torch-0.4.1-cp27-cp27mu-linux_x86_64.whl

I went this route because a direct wget download on the server was painfully slow... (A quick sanity check for the whole environment follows at the end of this list.)

  • CUDA8.0
    Our group server's NVIDIA driver is version 410 with CUDA 10.0 installed, so I had to install CUDA 8.0 as a non-root user (cue the tears!). Luckily CSDN users have this covered; a quick search turns up these two posts:
    https://blog.csdn.net/daydaydreamer/article/details/107172364
    https://blog.csdn.net/weixin_42262721/article/details/108278214
    Follow them, substituting CUDA 8.0 wherever they say CUDA 10.0; the procedure is exactly the same.
    Download the CUDA 8.0 installer for your OS from the NVIDIA website; our server runs Ubuntu 16.04, so grab that version. I suggest downloading to your local machine first and then uploading to the server, because a direct server download can be very slow without a VPN (can, mind you, don't @ me!).
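With all three prerequisites in place, here is a minimal environment sanity check (my own addition, not from the MAttNet repo); run it inside the python2.7 env:

# Minimal sanity check for the setup above.
import sys
import torch

print(sys.version)                # expect 2.7.x
print(torch.__version__)          # expect 0.4.1
print(torch.version.cuda)         # expect an 8.0.x build
print(torch.cuda.is_available())  # expect True once CUDA 8.0 is installed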

3. Installation

1. Clone the MAttNet repository

git clone --recursive https://github.com/lichengunc/MAttNet
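If the recursive clone worked, the submodules under pyutils/ should be populated. A quick check (the two paths below are my reading of the repo layout; run from inside MAttNet/):

# Confirm --recursive actually pulled the submodules in.
from __future__ import print_function
import os

for sub in ("pyutils/mask-faster-rcnn", "pyutils/refer-parser2"):
    status = "OK" if os.path.isdir(sub) and os.listdir(sub) else "MISSING/EMPTY"
    print(sub, "->", status)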

2. Prepare the submodules and associated data

  • Mask R-CNN: Follow the instructions of my mask-faster-rcnn repo, preparing everything needed for pyutils/mask-faster-rcnn. You could use cv/mrcn_detection.ipynb to test if you've got Mask R-CNN ready.
    Open the mask-faster-rcnn repo and follow its instructions:
    Preparation
    (1) First of all, clone the code with refer API:
git clone --recursive https://github.com/lichengunc/mask-faster-rcnn

(2) Prepare data:
COCO: the repo reuses the name coco for COCO's API (as an inherited module). Download the annotations and images into data/coco. Note that valminusminival and minival can be downloaded from the links in that README.

git clone https://github.com/cocodataset/cocoapi data/coco

Note: in general you only need the COCO images (2014 Train images [83K/13GB]) and annotations (2014 Train/Val annotations [241MB]); download valminusminival and minival from the two links in the repo. The key pitfall: those two sets must be unzipped into data/coco/annotations, i.e. placed alongside COCO's own annotations! Finally, unzip all the downloaded archives.
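A quick way to confirm everything landed in the right place (the four file names below are the conventional ones for these splits; adjust if your downloads differ):

# Check that all four annotation files sit together in data/coco/annotations.
from __future__ import print_function
import os

ann_dir = "data/coco/annotations"
for name in ("instances_train2014.json",
             "instances_val2014.json",
             "instances_minival2014.json",
             "instances_valminusminival2014.json"):
    path = os.path.join(ann_dir, name)
    print("OK  " if os.path.isfile(path) else "MISS", path)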
(3) REFER: Follow the instructions in REFER to prepare the annotations for RefCOCO, RefCOCO+ and RefCOCOg.

git clone https://github.com/lichengunc/refer data/refer

Follow the workflow in the refer repo:
(i) Run "make" before using the code; it generates _mask.c and _mask.so in the external/ folder. Do this before downloading any data.
(ii) The mask-related code is copied from the MSCOCO API, so you also need to download the cocoAPI (from the link above) into data/coco, and note that you must run make inside its PythonAPI folder as well.
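Once make has run and the annotations are downloaded, a short load test catches path problems early (the data root below is an assumption; point it at wherever your refer data actually lives, and run from inside the refer repo):

# Smoke-test the REFER API after building it.
from __future__ import print_function
from refer import REFER

refer = REFER("data", dataset="refcoco", splitBy="unc")
print(len(refer.getRefIds()), "referring expressions loaded")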
(4) ImageNet Weights: Find the resnet101-caffe download link from this repository, and download it as data/imagenet_weights/res101.pth.
(5) coco_minus_refer: Make the coco_minus_refer annotation, which is to be saved as data/coco/annotations/instances_train_minus_refer_valtest2014.json
Generate it with:

python tools/make_coco_minus_refer_instances.py
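If the script ran cleanly, the output should be a standard COCO-format instances file; here is a quick validity check (the images/annotations keys are part of the COCO format):

# Verify the merged annotation file exists and parses as COCO-format JSON.
from __future__ import print_function
import json

path = "data/coco/annotations/instances_train_minus_refer_valtest2014.json"
with open(path) as f:
    d = json.load(f)
print(len(d["images"]), "images,", len(d["annotations"]), "annotations")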

(6) Finally, there is no need to train the detector yourself: a pretrained model is available for download.
Pitfall to watch for: mask-faster-rcnn/lib does not come pre-built! Be sure to run make in there!!

  • REFER API and data: Use the download links of REFER and go to the folder, running make. Follow data/README.md to prepare images and refcoco/refcoco+/refcocog annotations.
    Here you only need to git clone and make refer; there is no need to add the three datasets again, because the data already lives in mask-faster-rcnn's data/refer!
  • refer-parser2: Follow the instructions of refer-parser2 to extract the parsed expressions using Vicente’s R1-R7 attributes. Note this sub-module is only used if you want to train the models by yourself.
    First, git clone the repo into MAttNet/pyutils.
    (1) Requirements:
    Look up the version numbers compatible with Python 2.7 and pip install the following libraries with pinned versions; unpinned installs cause problems (pip auto-upgrades other libraries, or pulls versions too new to work with them)!!
practnlptools
nltk
corenlp
unidecode

(2) How to use:
We only need Vicente's R1-R7 attributes, so the corresponding command is the one below. Even better, the author has kindly pre-extracted the attribute features for every dataset!!!

1b) Parse expressions using Vicente's R1-R7 attributes:

python parse_atts.py --dataset refcoco --splitBy unc

The pre-extracted attribute features for each dataset can be downloaded directly into refer-parser2/cache/parsed_atts:

wget http://bvision.cs.unc.edu/licheng/MattNet/refer-parser2/cache/parsed_atts.zip

Then just unzip it.
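If you prefer staying in Python for the unzip (a trivial sketch; run from inside refer-parser2/cache/):

# Extract the pre-parsed attributes and peek at the archive layout first.
import zipfile

with zipfile.ZipFile("parsed_atts.zip") as z:
    print("\n".join(z.namelist()[:10]))
    z.extractall(".")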

4. Training

First, download the datasets following data/README.md, but beware: the directory structure written in that README is missing one images level!! The tree you actually need to create is:

$COCO_PATH
├── images
│   ├── mscoco
│   │   └── images
│   │       └── train2014
│   └── saiaprtc12
├── refcoco
│   ├── instances.json
│   ├── refs(google).p
│   └── refs(unc).p
├── refcoco+
│   ├── instances.json
│   └── refs(unc).p
└── refcocog
    ├── instances.json
    └── refs(google).p
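To avoid typos in that tree, the skeleton can be created programmatically (a sketch; set COCO_PATH to your actual data root before running):

# Create the directory skeleton shown above; data files get copied in afterwards.
import os

root = os.environ.get("COCO_PATH", "data")
for d in ("images/mscoco/images/train2014", "images/saiaprtc12",
          "refcoco", "refcoco+", "refcocog"):
    path = os.path.join(root, d)
    if not os.path.isdir(path):
        os.makedirs(path)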

1. (Preprocess the data) Prepare the training and evaluation data by running tools/prepro.py:

python tools/prepro.py --dataset refcoco --splitBy unc

Result: two files, data.json and data.h5, are generated under MAttNet/cache/prepro.
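Optionally, peek inside the two outputs to confirm they are well-formed (the refcoco_unc subfolder is my assumption; check where the script actually wrote them, and h5py must be installed):

# Inspect the preprocessed outputs from tools/prepro.py.
import json
import h5py

info = json.load(open("cache/prepro/refcoco_unc/data.json"))
print(sorted(info.keys()))

with h5py.File("cache/prepro/refcoco_unc/data.h5", "r") as f:
    print(list(f.keys()))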

2. (Extract the box subject features head_feats and context features ann_feats) Extract features using Mask R-CNN, where the head_feats are used in subject module training and ann_feats is used in relationship module training.

Note: I dropped the CUDA_VISIBLE_DEVICES=gpu_id prefix here, because it is easy to specify the wrong device.

 python tools/extract_mrcn_head_feats.py --dataset refcoco --splitBy unc
 python tools/extract_mrcn_ann_feats.py --dataset refcoco --splitBy unc

Result: a folder res101_coco_minus_refer_notime and a file res101_coco_minus_refer_notime_ann_feats.h5 appear under MAttNet/cache/feats/refcoco_unc/mrcn/.

Possible errors: (1) CUDA: out of memory. Fix: wrap the failing code in a with torch.no_grad(): block (see the sketch below). (2) AttributeError at image_orig = im.astype(np.float32, copy=True). Fix: the image path is wrong; point it at the correct directory!
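A minimal sketch of fix (1); the model and input below are stand-ins for whatever objects the failing script uses:

# Without no_grad, autograd state accumulates across the extraction loop and can
# exhaust GPU memory; with it, inference keeps memory bounded (PyTorch 0.4+).
import torch
import torch.nn as nn

model = nn.Linear(10, 5)     # stand-in for the Mask R-CNN head
batch = torch.randn(4, 10)   # stand-in input batch

with torch.no_grad():
    feats = model(batch)
print(feats.requires_grad)   # False: no graph was built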

3. (Detect objects and masks, and extract box features; confidence threshold 0.65) Detect objects/masks and extract features. We empirically set the confidence threshold of Mask R-CNN as 0.65.

As before, I omitted the CUDA_VISIBLE_DEVICES=gpu_id prefix.

python tools/run_detect.py --dataset refcoco --splitBy unc --conf_thresh 0.65
python tools/run_detect_to_mask.py --dataset refcoco --splitBy unc
python tools/extract_mrcn_det_feats.py --dataset refcoco --splitBy unc

Result: two files appear under MAttNet/cache/detections/refcoco_unc/:

res101_coco_minus_refer_notime_dets.json
res101_coco_minus_refer_notime_masks.json

and one file under MAttNet/cache/feats/refcoco_unc/mrcn:

res101_coco_minus_refer_notime_det_feats.json

Possible errors: (1) CUDA: out of memory. Fix: as before, wrap the failing code in with torch.no_grad():. (2) Dimension out of range. Fix: in tools/extract_mrcn_det_feats.py, comment out fc7 = fc7.mean(3).mean(2), because that pooling has already been applied upstream (see the illustration below)!!
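An illustration of why error (2) appears (tensor sizes are made up for the example):

# If fc7 is still a 4-D conv feature map, averaging dims 3 then 2 pools it away:
import torch

fc7_map = torch.randn(8, 2048, 7, 7)   # (boxes, channels, H, W)
pooled = fc7_map.mean(3).mean(2)       # -> shape (8, 2048)
print(pooled.shape)

# But if that pooling already happened upstream, fc7 is 2-D and has no dim 3,
# so repeating the line raises "Dimension out of range":
# pooled.mean(3).mean(2)  # IndexError / RuntimeError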

4. Train MAttNet with ground-truth annotation:

Finally, the training stage!!! As a non-root user, prefix the script with sh and set GPU_ID to 0, otherwise it is easy to hit errors:

sh ./experiments/scripts/train_mattnet.sh 0 refcoco unc

Possible error: json.dump(infos, io) fails. Fix: add default=str, i.e. json.dump(infos, io, default=str).
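A small sketch of why default=str helps; the datetime value stands in for whatever non-serializable object infos actually contains:

# json.dump chokes on values it does not know how to encode.
import datetime
import json

infos = {"best_val_score": 0.85, "saved_at": datetime.datetime.now()}

# json.dump(infos, io)  # -> TypeError: datetime is not JSON serializable
with open("infos.json", "w") as io:
    json.dump(infos, io, default=str)  # unknown types fall back to str()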
