跑通mmf：visualBert实现过程

xiyou__

已于 2022-05-16 11:16:23 修改

阅读量2.5k

点赞数 3

分类专栏：模型复现文章标签： python pytorch nlp

于 2021-12-13 10:14:23 首次发布

本文链接：https://blog.csdn.net/xiyou__/article/details/121792839

版权

模型复现专栏收录该内容

5 篇文章 1 订阅

订阅专栏

使用代码github链接：https://github.com/di-dimitrov/propaganda-techniques-in-memes

该代码相当在mmf-master上展开自己的任务，之前想直接跑通facebook的MMF，但是被两个问题困扰：

环境配置
新数据集的构造

后来被推荐了这套代码，才跑通了visualBert模型。

实现过程

1 将项目下载并存放到服务器

2 Install MMF

Prerequisites - generating image caption features for VisualBERT and ViLBERT:
i. Install MMF according to the instructions here:
https://mmf.readthedocs.io/en/website/notes/installation.html

该链接中给出了安装MMF的两种方法：
1. 在这里插入图片描述
2.

我使用方法1安装总会在后期出现各种环境报错，方法2安装就成功了，所以建议安装不成功的试试方法2安装。

3 Install the following packages

ii. Install the following packages: pip install yacs, opencv-python, cython (if using ‘pip’, any package manager works)

4 Clone vqa-maskrcnn-benchmark repository

iii. Clone vqa-maskrcnn-benchmark repository: https://gitlab.com/vedanuj/vqa-maskrcnn-benchmark
a. Run python setup.py build
b. Run python setup.py develop

5 feature extraction

c. Run the feature extraction script from the following path:
mmf/tools/scripts/features/extract_features_vmb.py

这一步对图片进行处理，先将图片存放到data/datasets/propaganda/defaults/images下，执行

python tools/scripts/features/extract_features_vmb.py

控制台显示如下即为正在处理（这里，“/”后的数字应该和images文件夹下的图片总数一致）：
在这里插入图片描述
如果图片文件夹的布局不寻常，可在extract_features_vmb.py的line273以下稍作修改。该步骤的生成文件默认存储在./output文件夹下。

5 convert the features to a .mdb file

d. After feature extraction is done convert the features to a .mdb file with the following script: mmf/tools/scripts/features/extract_features_vmb.py

此处存在笔误，实际运行程序为：

python tools/scripts/features/lmdb_conversion.py --mode convert --lmdb_path ./save --features_folder ./output

其中，传参内容的含义可见lmdb_conversion.py：

        parser.add_argument(
            "--mode",
            required=True,
            type=str,
            help="Mode can either be `convert` (for conversion of \n"
            + "features to an LMDB file) or `extract` (extract \n"
            + "raw features from a LMDB file)",default="convert"
        )
        parser.add_argument(
            "--lmdb_path", required=True, type=str, help="LMDB file path",default="./save"
        )
        parser.add_argument(
            "--features_folder", required=True, type=str, help="Features folder",default="./output"
        )

该步骤会在./save文件夹下生成data.mdb文件。

6 Rename the .mdb features file to deceptron.lmdb and move it

Rename the .mdb features file to deceptron.lmdb and move it to /root/.cache/torch/mmf/data/datasets/propaganda/defaults/features/

7 Running the models

2.Running the models - open ‘Propaganda_Detection.ipynb’ and run the code inside.

程序默认是多GPU的，指定服务器上的固定节点跑程序：

CUDA_VISIBLE_DEVICES=1 !mmf_run config=./projects/propaganda/configs/visual_bert/direct.yaml \
 datasets=propaganda \
 model=visual_bert

visualBert：对于关键修改位置的备注

参数修改位置

/root/propaganda-techniques-in-memes-main/projects/propaganda/configs/visual_bert/defaults.yaml

修改数据集标签数

/root/propaganda-techniques-in-memes-main/mmf/datasets/builders/propaganda/dataset.py
line81,line147修改标签数量

数据集存放位置

/root/.cache/torch/mmf/data/datasets/propaganda/defaults/annotations/

Bug记录

显示’mmf_run’是不可识别的语法

mmf安装不成功，注意安装过程中出现的bug

ERROR: Cannot uninstall ‘certifi’. It is a distutils installed project and …

直接pip uninstall certifi不成功，执行：

pip install certifi --ignore-installed

参考博客：ERROR: Cannot uninstall ‘certifi‘. It is a distutils installed project and thus we cannot accurately

实现环境备注

torch == 1.7.1+cu110
torchtext == 0.5.0
torchvision == 0.8.2+cu110
pytorch-lightning == 1.4.9

在这里插入图片描述

xiyou__

关注

3
点赞
踩
12

收藏

觉得还不错? 一键收藏
4
评论
跑通mmf：visualBert实现过程

使用代码github链接：https://github.com/di-dimitrov/propaganda-techniques-in-memes该代码相当在mmf-master上展开自己的任务，之前想直接跑通facebook的MMF，但是在环境配置问题上屡屡碰壁，后来被推荐了这套代码，才跑通了visualBert模型。实现过程1 将项目下载并存放到服务器2 Install MMFPrerequisites - generating image caption features for Vi
复制链接

扫一扫