Running Neural Baby Talk on Ubuntu 16.04

cainiaohudi

Article link: https://blog.csdn.net/cainiaohudi/article/details/90377325

Notes from getting Neural Baby Talk running on Ubuntu.

GitHub repository: https://github.com/jiasenlu/NeuralBabyTalk

1. First, change into the /NeuralBabyTalk/pooling/roi_align directory and run the following command:
sh make.sh
This fails with error 1:

Traceback (most recent call last):
  File "build.py", line 4, in <module>
    from torch.utils.ffi import create_extension
  File "/usr/local/lib/python2.7/dist-packages/torch/utils/ffi/__init__.py", line 14, in <module>
    raise ImportError("torch.utils.ffi requires the cffi package")
ImportError: torch.utils.ffi requires the cffi package

The cause is that the cffi package is missing, so install it:

pip install cffi
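
To confirm the install took effect before rebuilding, a minimal sketch like the one below reports which modules still fail to import. The helper name `missing_modules` is made up for illustration; only `cffi` comes from the traceback above.

```python
import importlib


def missing_modules(names):
    """Return the names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # Covers ModuleNotFoundError on Python 3 as well.
            missing.append(name)
    return missing


# An empty list means cffi is importable by this interpreter.
print(missing_modules(["cffi"]))
```

Run it with the same interpreter that make.sh uses (Python 2.7 in the traceback), since pip may install into a different one.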

2. Run it again:

sh make.sh

This produces error 2:

error: /home/×××/NeuralBabyTalk/pooling/roi_align/src/roi_align_kernel.cu.o: No such file or directory
Traceback (most recent call last):
  File "build.py", line 36, in <module>
    ffi.build()
  File "/usr/local/lib/python2.7/dist-packages/torch/utils/ffi/__init__.py", line 189, in build
    _build_extension(ffi, cffi_wrapper_name, target_dir, verbose)
  File "/usr/local/lib/python2.7/dist-packages/torch/utils/ffi/__init__.py", line 111, in _build_extension
    outfile = ffi.compile(tmpdir=tmpdir, verbose=verbose, target=libname)
  File "/usr/local/lib/python2.7/dist-packages/cffi/api.py", line 723, in compile
    compiler_verbose=verbose, debug=debug, **kwds)
  File "/usr/local/lib/python2.7/dist-packages/cffi/recompiler.py", line 1526, in recompile
    compiler_verbose, debug)
  File "/usr/local/lib/python2.7/dist-packages/cffi/ffiplatform.py", line 22, in compile
    outputfilename = _build(tmpdir, ext, compiler_verbose, debug)
  File "/usr/local/lib/python2.7/dist-packages/cffi/ffiplatform.py", line 58, in _build
    raise VerificationError('%s: %s' % (e.__class__.__name__, e))
cffi.VerificationError: LinkError: command 'x86_64-linux-gnu-gcc' failed with exit status 1

Solution:

Edit make.sh and add the following lines at the top of the file:

export CUDA_PATH=/usr/local/cuda/
export CXXFLAGS="-std=c++11"
export CFLAGS="-std=c99"

export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}}
export CPATH=/usr/local/cuda-8.0/include${CPATH:+:${CPATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

That resolves the problem.
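
The PATH-style exports rely on the shell expansion `${VAR:+:${VAR}}`, which appends `:` plus the old value only when the variable is already set and non-empty. The sketch below mimics that rule on a plain dictionary to show the effect; `prepend_path` is a hypothetical helper, not part of the project.

```python
def prepend_path(env, name, directory):
    """Mimic `export NAME=directory${NAME:+:${NAME}}` on a dict."""
    current = env.get(name, "")
    # ${NAME:+...} expands to nothing when NAME is unset or empty,
    # so no stray ':' is left at the end in that case.
    env[name] = directory + (":" + current if current else "")
    return env[name]


env = {"PATH": "/usr/bin:/bin"}
print(prepend_path(env, "PATH", "/usr/local/cuda-8.0/bin"))
print(prepend_path(env, "CPATH", "/usr/local/cuda-8.0/include"))
```

Avoiding the trailing `:` matters because an empty entry in a search path is commonly interpreted as the current directory.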

3. Next I wanted to evaluate on robust-coco. Following the command given on GitHub, run this from the /NeuralBabyTalk directory:

python main.py --path_opt cfgs/robust_coco.yml --batch_size 20 --cuda True --num_workers 20 --max_epoch 30 --inference_only True --beam_size 3 --start_from save/robust_coco_nbt_1024

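The run stalled while downloading the GloVe vectors and made no further progress:

.vector_cache/glove.6B.zip: 84%|████████▎ | 721M/862M [13:19:57<2:36:56, 15.0kB/s]

I suspected that some packages were missing. Checking the Dockerfile confirmed that I had not installed nltk or stanfordcorenlp. After installing them,

the run fails with error 3: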

IOError: [Errno 2] No such file or directory: 'data/robust_coco/dic_coco.json'

Solution: download the required files and unpack them into the locations given in the Dockerfile:

# ----------------------------------------------------------------------------
# -- download pretrained imagenet weights for resnet-101
# ----------------------------------------------------------------------------

RUN mkdir /workspace/neuralbabytalk/data/imagenet_weights && \
    cd /workspace/neuralbabytalk/data/imagenet_weights && \
    wget --quiet https://www.dropbox.com/sh/67fc8n6ddo3qp47/AAACkO4QntI0RPvYic5voWHFa/resnet101.pth


# ----------------------------------------------------------------------------
# -- download Karpathy's preprocessed captions datasets and corenlp jar
# ----------------------------------------------------------------------------

RUN cd /workspace/neuralbabytalk/data && \
    wget --quiet http://cs.stanford.edu/people/karpathy/deepimagesent/caption_datasets.zip && \
    unzip caption_datasets.zip && \
    mv dataset_coco.json coco/ && \
    mv dataset_flickr30k.json flickr30k/ && \
    rm caption_datasets.zip dataset_flickr8k.json

RUN cd /workspace/neuralbabytalk/prepro && \
    wget --quiet https://nlp.stanford.edu/software/stanford-corenlp-full-2017-06-09.zip && \
    unzip stanford-corenlp-full-2017-06-09.zip && \
    rm stanford-corenlp-full-2017-06-09.zip

RUN cd /workspace/neuralbabytalk/tools/coco-caption && \
    sh get_stanford_models.sh

# ----------------------------------------------------------------------------
# -- download preprocessed COCO detection output HDF file and pretrained model
# ----------------------------------------------------------------------------

RUN cd /workspace/neuralbabytalk/data/coco && \
    wget --quiet https://www.dropbox.com/s/2gzo4ops5gbjx5h/coco_detection.h5.tar.gz && \
    tar -xzvf coco_detection.h5.tar.gz && \
    rm coco_detection.h5.tar.gz

RUN mkdir -p /workspace/neuralbabytalk/save && \
    cd /workspace/neuralbabytalk/save && \
    wget --quiet https://www.dropbox.com/s/6buajkxm9oed1jp/coco_nbt_1024.tar.gz && \
    tar -xzvf coco_nbt_1024.tar.gz && \
    rm coco_nbt_1024.tar.gz
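
When following these steps by hand rather than through Docker, it is easy to leave a file in the wrong place. The sketch below lists which expected files are absent under a project root. The helper `missing_files` is hypothetical, and the paths assume the repository root replaces `/workspace/neuralbabytalk` and that the detection archive unpacks to `data/coco/coco_detection.h5`.

```python
import os


def missing_files(root, relative_paths):
    """Return the relative paths that do not exist as files under `root`."""
    return [p for p in relative_paths
            if not os.path.isfile(os.path.join(root, p))]


# Files produced by the download steps above (locations are assumptions).
EXPECTED = [
    "data/imagenet_weights/resnet101.pth",
    "data/coco/dataset_coco.json",
    "data/coco/coco_detection.h5",
]

# Run from the repository root; an empty list means everything is in place.
print(missing_files(".", EXPECTED))
```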

Then run:

python prepro/prepro_dic_coco.py --input_json data/coco/dataset_coco.json --split robust --output_dic_json data/robust_coco/dic_coco.json --output_cap_json data/robust_coco/cap_coco.json

The output is:

  from ._conv import register_converters as _register_converters
parsed input parameters:
{
  "output_dic_json": "data/robust_coco/dic_coco.json", 
  "input_json": "data/coco/dataset_coco.json", 
  "word_count_threshold": 5, 
  "max_length": 16, 
  "output_cap_json": "data/robust_coco/cap_coco.json", 
  "split": "robust"
}
top words and their counts:
(1019785, u'a')
(224758, u'on')
(212689, u'of')
(206178, u'the')
(191793, u'in')
(161216, u'with')
(146755, u'and')
(102390, u'is')
(75957, u'man')
(71183, u'to')
(55190, u'sitting')
(51987, u'an')
(50467, u'two')
(44506, u'at')
(44297, u'standing')
(43707, u'people')
(42776, u'are')
(38867, u'next')
(37898, u'white')
(35372, u'woman')
('total words:', 6454115)
number of bad words: 18443/27929 = 66.04%
number of words in vocab would be 9486
number of UNKs: 32382/6454115 = 0.50%
('max length sentence in raw data: ', 49)
sentence length distribution (count, number of words):
 0:          0   0.000000%
 1:          0   0.000000%
 2:          0   0.000000%
 3:          0   0.000000%
 4:          0   0.000000%
 5:          1   0.000162%
 6:         14   0.002270%
 7:       4851   0.786521%
 8:     101387   16.438461%
 9:     134531   21.812289%
10:     132558   21.492395%
11:      95206   15.436299%
12:      60590   9.823807%
13:      35233   5.712530%
14:      20016   3.245310%
15:      11476   1.860670%
16:       6922   1.122304%
17:       4313   0.699292%
18:       2755   0.446684%
19:       1913   0.310166%
20:       1312   0.212722%
21:        923   0.149651%
22:        665   0.107820%
23:        503   0.081554%
24:        328   0.053181%
25:        258   0.041831%
26:        194   0.031454%
27:        156   0.025293%
28:         97   0.015727%
29:         74   0.011998%
30:         52   0.008431%
31:         65   0.010539%
32:         41   0.006648%
33:         48   0.007783%
34:         43   0.006972%
35:         35   0.005675%
36:         21   0.003405%
37:         24   0.003891%
38:         20   0.003243%
39:         21   0.003405%
40:         19   0.003081%
41:         21   0.003405%
42:         11   0.001783%
43:         19   0.003081%
44:         18   0.002918%
45:         13   0.002108%
46:          6   0.000973%
47:          7   0.001135%
48:          3   0.000486%
49:          4   0.000649%
inserting the special UNK token
('wrote ', 'data/robust_coco/dic_coco.json')
('wrote ', 'data/robust_coco/cap_coco.json')
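
The "bad words" and UNK figures in this log come from thresholding the vocabulary by word count: 27929 distinct words minus 18443 rare ones leaves the 9486 reported. The sketch below shows the general idea on toy data. It assumes that words occurring at most `count_threshold` times are treated as rare; the script's exact rule is not shown in this post, and `build_vocab` and `encode` are illustrative names.

```python
from collections import Counter


def build_vocab(captions, count_threshold):
    """Split words into kept and rare sets by occurrence count."""
    counts = Counter(w for caption in captions for w in caption)
    vocab = sorted(w for w, n in counts.items() if n > count_threshold)
    rare = sorted(w for w, n in counts.items() if n <= count_threshold)
    return vocab, rare


def encode(caption, vocab):
    """Replace every out-of-vocabulary word with the special UNK token."""
    known = set(vocab)
    return [w if w in known else "UNK" for w in caption]


captions = [["a", "dog"], ["a", "cat"], ["a", "dog"]]
vocab, rare = build_vocab(captions, count_threshold=1)
print(vocab, rare)
print(encode(["a", "cat"], vocab))
```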

4. Run:

python main.py --path_opt cfgs/robust_coco.yml --batch_size 20 --cuda True --num_workers 20 --max_epoch 30 --inference_only True --beam_size 3 --start_from save/robust_coco_nbt_1024

This fails with error 4:

Traceback (most recent call last):
  File "main.py", line 213, in <module>
    dataset = DataLoader(opt, split='train')
  File "/home/×××/NeuralBabyTalk/misc/dataloader_coco.py", line 112, in __init__
    self.dataloader_hdf = HDFSingleDataset(self.opt.proposal_h5)
  File "/home/×××/NeuralBabyTalk/misc/dataloader_hdf.py", line 59, in __init__
    super().__init__(
TypeError: super() takes at least 1 argument (0 given)

The zero-argument form super() is only available in Python 3, but the code is being run under Python 2 here.
Change the call to:

        super(HDFSingleDataset, self).__init__(
            os.path.dirname(hdf_path),
            shard_names=[os.path.basename(hdf_path)],
            primary_key=primary_key,
            stride=stride
        )
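
The explicit two-argument form works on both Python 2 and Python 3, which a small standalone example can demonstrate. The classes below are hypothetical stand-ins that only mirror the shape of the call above; they are not the project's real classes.

```python
import os


class ShardedDataset(object):  # must be a new-style class on Python 2
    def __init__(self, root, shard_names, primary_key=None, stride=1):
        self.root = root
        self.shard_names = shard_names
        self.primary_key = primary_key
        self.stride = stride


class SingleFileDataset(ShardedDataset):
    def __init__(self, hdf_path, primary_key=None, stride=1):
        # Naming the class and instance explicitly avoids the
        # Python 3 only zero-argument super().
        super(SingleFileDataset, self).__init__(
            os.path.dirname(hdf_path),
            shard_names=[os.path.basename(hdf_path)],
            primary_key=primary_key,
            stride=stride,
        )


ds = SingleFileDataset("data/coco/coco_detection.h5", stride=2)
print("{0} {1} {2}".format(ds.root, ds.shard_names, ds.stride))
```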

5. Run the same command again:

python main.py --path_opt cfgs/robust_coco.yml --batch_size 20 --cuda True --num_workers 20 --max_epoch 30 --inference_only True --beam_size 3 --start_from save/robust_coco_nbt_1024

This fails with error 5:

Traceback (most recent call last):
  File "main.py", line 267, in <module>
    model = AttModel.TopDownModel(opt)
  File "/home/×××/NeuralBabyTalk/misc/AttModel.py", line 214, in __init__
    self.ccr_core = CascadeCore(opt)
  File "/home/×××/NeuralBabyTalk/misc/AttModel.py", line 246, in __init__
    self.fg_mask = Parameter(opt.fg_mask)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/parameter.py", line 24, in __new__
    return torch.Tensor._make_subclass(cls, data, requires_grad)
RuntimeError: Only Tensors of floating point dtype can require gradients

Solution: install torch 0.4.0.
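
Since the fix depends on one specific release, it can help to check which version the interpreter actually picks up. The sketch below parses a version string and compares it with 0.4.0. The helpers `version_tuple` and `is_expected_torch` are hypothetical names.

```python
import re


def version_tuple(version):
    """Parse the leading 'major.minor.patch' of a version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognised version string: %r" % version)
    return tuple(int(part) for part in match.groups())


def is_expected_torch(version, expected=(0, 4, 0)):
    """True when `version` names the release this post settled on."""
    return version_tuple(version) == expected


print(is_expected_torch("0.4.0"))
print(is_expected_torch("0.4.1.post2"))
```

In practice, pass `torch.__version__` after `import torch` instead of a literal string.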

Place the COCO train2014 images in the data/coco/images/train2014 directory,
for example data/coco/images/train2014/COCO_train2014_000000398494.jpg

Likewise, place the COCO val2014 images in the data/coco/images/val2014 directory,
for example data/coco/images/val2014/COCO_val2014_000000223648.jpg
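
The two example paths follow one pattern: the split name appears in both the directory and the file name, and the image id is zero-padded to 12 digits. The sketch below builds such a path from an image id, which is useful for checking that a given image is where it should be. `coco_image_path` is a hypothetical helper inferred from the examples, not the project's loader.

```python
def coco_image_path(image_root, split, image_id):
    """Build the expected image path for a COCO 2014 image id."""
    filename = "COCO_{0}_{1:012d}.jpg".format(split, image_id)
    return "/".join([image_root, split, filename])


# Prints data/coco/images/val2014/COCO_val2014_000000223648.jpg
print(coco_image_path("data/coco/images", "val2014", 223648))
```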

Namespace(att_feat_size=2048, att_hid_size=512, att_model='topdown', batch_size=20, beam_size=3, cached_tokens='coco-all-idxs', cbs=False, cbs_mode='all', cbs_tag_size=3, checkpoint_path='save/robust_coco_1024', cider_df='corpus', cnn_backend='res101', cnn_learning_rate=1e-05, cnn_optim='adam', cnn_optim_alpha=0.8, cnn_optim_beta=0.999, cnn_weight_decay=0, cuda=True, data_path='data', dataset='coco', decode_noc=False, det_oracle=False, disp_interval=100, drop_prob_lm=0.5, fc_feat_size=2048, finetune_cnn=False, fixed_block=1, grad_clip=0.1, id='', image_crop_size=512, image_path='data/coco/images', image_size=576, inference_only=True, input_dic='data/robust_coco/dic_coco.json', input_encoding_size=512, input_json='data/robust_coco/cap_coco.json', language_eval=1, learning_rate=0.0005, learning_rate_decay_every=3, learning_rate_decay_rate=0.8, learning_rate_decay_start=1, load_best_score=1, losses_log_every=10, mGPUs=False, max_epochs=30, num_layers=1, num_workers=20, optim='adam', optim_alpha=0.9, optim_beta=0.999, optim_epsilon=1e-08, path_opt='cfgs/robust_coco.yml', proposal_h5='data/coco/coco_detection.h5', rnn_size=1024, rnn_type='lstm', scheduled_sampling_increase_every=5, scheduled_sampling_increase_prob=0.05, scheduled_sampling_max_prob=0.25, scheduled_sampling_start=-1, self_critical=False, seq_length=20, seq_per_img=5, start_from='save/robust_coco_nbt_1024', val_every_epoch=3, val_images_use=-1, val_split='test', weight_decay=0)
/usr/local/lib/python2.7/dist-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
DataLoader loading json file:  data/robust_coco/dic_coco.json
vocab size is  9488
DataLoader loading json file:  data/robust_coco/cap_coco.json
loading annotations into memory...
Done (t=18.72s)
creating index...
index created!
loading annotations into memory...
Done (t=10.12s)
creating index...
index created!
assigned 110234 images to split train
DataLoader loading json file:  data/robust_coco/dic_coco.json
vocab size is  9488
DataLoader loading json file:  data/robust_coco/cap_coco.json
loading annotations into memory...
Done (t=22.57s)
creating index...
index created!
loading annotations into memory...
Done (t=4.28s)
creating index...
index created!
assigned 9138 images to split test
Loading pretrained weights from data/imagenet_weights/resnet101.pth
Loading the model save/robust_coco_nbt_1024/model-best.pth...
Use adam as optmization method
/home/×××/NeuralBabyTalk/misc/model.py:520: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  conv_feats, fc_feats = self.cnn(Variable(img.data, volatile=True))
image 223648: a wooden table topped with a wooden table 
image 113588: a man sitting at a desk with a laptop 
image 173350: a dog and a toilet in a room 
image 81922: a large jetliner flying over a city 
image 310391: a green truck parked in the grass near a forest 
image 462341: a clock tower with a sky background 
image 122851: a man riding a motorcycle with a bunch of banana 
image 452684: a glass of wine sitting on a table 
image 350341: a bowl of food on a table 
image 550529: a motorcycle is parked on a wooden shelf 
image 281533: a dog sitting on the floor watching tv 
image 291380: a man sitting in the back seat of a car 
image 560623: a view of a plane in a window 
image 522713: a bench sitting on top of a lush green field 
image 354533: a motorcycle is parked on a dirt field 
image 29913: a fire hydrant on the side of the street 
image 38029: a red truck with a red top is on a street 
image 17756: a boat that is sitting in the grass 
image 155885: a black and white photo of a harbor with many boating 
image 231408: a couple of cats are standing in the grass 
0
100
200
300
400
Total image to be evaluated 9138
loading annotations into memory...
Done (t=1.05s)
creating index...
index created!
using 3020/9138 predictions
Loading and preparing results...
DONE (t=0.06s)
creating index...
index created!
tokenization...
PTBTokenizer tokenized 193335 tokens at 479991.21 tokens per second.
PTBTokenizer tokenized 30305 tokens at 183404.06 tokens per second.
setting up scorers...
computing Bleu score...
{'reflen': 27837, 'guess': [27286, 24266, 21246, 18226], 'testlen': 27286, 'correct': [20678, 11036, 5281, 2558]}
ratio: 0.980206200381
Bleu_1: 0.743
Bleu_2: 0.575
Bleu_3: 0.432
Bleu_4: 0.325
computing METEOR score...
METEOR: 0.251
computing Rouge score...
ROUGE_L: 0.534
computing CIDEr score...
CIDEr: 0.958
computing SPICE score...
Parsing reference captions
Initiating Stanford parsing pipeline
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator tokenize
[main] INFO edu.stanford.nlp.pipeline.TokenizerAnnotator - TokenizerAnnotator: No tokenizer type provided. Defaulting to PTBTokenizer.
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ssplit
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator parse
[main] INFO edu.stanford.nlp.parser.common.ParserGrammar - Loading parser from serialized file edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz ... 
done [0.6 sec].
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator lemma
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ner
Loading classifier from edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz ... done [1.5 sec].
Loading classifier from edu/stanford/nlp/models/ner/english.muc.7class.distsim.crf.ser.gz ... done [0.7 sec].
Loading classifier from edu/stanford/nlp/models/ner/english.conll.4class.distsim.crf.ser.gz ... done [2.8 sec].
Threads( StanfordCoreNLP ) [44.74 seconds]
Threads( StanfordCoreNLP ) [19.930 seconds]
Parsing test captions
Threads( StanfordCoreNLP ) [6.653 seconds]
SPICE evaluation took: 1.457 min
SPICE: 0.185
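
The Bleu_1 to Bleu_4 values can be cross-checked against the statistics dictionary printed just before them. The sketch below applies the standard BLEU definition (geometric mean of n-gram precisions times a brevity penalty), assuming the scorer follows it without extra smoothing. `bleu_scores` is an illustrative name, not the scorer's API.

```python
import math


def bleu_scores(correct, guess, testlen, reflen):
    """Cumulative BLEU-1..N from matched and candidate n-gram counts."""
    # Brevity penalty: penalise output shorter than the references.
    if testlen >= reflen:
        bp = 1.0
    else:
        bp = math.exp(1.0 - float(reflen) / testlen)
    scores = []
    log_sum = 0.0
    for n, (c, g) in enumerate(zip(correct, guess), start=1):
        log_sum += math.log(float(c) / g)  # log of the n-gram precision
        scores.append(bp * math.exp(log_sum / n))
    return scores


# Statistics copied from the log above.
stats = {"reflen": 27837, "testlen": 27286,
         "guess": [27286, 24266, 21246, 18226],
         "correct": [20678, 11036, 5281, 2558]}
scores = bleu_scores(stats["correct"], stats["guess"],
                     stats["testlen"], stats["reflen"])
for n, score in enumerate(scores, start=1):
    print("Bleu_%d: %.3f" % (n, score))
```

The `ratio` line in the log is testlen divided by reflen (27286 / 27837). Under these assumptions the computed scores should land within about 0.001 of the logged values.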