NLP: A Very Detailed Tutorial on BERT for English Reading-Comprehension QA with SQuAD 2.0

Environment

  • Linux
  • python 3.6
  • tensorflow 1.12.0

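File preparation

Data preparation

Create a $SQUAD_DIR folder and put the downloaded files in it. The commands below expect train-v2.0.json, dev-v2.0.json, and evaluate-v2.0.py to be there.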

Coding

In run_squad.py, inside the bert folder, comment out the following lines:

if (len(qa["answers"]) != 1) and (not is_impossible):
	raise ValueError(
		"For training, each question should have exactly 1 answer.")

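Before commenting the check out, it can help to see which questions would actually trip it. The following is a minimal sketch, not part of run_squad.py: the helper name find_offending_questions and the SAMPLE data are hypothetical, and it assumes the standard SQuAD JSON layout (data, paragraphs, qas, answers, plus the is_impossible field in version 2.0).

```python
import json


def find_offending_questions(squad, version_2_with_negative=True):
    """Return the ids of questions that would fail the 'exactly 1 answer' check.

    `squad` is a parsed SQuAD-style dict. Illustrative helper only.
    """
    offending = []
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            for qa in paragraph["qas"]:
                # Assumption: like run_squad.py appears to do, only honour
                # is_impossible when running in SQuAD 2.0 mode.
                is_impossible = (
                    version_2_with_negative and qa.get("is_impossible", False)
                )
                if len(qa["answers"]) != 1 and not is_impossible:
                    offending.append(qa["id"])
    return offending


# Tiny made-up example; for real data, json.load() train-v2.0.json instead.
SAMPLE = {
    "data": [{
        "paragraphs": [{
            "context": "Paris is the capital of France.",
            "qas": [
                {"id": "q1", "question": "Capital of France?",
                 "is_impossible": False,
                 "answers": [{"text": "Paris", "answer_start": 0}]},
                {"id": "q2", "question": "Capital of Mars?",
                 "is_impossible": True, "answers": []},
                {"id": "q3", "question": "Which country?",
                 "is_impossible": False,
                 "answers": [{"text": "France", "answer_start": 24},
                             {"text": "France.", "answer_start": 24}]},
            ],
        }],
    }],
}

if __name__ == "__main__":
    print(find_offending_questions(SAMPLE))
```

If this reports many questions for your training file, it is worth checking whether --version_2_with_negative=True is actually set before removing the check.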
Writing the run script

On a GPU server, you can run BERT_BASE as follows.
Create a new script file named "run.sh" with the following content:

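export SQUAD_DIR=/path/to/your/SQUAD_DIR        # the folder you created
export BERT_BASE_DIR=/path/to/pretrained/model  # where the pre-trained model is stored
export THRESH=-1.0                              # null-answer threshold used at the end of the command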

python run_squad.py \
  --vocab_file=$BERT_BASE_DIR/vocab.txt \
  --bert_config_file=$BERT_BASE_DIR/bert_config.json \
  --init_checkpoint=$BERT_BASE_DIR/bert_model.ckpt \
  --do_train=True \
  --train_file=$SQUAD_DIR/train-v2.0.json \
  --do_predict=True \
  --predict_file=$SQUAD_DIR/dev-v2.0.json \
  --train_batch_size=12 \
  --learning_rate=3e-5 \
  --num_train_epochs=2.0 \
  --max_seq_length=328 \
  --doc_stride=128 \
  --output_dir=/tmp/squad_base \
  --version_2_with_negative=True \
  --null_score_diff_threshold=$THRESH


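  • $THRESH must be set before running; use a value between -1 and -5.
  • The script above sets max_seq_length=328; the default in Google's run_squad.py is 384. Because my training texts are fairly long, I used 512 for my own run.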
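Raising max_seq_length is not free, because the self-attention score tensors grow with the square of the sequence length. The sketch below is a rough back-of-the-envelope calculation, assuming float32 values and the [batch, heads, seq, seq] shape that shows up in the out-of-memory error further down ([12,12,512,512]); the function name attention_scores_bytes is made up for this illustration.

```python
def attention_scores_bytes(batch_size, num_heads, seq_len, bytes_per_value=4):
    """Size in bytes of one [batch, heads, seq, seq] attention-score tensor."""
    return batch_size * num_heads * seq_len * seq_len * bytes_per_value


if __name__ == "__main__":
    # batch_size=12 matches --train_batch_size=12; BERT-Base has 12 heads.
    for seq_len in (328, 512):
        size = attention_scores_bytes(batch_size=12, num_heads=12, seq_len=seq_len)
        print(f"max_seq_length={seq_len}: {size / 2**20:.1f} MiB per tensor")
```

At max_seq_length=512 this is 144 MiB for a single tensor in a single layer. The real footprint is much larger, since every layer keeps several such activations for the backward pass, so treat the number only as a way to compare settings.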

Running the script

chmod +x run.sh
./run.sh
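  • chmod +x gives the file execute permission.
  • The run may take a while, depending on your hardware. When it finishes, you will see the following result.
    [image]
  • The generated results are stored under /tmp/squad_base/:
    [image]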

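Tuning / prediction

  • Workflow:
  1. Create a /squad/ folder inside the bert folder, and copy predictions.json and null_odds.json from /tmp/squad_base/ into /squad/.
  2. Use the following command to evaluate the dev-set predictions and tune $THRESH: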
    python $SQUAD_DIR/evaluate-v2.0.py $SQUAD_DIR/dev-v2.0.json ./squad/predictions.json --na-prob-file ./squad/null_odds.json
    
  • Result (THRESH=-1):
    [image]
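To make the role of $THRESH more concrete: as far as I understand run_squad.py, each question gets a score difference between the null answer and the best non-null span (this is what null_odds.json stores), and the empty answer is predicted when that difference exceeds null_score_diff_threshold. evaluate-v2.0.py then searches for the threshold that gives the best score. The sketch below is a simplified illustration of that idea under those assumptions, not the evaluation script itself: apply_threshold and sweep_threshold are hypothetical names, and the sweep counts exact hits instead of computing F1.

```python
def apply_threshold(best_text, score_diff, threshold):
    """Return "" (no answer) when the null answer wins by more than threshold."""
    return "" if score_diff > threshold else best_text


def sweep_threshold(score_diffs, has_answer, is_correct):
    """Find the threshold that handles the most questions correctly.

    score_diffs: qid -> null-vs-best-span score difference
    has_answer:  qid -> True if the gold data contains an answer
    is_correct:  qid -> True if the best non-null span matches the gold answer
    Returns (number of correct questions, threshold); the threshold is None
    when answering nothing at all is already the best option.
    """
    # Baseline: answer nothing, so every unanswerable question is correct.
    current = sum(1 for qid in has_answer if not has_answer[qid])
    best_count, best_thresh = current, None
    # Admit one more question at a time, most answer-like (lowest diff) first.
    for qid in sorted(score_diffs, key=score_diffs.get):
        if has_answer[qid]:
            current += 1 if is_correct[qid] else 0
        else:
            current -= 1  # answering an unanswerable question loses its point
        if current > best_count:
            best_count, best_thresh = current, score_diffs[qid]
    return best_count, best_thresh


if __name__ == "__main__":
    diffs = {"a": -5.0, "b": -2.0, "c": 3.0}
    gold = {"a": True, "b": True, "c": False}
    hits = {"a": True, "b": True, "c": False}
    print(sweep_threshold(diffs, gold, hits))
```

In practice you would read the suggested threshold from the evaluation script's output and pass it back to run_squad.py as --null_score_diff_threshold.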

Pitfalls I ran into

  1. Problem: running run.sh fails with: ValueError: For training, each question should have exactly 1 answer.

    (tensorflow) [isi@localhost bert_squad]$ ./run.sh
    Traceback (most recent call last):
      File "run_squad.py", line 1282, in <module>
        tf.app.run()
      File "/u01/isi/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
        _sys.exit(main(argv))
      File "run_squad.py", line 1158, in main
        input_file=FLAGS.train_file, is_training=True)
      File "run_squad.py", line 267, in read_squad_examples
        "For training, each question should have exactly 1 answer.")
    ValueError: For training, each question should have exactly 1 answer.
    

    Solution:
    Open run_squad.py, find lines 265-267, and comment out the following code.

    	#          if (len(qa["answers"]) != 1) and (not is_impossible):
    	#            raise ValueError(
    	#                "For training, each question should have exactly 1 answer.")
    
  2. ResourceExhaustedError: the GPU ran out of memory

    
    ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[12,12,512,512] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
             [[node bert/encoder/layer_9/attention/self/Softmax (defined at /u01/isi/jingyiwang/bert_squad/modeling.py:720)  = Softmax[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](bert/encoder/layer_9/attention/self/add)]]
    Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
    
             [[{{node truediv/_4029}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_3857_truediv", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
    Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
    

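    Solution:
    I hit a variety of errors caused by memory exhaustion during the run; it finally succeeded once I made 4 GPUs available. To select GPUs, just add the following line at the top of run.sh (the numbers are GPU IDs). The BERT README also names max_seq_length and train_batch_size as the main factors in memory usage, so lowering either one is another option.

    export CUDA_VISIBLE_DEVICES=1,2,3,4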
    
  3. When running python $SQUAD_DIR/evaluate-v2.0.py $SQUAD_DIR/dev-v2.0.json ./squad/predictions.json --na-prob-file ./squad/null_odds.json to tune the threshold between null and non-null answers, I got the following error:

    Traceback (most recent call last):
      File "squad_dir/evaluate-v2.0.py", line 276, in <module>
        main()
      File "squad_dir/evaluate-v2.0.py", line 236, in main
        preds = json.load(f)
      File "/anaconda3/envs/tensorflow/lib/python3.6/json/__init__.py", line 299, in load
        parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
      File "/anaconda3/envs/tensorflow/lib/python3.6/json/__init__.py", line 354, in loads
        return _default_decoder.decode(s)
      File "/anaconda3/envs/tensorflow/lib/python3.6/json/decoder.py", line 339, in decode
        obj, end = self.raw_decode(s, idx=_w(s, 0).end())
      File "/anaconda3/envs/tensorflow/lib/python3.6/json/decoder.py", line 357, in raw_decode
        raise JSONDecodeError("Expecting value", s, err.value) from None
    json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
    

    Cause:
    I forgot to delete an earlier, broken version of the predictions.json file.
    Solution:
    Put the predictions.json file produced by the successful run into the squad folder.
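A JSONDecodeError of the form "Expecting value: line 1 column 1 (char 0)" typically means the file is empty or does not start with valid JSON. A quick sanity check before running the evaluation can catch a stale file early. This is a minimal sketch; check_prediction_file is a hypothetical helper, and it only assumes that predictions.json should be a non-empty JSON object mapping question ids to answers.

```python
import json
import os


def check_prediction_file(path):
    """Return the parsed predictions, or raise ValueError with a readable reason."""
    if not os.path.isfile(path):
        raise ValueError(f"{path} does not exist")
    if os.path.getsize(path) == 0:
        raise ValueError(f"{path} is empty")
    with open(path, encoding="utf-8") as f:
        try:
            preds = json.load(f)
        except json.JSONDecodeError as err:
            raise ValueError(f"{path} is not valid JSON: {err}") from err
    if not isinstance(preds, dict) or not preds:
        raise ValueError(f"{path} should contain a non-empty JSON object")
    return preds


if __name__ == "__main__":
    # Paths follow the layout used in this post.
    for name in ("./squad/predictions.json", "./squad/null_odds.json"):
        try:
            print(name, "ok:", len(check_prediction_file(name)), "entries")
        except ValueError as err:
            print("problem:", err)
```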
