ERNIE named entity recognition: evaluate raises an error while training with run_sequence_labeling.py

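Problem description:

        I downloaded the code, model, and data from the ERNIE Git repository and ran them on Windows, with the arguments written to a file. run_sequence_labeling.py starts training normally, but the evaluate step that runs during training raises the following error: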

[INFO] 2020-02-29 15:58:11,491 [run_sequence_labeling.py:  310]:	validation result of dataset ./data/task_data/msra_ner/dev_new.txt:
Traceback (most recent call last):
  File "E:\PyProject\ERNIE-release-r2.1.0\reader\task_reader.py", line 280, in f
    for i in wrapper():
  File "E:\PyProject\ERNIE-release-r2.1.0\reader\task_reader.py", line 271, in wrapper
    examples, batch_size, phase=phase):
  File "E:\PyProject\ERNIE-release-r2.1.0\reader\task_reader.py", line 232, in _prepare_batch_data
    to_append = len(batch_records) < batch_size
TypeError: unorderable types: int() < NoneType()
[INFO] 2020-02-29 15:58:11,518 [run_sequence_labeling.py:  314]:	[evaluation] f1: 0.000000, precision: 0.000000, recall: 0.000000, elapsed time: 0.025930 s, file: ./data/task_data/msra_ner/dev_new.txt, epoch: 0, steps: 200
[INFO] 2020-02-29 15:58:11,721 [run_sequence_labeling.py:  310]:	validation result of dataset ./data/task_data/msra_ner/test_new.txt:
Traceback (most recent call last):
  File "E:\PyProject\ERNIE-release-r2.1.0\reader\task_reader.py", line 280, in f
    for i in wrapper():
  File "E:\PyProject\ERNIE-release-r2.1.0\reader\task_reader.py", line 271, in wrapper
    examples, batch_size, phase=phase):
  File "E:\PyProject\ERNIE-release-r2.1.0\reader\task_reader.py", line 232, in _prepare_batch_data
    to_append = len(batch_records) < batch_size
TypeError: unorderable types: int() < NoneType()
[INFO] 2020-02-29 15:58:11,751 [run_sequence_labeling.py:  314]:	[evaluation] f1: 0.000000, precision: 0.000000, recall: 0.000000, elapsed time: 0.029920 s, file: ./data/task_data/msra_ner/test_new.txt, epoch: 0, steps: 200

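Troubleshooting:

        After some checking, it turned out that the evaluate_wrapper method loads data with batch_size set to args.predict_batch_size, but predict_batch_size was never set when the arguments were initialized. batch_size is therefore None, and the comparison len(batch_records) < batch_size in task_reader.py raises the TypeError.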

def evaluate_wrapper(reader, exe, test_prog, test_pyreader, graph_vars,
                     epoch, steps):
    # evaluate dev set
    for ds in args.dev_set.split(','): #single card eval
        test_pyreader.decorate_tensor_provider(
            reader.data_generator(
                ds,
                batch_size=args.predict_batch_size,
                epoch=1,
                dev_count=1,
                shuffle=False))
        log.info("validation result of dataset {}:".format(ds))
        info = evaluate(exe, test_prog, test_pyreader, graph_vars,
                 args.num_labels)
        log.info(info + ', file: {}, epoch: {}, steps: {}'.format(
            ds, epoch, steps))
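The failing comparison from the traceback can be reproduced in isolation. This is a minimal sketch, not ERNIE code: batch_records and batch_size are stand-ins for the variables in task_reader.py, and batch_size is set to None on the assumption that this is what an unset args.predict_batch_size resolves to.

```python
# Stand-ins for the variables in _prepare_batch_data (assumption: an unset
# predict_batch_size arrives here as None).
batch_records = []
batch_size = None

try:
    # Same comparison as line 232 of task_reader.py in the traceback.
    to_append = len(batch_records) < batch_size
except TypeError as err:
    # The exact wording of the message depends on the Python version.
    error_message = str(err)
    print("TypeError:", error_message)
```

Python 3 refuses to order an int against None, which is why evaluation stops before any batch is produced and the reported f1, precision, and recall are all 0.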

    Fix 1: change batch_size=args.predict_batch_size to batch_size=args.batch_size.

    Fix 2: add args.predict_batch_size = 8 when setting the arguments.
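The two fixes can also be combined into a fallback, so that an unset predict_batch_size reuses batch_size. The sketch below is only an illustration: resolve_predict_batch_size is a hypothetical helper that does not exist in ERNIE, and the Namespace object stands in for the real args.

```python
from argparse import Namespace


def resolve_predict_batch_size(args):
    """Hypothetical helper: return predict_batch_size, or batch_size if unset."""
    predict_batch_size = getattr(args, "predict_batch_size", None)
    if predict_batch_size is None:
        return args.batch_size
    return predict_batch_size


# Stand-in for the parsed arguments; the values are illustrative only.
args = Namespace(batch_size=16, predict_batch_size=None)
args.predict_batch_size = resolve_predict_batch_size(args)
print(args.predict_batch_size)
```

Calling the helper once, right after the arguments are set, would leave evaluate_wrapper unchanged while guaranteeing that it never receives None.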

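Problem solved.

Addendum:

    To run evaluate or test after the model has been trained, specify the path of the model to load in the arguments, i.e. args.init_checkpoint = "./checkpoints/step_7825". The path must point to the folder that holds the saved model data.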
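Since init_checkpoint has to point at a folder, a quick check before launching evaluation may catch a wrong path early. This is a sketch under assumptions: check_checkpoint_dir is a hypothetical helper, not part of ERNIE, and it only verifies that the directory exists, not that its contents are a valid checkpoint.

```python
import os


def check_checkpoint_dir(path):
    """Hypothetical helper: fail early if the checkpoint folder is missing."""
    if not os.path.isdir(path):
        raise FileNotFoundError(
            "init_checkpoint must be an existing directory: {}".format(path))
    return path


# Example use with the path from the text (commented out because the folder
# only exists after training has saved a checkpoint):
# args.init_checkpoint = check_checkpoint_dir("./checkpoints/step_7825")
```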
