BERT + BiLSTM-CRF for NER

Late on Friday afternoon I came across a project that uses a BERT language model as the input layer for NER; downstream it can be a CNN or simply a CRF layer, with BERT acting as a drop-in replacement for a word2vec model. Original repo: https://github.com/macanv/BERT-BiLSTM-CRF-NER. Note that it requires TensorFlow 1.9.

The overall logic is fairly simple. Don't be intimidated by how much code Google wrote: in essence, the BERT model replaces the word2vec part of the original network, and Google's pre-trained BERT weights are fine-tuned on the downstream task. Google's open-source code mostly uses the Estimator API, but you don't have to. The concrete pipeline: convert your raw data to TFRecord format, read it with the Dataset API (or however you like), run it through BERT to produce embeddings, load Google's pre-trained BERT checkpoint, and feed the resulting tensor into your own network, whether CNN or RNN. You never need a tf.nn.embedding_lookup step again. The biggest weakness of word2vec is ambiguous words: "苹果" (apple) can be a fruit, a company, or a movie, yet word2vec assigns it a single vector regardless of context, whereas BERT produces different vectors depending on the surrounding text. The short sketch below contrasts the two.
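
A minimal sketch of that contrast, assuming a local checkout of google-research/bert (for the modeling import) and the Chinese checkpoint linked later in this post; the vocabulary size 21128 comes from that release, and the placeholder shapes are illustrative:

import tensorflow as tf
from bert import modeling  # modeling.py from google-research/bert; adjust the import path to your checkout

input_ids = tf.placeholder(tf.int32, [None, 128])   # "苹果" gets the same id in every sentence
input_mask = tf.placeholder(tf.int32, [None, 128])

# word2vec style: one fixed vector per token id, context is ignored.
static_table = tf.get_variable("w2v", [21128, 128])
static_emb = tf.nn.embedding_lookup(static_table, input_ids)   # same id -> same vector

# BERT style: the vector for the same token id depends on the whole sentence.
bert_config = modeling.BertConfig.from_json_file("chinese_L-12_H-768_A-12/bert_config.json")
model = modeling.BertModel(config=bert_config, is_training=False,
                           input_ids=input_ids, input_mask=input_mask,
                           use_one_hot_embeddings=False)
contextual_emb = model.get_sequence_output()   # [batch, seq_len, 768], context-dependent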

With that in mind, here is my own bert+lstm+crf implementation for NER:

import tensorflow as tf
from tensorflow.contrib.layers.python.layers import initializers

from bert import modeling              # modeling.py from google-research/bert
from lstm_crf_layer import BLSTM_CRF   # BiLSTM-CRF layer from the repo linked above

FLAGS = tf.flags.FLAGS  # lstm_size, cell, num_layers, droupout_rate are defined in the repo's script


class BertLstmNer(object):
    def __init__(self, bert_config, is_training, input_ids, input_mask,
                 segment_ids, labels, num_labels, use_one_hot_embeddings, init_checkpoint):
        self.bert_config = bert_config
        self.is_training = is_training
        self.input_ids = input_ids
        self.input_mask = input_mask
        self.segment_ids = segment_ids
        self.labels = labels
        self.num_labels = num_labels
        self.use_one_hot_embeddings = use_one_hot_embeddings
        self.init_checkpoint = init_checkpoint

        # BERT replaces the usual embedding_lookup: build the full transformer graph.
        model = modeling.BertModel(
            config=self.bert_config,
            is_training=self.is_training,
            input_ids=self.input_ids,
            input_mask=self.input_mask,
            token_type_ids=self.segment_ids,
            use_one_hot_embeddings=self.use_one_hot_embeddings
        )

        # Initialize the BERT variables from Google's pre-trained checkpoint.
        tvars = tf.trainable_variables()
        assignment_map, _ = modeling.get_assignment_map_from_checkpoint(tvars, self.init_checkpoint)
        tf.train.init_from_checkpoint(self.init_checkpoint, assignment_map)

        # Contextual embedding for every token: [batch_size, seq_length, embedding_size]
        embedding = model.get_sequence_output()
        max_seq_length = embedding.shape[1].value

        # [batch_size] vector holding the true (unpadded) length of each sequence in the batch
        used = tf.sign(tf.abs(input_ids))
        lengths = tf.reduce_sum(used, axis=1)

        # BiLSTM + CRF head on top of the BERT output ('droupout_rate' is the repo's own spelling)
        blstm_crf = BLSTM_CRF(embedded_chars=embedding, hidden_unit=FLAGS.lstm_size, cell_type=FLAGS.cell,
                              num_layers=FLAGS.num_layers,
                              droupout_rate=FLAGS.droupout_rate, initializers=initializers, num_labels=num_labels,
                              seq_length=max_seq_length, labels=labels, lengths=lengths, is_training=is_training)

        (self.total_loss, logits, trans, self.pred_ids) = blstm_crf.add_blstm_crf_layer()

        with tf.name_scope("train_op"):
            self.train_op = tf.train.AdamOptimizer().minimize(self.total_loss)
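
For completeness, a minimal sketch of wiring the class up outside the Estimator API; the placeholder names, bert_path, and the batch_* arrays (which would come from your TFRecord reader) are hypothetical, and num_labels=11 matches the label set visible in the training log below:

bert_config = modeling.BertConfig.from_json_file(bert_path + 'bert_config.json')

input_ids = tf.placeholder(tf.int32, [None, 128], name='input_ids')
input_mask = tf.placeholder(tf.int32, [None, 128], name='input_mask')
segment_ids = tf.placeholder(tf.int32, [None, 128], name='segment_ids')
labels = tf.placeholder(tf.int32, [None, 128], name='labels')

ner = BertLstmNer(bert_config, is_training=True,
                  input_ids=input_ids, input_mask=input_mask,
                  segment_ids=segment_ids, labels=labels,
                  num_labels=11, use_one_hot_embeddings=False,
                  init_checkpoint=bert_path + 'bert_model.ckpt')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    _, loss = sess.run([ner.train_op, ner.total_loss],
                       feed_dict={input_ids: batch_ids, input_mask: batch_mask,
                                  segment_ids: batch_segs, labels: batch_labels})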

If you want to run the code from the GitHub repo linked above, you need at least TensorFlow 1.9, because it depends on some TPU interfaces and uses the high-level Estimator API. A quick sanity check:

In [1]: import tensorflow  as tf

In [2]: tf.__version__
Out[2]: '1.9.0'

Lower versions will fail with the following error:

AttributeError: module 'tensorflow.contrib.tpu' has no attribute 'InputPipelineConfig'
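
To fail fast with a readable message instead of hitting that AttributeError mid-run, you can add a small guard at the top of the script; a minimal sketch:

import tensorflow as tf
from distutils.version import LooseVersion

# Abort early if the installed TensorFlow predates the TPU/Estimator interfaces the repo uses.
if LooseVersion(tf.__version__) < LooseVersion('1.9.0'):
    raise RuntimeError('TensorFlow >= 1.9 is required, found %s' % tf.__version__)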

Before running, download Google's open-source Chinese language model:

https://storage.googleapis.com/bert_models/2018_11_03/chinese_L-12_H-768_A-12.zip
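
If you prefer to script the download, a minimal sketch (wget plus unzip works just as well; the data/ target directory matches the paths used below):

import os
import zipfile
from six.moves.urllib.request import urlretrieve

url = 'https://storage.googleapis.com/bert_models/2018_11_03/chinese_L-12_H-768_A-12.zip'
fname = 'chinese_L-12_H-768_A-12.zip'
if not os.path.exists(fname):
    urlretrieve(url, fname)      # download once
with zipfile.ZipFile(fname) as zf:
    zf.extractall('data/')       # yields data/chinese_L-12_H-768_A-12/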

Modify the paths in the code to point at your local copies:

else:
    bert_path = '/Users/zhoumeixu/Documents/python/BERT-BiLSTM-CRF-NER/data/chinese_L-12_H-768_A-12/'
    root_path = '/Users/zhoumeixu/Documents/python/BERT-BiLSTM-CRF-NER'

Run the training command:

python bert_lstm_ner.py --task_name="NER" --do_train=True --do_eval=True --do_predict=True --data_dir=NERdata --max_seq_length=128 --train_batch_size=32 --learning_rate=2e-5 --num_train_epochs=3.0 --output_dir=./output/result_dir/

At startup the script first converts the raw data files into TFRecord files, so create the output/result_dir directory beforehand; the sketch below covers both.
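
Creating the directory, and for reference, roughly what each serialized training example contains. The feature names (input_ids, input_mask, segment_ids, label_ids) follow the BERT convention; the repo's own converter handles tokenization and label mapping, so this is only an illustrative sketch:

import collections
import tensorflow as tf

tf.gfile.MakeDirs('./output/result_dir')    # idempotent; avoids the missing-directory error

def to_tf_example(input_ids, input_mask, segment_ids, label_ids):
    # Pack one padded sequence and its per-token labels into BERT-style int64 features.
    def feat(values):
        return tf.train.Feature(int64_list=tf.train.Int64List(value=list(values)))
    features = collections.OrderedDict([
        ('input_ids', feat(input_ids)),
        ('input_mask', feat(input_mask)),
        ('segment_ids', feat(segment_ids)),
        ('label_ids', feat(label_ids)),
    ])
    return tf.train.Example(features=tf.train.Features(feature=features))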

What the run looks like:

INFO:tensorflow:  name = bert/encoder/layer_11/intermediate/dense/kernel:0, shape = (768, 3072), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_11/intermediate/dense/bias:0, shape = (3072,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_11/output/dense/kernel:0, shape = (3072, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_11/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_11/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/encoder/layer_11/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/pooler/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/pooler/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = rnn_layer/bidirectional_rnn/fw/basic_lstm_cell/kernel:0, shape = (896, 512)
INFO:tensorflow:  name = rnn_layer/bidirectional_rnn/fw/basic_lstm_cell/bias:0, shape = (512,)
INFO:tensorflow:  name = rnn_layer/bidirectional_rnn/bw/basic_lstm_cell/kernel:0, shape = (896, 512)
INFO:tensorflow:  name = rnn_layer/bidirectional_rnn/bw/basic_lstm_cell/bias:0, shape = (512,)
INFO:tensorflow:  name = project/hidden/W:0, shape = (256, 128)
INFO:tensorflow:  name = project/hidden/b:0, shape = (128,)
INFO:tensorflow:  name = project/logits/W:0, shape = (128, 11)
INFO:tensorflow:  name = project/logits/b:0, shape = (11,)
INFO:tensorflow:  name = crf_loss/transitions:0, shape = (11, 11)
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 0 into ./output/result_dir/model.ckpt.

Finally, I hope everyone gets this running smoothly and pushes their model accuracy up another level.
