bert-embedding: how to get BERT-trained word vectors

There are many packaged BERT models available online that we can download to produce sentence or word vectors. These models have already been pre-trained on large-scale corpora, so how do we get the word vectors we need out of them?
The answer is today's protagonist: bert-embedding.

Installation

pip install bert-embedding

Installation itself is simple, but a few problems can come up. First, TensorFlow must already be in the environment; don't use too recent a version, or you may hit compatibility issues (the version I installed is 1.13.1). Second, the only numpy version compatible with bert-embedding is 1.14.6, which is installed automatically along with bert-embedding. If you install pandas afterwards, a recent pandas will upgrade numpy to a higher version during its own installation, and bert-embedding will then fail at runtime. A workaround is sketched below.
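If pandas has already dragged numpy up to an incompatible version, one way to recover is to pin the packages back down. The exact version constraints here are assumptions based on my environment (pandas releases before 0.25 still accept numpy 1.14.6); adjust them to whatever your other packages require:

pip install "pandas<0.25"     # an older pandas that does not force a numpy upgrade
pip install numpy==1.14.6     # pin numpy back to the version bert-embedding expects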

Usage

from bert_embedding import BertEmbedding

bert_abstract = """We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers.
Unlike recent language representation models, BERT is designed to pre-train deep bidirectional representations by jointly conditioning on both left and right context in all layers.
As a result, the pre-trained BERT representations can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks, such as question answering and language inference, without substantial task-specific architecture modifications.
BERT is conceptually simple and empirically powerful.
It obtains new state-of-the-art results on eleven natural language processing tasks, including pushing the GLUE benchmark to 80.4% (7.6% absolute improvement), MultiNLI accuracy to 86.7 (5.6% absolute improvement) and the SQuAD v1.1 question answering Test F1 to 93.2 (1.5% absolute improvement), outperforming human performance by 2.0%."""

# One sentence per line in the abstract above
sentences = bert_abstract.split('\n')

# Uses the default English model (bert_12_768_12, pre-trained on
# book_corpus_wiki_en_uncased)
bert_embedding = BertEmbedding()

# Embed all sentences; the result has one entry per sentence
result = bert_embedding(sentences)
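Each element of result corresponds to one sentence and is a (tokens, token_embeddings) pair, following the bert-embedding README. The snippet below is a minimal sketch of how to pull out the vectors; the 768 dimension assumes the default BERT-base model:

first_sentence = result[0]

tokens = first_sentence[0]            # list of tokens for the first sentence
token_embeddings = first_sentence[1]  # one numpy array per token

print(len(tokens))                    # number of tokens
print(token_embeddings[0].shape)      # (768,) with the default bert_12_768_12 model

If you want a different pre-trained model, BertEmbedding also takes model and dataset_name arguments, e.g. BertEmbedding(model='bert_24_1024_16', dataset_name='book_corpus_wiki_en_cased') for the cased BERT-large model.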