Reference: "Obtaining BERT word vectors with pytorch_transformers" — yecp1's blog, CSDN
1. Install PyTorch 1.1+
pip install torch
2. Install pytorch-transformers (the package has since been renamed to transformers, which is what pip installs today)
pip install transformers
3. Download a pretrained model in its TensorFlow version, e.g. bert-base-chinese
HIT Chinese BERT-wwm: GitHub - ymcui/Chinese-BERT-wwm: Pre-Training with Whole Word Masking for Chinese BERT (Chinese BERT-wwm series models)
Google BERT: GitHub - google-research/bert: TensorFlow code and pre-trained models for BERT
4. Convert the TensorFlow checkpoint to PyTorch format
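The original post shows this step only as a screenshot. A sketch of the conversion using the `transformers-cli convert` command; the directory name below is a placeholder for wherever you unpacked the TensorFlow checkpoint:

```shell
# Hypothetical path -- adjust to where the TF checkpoint was unpacked.
export BERT_BASE_DIR=./chinese_L-12_H-768_A-12

transformers-cli convert --model_type bert \
  --tf_checkpoint $BERT_BASE_DIR/bert_model.ckpt \
  --config $BERT_BASE_DIR/bert_config.json \
  --pytorch_dump_output $BERT_BASE_DIR/pytorch_model.bin
```

This reads the TF checkpoint plus its bert_config.json and writes a pytorch_model.bin that transformers can load.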
5. Rename
[Result after the conversion]
[Rename the directory and the files; the TF ckpt files can be deleted]
You now have a converted model ready to load.
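The screenshots above don't survive in text form, so here is a hedged sketch of the renaming: from_pretrained() expects config.json, pytorch_model.bin, and vocab.txt in one directory. The directory names are assumptions matching the path used in the usage code:

```shell
# Hypothetical layout: the converted model lives in ./chinese_L-12_H-768_A-12
cd chinese_L-12_H-768_A-12
mv bert_config.json config.json   # from_pretrained() looks for config.json
rm -f bert_model.ckpt.*           # TF checkpoint no longer needed
cd ..
mkdir -p transformer-bert
mv chinese_L-12_H-768_A-12 transformer-bert/bert-base-chinese
```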
6. Usage
import torch
from transformers import BertTokenizer, BertModel  # pytorch_transformers was renamed to transformers

tokenizer = BertTokenizer.from_pretrained('transformer-bert/bert-base-chinese')
model = BertModel.from_pretrained('transformer-bert/bert-base-chinese')

input_ids = torch.tensor(tokenizer.encode("自然语言处理")).unsqueeze(0)  # batch size 1
with torch.no_grad():  # inference only, no gradients needed
    outputs = model(input_ids)

sequence_output = outputs[0]  # last hidden states: one vector per token
pooled_output = outputs[1]    # [CLS] hidden state passed through a tanh pooler
print(sequence_output)
print(sequence_output.shape)  # character (token) vectors: (1, seq_len, 768)
print(pooled_output)
print(pooled_output.shape)    # sentence vector: (1, 768)
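The pooled_output above is BERT's [CLS] vector run through a tanh pooling layer; mean-pooling the token vectors is a common alternative sentence vector. A minimal sketch using a random stand-in tensor for sequence_output (the real shape depends on the input length):

```python
import torch

# Stand-in for sequence_output from the model: (batch, seq_len, hidden).
sequence_output = torch.randn(1, 8, 768)

# Average over the token dimension -> one fixed-size sentence vector.
sentence_vec = sequence_output.mean(dim=1)
print(sentence_vec.shape)  # torch.Size([1, 768])
```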