BERT Experts from TF-Hub


This colab demonstrates how to:

  • Load BERT models from TensorFlow Hub that have been trained on different tasks, including MNLI, SQuAD, and PubMed
  • Use a matching preprocessing model to tokenize raw text and convert it to token ids
  • Generate the pooled and sequence outputs from the token input ids using the loaded model
  • Look at the semantic similarity of the pooled outputs of different sentences

We’ll load the BERT model from TF-Hub, tokenize our sentences using the matching preprocessing model from TF-Hub, and then feed the tokenized sentences into the model. To keep this colab fast and simple, we recommend running it on a GPU.
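If you're starting from a fresh environment, you'll need the packages imported below; in particular, tensorflow_text registers the custom TF ops that the preprocessing model needs at load time. A minimal setup sketch, assuming a Colab-style notebook where shell commands are prefixed with !:

# Install dependencies (the unpinned versions here are an assumption;
# match tensorflow-text to your installed TensorFlow version).
!pip install --quiet tensorflow tensorflow-hub tensorflow-text seaborn scikit-learn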

Semantic similarity:
Now let’s take a look at the pooled_output embeddings of our sentences and compare how similar they are across sentences.
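We’ll measure similarity as cosine similarity: the dot product of two embedding vectors divided by the product of their norms, which is 1.0 when the vectors point in the same direction. The sklearn helper used later computes this for all pairs at once; as a standalone sketch for a single pair:

import numpy as np

def cosine_similarity(u, v):
    # cos(u, v) = (u . v) / (|u| * |v|)
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))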

import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.metrics import pairwise

import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_text as text  # Imports TF ops for preprocessing.

BERT_MODEL = "https://tfhub.dev/google/experts/bert/wiki_books/2"  # @param {type: "string"} ["https://tfhub.dev/google/experts/bert/wiki_books/2", "https://tfhub.dev/google/experts/bert/wiki_books/mnli/2", "https://tfhub.dev/google/experts/bert/wiki_books/qnli/2", "https://tfhub.dev/google/experts/bert/wiki_books/qqp/2", "https://tfhub.dev/google/experts/bert/wiki_books/squad2/2", "https://tfhub.dev/google/experts/bert/wiki_books/sst2/2",  "https://tfhub.dev/google/experts/bert/pubmed/2", "https://tfhub.dev/google/experts/bert/pubmed/squad2/2"]
# Preprocessing must match the model, but all the above use the same.
PREPROCESS_MODEL = "https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3"

sentences = [
    "Here We Go Then, You And I is a 1999 album by Norwegian pop artist Morten Abel. It was Abel's second CD as a solo artist.",
    "The album went straight to number one on the Norwegian album chart, and sold to double platinum.",
    "Among the singles released from the album were the songs \"Be My Lover\" and \"Hard To Stay Awake\".",
    "Riccardo Zegna is an Italian jazz musician.",
    "Rajko Maksimović is a composer, writer, and music pedagogue.",
    "One of the most significant Serbian composers of our time, Maksimović has been and remains active in creating works for different ensembles.",
    "Ceylon spinach is a common name for several plants and may refer to: Basella alba Talinum fruticosum",
    "A solar eclipse occurs when the Moon passes between Earth and the Sun, thereby totally or partly obscuring the image of the Sun for a viewer on Earth.",
    "A partial solar eclipse occurs in the polar regions of the Earth when the center of the Moon's shadow misses the Earth.",
]

preprocess = hub.load(PREPROCESS_MODEL)  # Maps raw strings to BERT input tensors.
bert = hub.load(BERT_MODEL)  # The selected BERT expert.
inputs = preprocess(sentences)  # Dict: input_word_ids, input_mask, input_type_ids.
outputs = bert(inputs)  # Dict: pooled_output, sequence_output, encoder_outputs.

print("Sentences:")
print(sentences)

print("\nBERT inputs:")
print(inputs)

print("\nPooled embeddings:")
print(outputs["pooled_output"])

print("\nPer token embeddings:")
print(outputs["sequence_output"])


def plot_similarity(features, labels):
    """Plot a similarity matrix of the embeddings."""
    cos_sim = pairwise.cosine_similarity(features)
    sns.set(font_scale=1.2)
    cbar_kws = dict(use_gridspec=False, location="left")
    g = sns.heatmap(
        cos_sim, xticklabels=labels, yticklabels=labels,
        vmin=0, vmax=1, cmap="Blues", cbar_kws=cbar_kws)
    g.tick_params(labelright=True, labelleft=False)
    g.set_yticklabels(labels, rotation=0)
    g.set_title("Semantic Textual Similarity")
    # plt.savefig('results.png')
    plt.show()


plot_similarity(outputs["pooled_output"], sentences)
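The heatmap should show the expected block structure: sentences about the same topic (the album, the composers, the eclipses) score high against each other and lower elsewhere. If you'd rather read the numbers directly, here is a small sketch that reuses the pairwise import above to print the most similar pair of distinct sentences:

import numpy as np

cos_sim = pairwise.cosine_similarity(outputs["pooled_output"])
np.fill_diagonal(cos_sim, -1.0)  # Mask self-similarity (always 1.0) before argmax.
i, j = np.unravel_index(np.argmax(cos_sim), cos_sim.shape)
print(f"Most similar pair (cosine {cos_sim[i, j]:.3f}):")
print(" -", sentences[i])
print(" -", sentences[j])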
