Paper Notes: tBERT: Topic Models and BERT Joining Forces


I. Overview

[Figure: overview of the paper]

II. Paper Walkthrough

Abstract

How can topic models be combined with pretrained models?
The paper proposes a new architecture for pairwise semantic similarity detection.
It finds that topics help substantially with problems that require domain knowledge.

1. Introduction

Pretrained models have set a new state of the art across NLP.
Paraphrase detection has improved considerably, but semantic similarity detection remains a challenge, e.g. in community question answering, where the relation between question-answer pairs has to be measured; because the data is highly domain-specific, the task is still hard.
Topic models provide additional domain-specific semantic information for semantic similarity computation.

2. Datasets
3. tBERT
  • 3.1 Architecture
    BERT [CLS] features + topic model features
    Topic model experiments:
    LDA and GSDMM
    Topics are used at both the word and the document level
    Every token is fed into the topic model

In other words, every token of the two sentences is passed through the topic model and the resulting topic vectors are averaged; a minimal code sketch follows the figure below.
[Figure: tBERT model architecture]
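To make the feature combination concrete, here is a minimal sketch of the idea described above, assuming a Hugging Face BERT and a gensim LDA topic model. The toy corpus, the model names, and the tiny number of topics are illustrative placeholders only, not the authors' implementation.

```python
# Minimal sketch of the tBERT feature construction (not the authors' code):
# BERT [CLS] vector of the sentence pair + averaged per-token topic vectors.
import numpy as np
import torch
from transformers import BertTokenizer, BertModel
from gensim import corpora
from gensim.models import LdaModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased")

def cls_vector(sent1, sent2):
    """BERT [CLS] representation of the sentence pair."""
    enc = tokenizer(sent1, sent2, return_tensors="pt", truncation=True)
    with torch.no_grad():
        out = bert(**enc)
    return out.last_hidden_state[:, 0, :].squeeze(0).numpy()  # shape (768,)

def topic_vector(tokens, lda, dictionary):
    """Word-level topics: average the topic distribution of each token."""
    per_token = []
    for tok in tokens:
        vec = np.zeros(lda.num_topics)
        bow = dictionary.doc2bow([tok])
        for topic_id, p in lda.get_document_topics(bow, minimum_probability=0.0):
            vec[topic_id] = p
        per_token.append(vec)
    return np.mean(per_token, axis=0) if per_token else np.zeros(lda.num_topics)

# Toy LDA just to keep the sketch self-contained; in practice the topic model
# is trained on the task's own (domain-specific) corpus, with the number of
# topics and alpha chosen as described in section 3.2.
docs = [["how", "do", "i", "reset", "my", "password"],
        ["visa", "application", "processing", "time"]]
dictionary = corpora.Dictionary(docs)
lda = LdaModel([dictionary.doc2bow(d) for d in docs],
               num_topics=5, alpha=1.0, id2word=dictionary)

s1 = "how do i reset my password"
s2 = "i forgot my password what now"
feat = np.concatenate([
    cls_vector(s1, s2),
    topic_vector(s1.split(), lda, dictionary),
    topic_vector(s2.split(), lda, dictionary),
])
print(feat.shape)  # (768 + 2 * num_topics,) -> fed into a small classifier head
```

The concatenated vector then goes into a small feed-forward classifier that predicts the similarity label.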

  • 3.2 Topic model selection
    Number of topics: 70-90
    alpha: 1 or 10
    LDA: not well suited to short texts
    GSDMM: both word-based and document-based variants were tried
    The selection metric is F1 (a rough sketch of the selection loop follows the figure below).
[Figure: topic model selection results (F1)]
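As a rough illustration of how such a selection could be run, the sketch below enumerates the grid mentioned above (LDA vs. GSDMM, word- vs. document-level topics, 70-90 topics, alpha of 1 or 10) and keeps the configuration with the best dev-set F1. The helpers `train_topic_model` and `dev_f1` are hypothetical placeholders, not functions from the paper.

```python
# Hypothetical grid search over the topic-model choices listed in 3.2.
from itertools import product

def select_topic_model(train_topic_model, dev_f1):
    """train_topic_model(kind, level, k, alpha) and dev_f1(model) are
    placeholders supplied by the caller; the selection metric is F1."""
    best_cfg, best_score = None, -1.0
    for kind, level, k, alpha in product(
            ["lda", "gsdmm"],        # LDA vs. GSDMM
            ["word", "document"],    # word-level vs. document-level topics
            range(70, 91, 5),        # number of topics: 70-90
            [1.0, 10.0]):            # alpha: 1 or 10
        model = train_topic_model(kind, level, k, alpha)
        score = dev_f1(model)
        if score > best_score:
            best_cfg, best_score = (kind, level, k, alpha), score
    return best_cfg, best_score
```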

  • 3.3 Comparison with baselines
    [Figure: comparison with baselines]

The improvement seems modest overall; the gains are somewhat larger on the SemEval datasets.
It also converges faster.

[Figure: convergence comparison]

P.S. Domain knowledge also brings sizable gains in machine translation and named entity recognition.
Is this really enough for an ACL paper? It looks quite simple.

English notes

The task is to predict whether two questions are paraphrases.
Jensen-Shannon divergence (see the formula below)
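For reference, the Jensen-Shannon divergence is the standard symmetric measure for comparing two probability distributions (e.g. two topic distributions $P$ and $Q$):

$$
\mathrm{JSD}(P \parallel Q) = \tfrac{1}{2} D_{\mathrm{KL}}(P \parallel M) + \tfrac{1}{2} D_{\mathrm{KL}}(Q \parallel M), \qquad M = \tfrac{1}{2}(P + Q)
$$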
