T5 Study Notes

桂花很香,旭很美

Category: NLP

Article link: https://blog.csdn.net/weixin_40959890/article/details/115766185

#https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html
#https://towardsdatascience.com/paraphrase-any-question-with-t5-text-to-text-transfer-transformer-pretrained-model-and-cbb9e35f1555

import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer
# pip install transformers==2.8.0

def set_seed(seed):
    """Seed PyTorch (CPU and all GPUs) so the sampled paraphrases are reproducible."""
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)

set_seed(42)

model = T5ForConditionalGeneration.from_pretrained('ramsrigouthamg/t5_paraphraser')
tokenizer = T5Tokenizer.from_pretrained('ramsrigouthamg/t5_paraphraser')

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("device", device)
model = model.to(device)

sentence = "So I worked as a research associate in the field of research in the pharmaceutical industry. My job was to compare the similarities and differences in the processes of production and registration of generic and new medicines. The study aimed to determine which drugs can be successfully sold. The educational experience has been very fruitful for me. This is due to the fact that I was able to apply the concept of economy to the use of analytical force in the course of my work. In general, the orientation of the elderly and self-employment courses helped me a lot to finish my work on time and effectively."
# sentence = "What are the ingredients required to bake a perfect cake?"
# sentence = "What is the best possible approach to learn aeronautical engineering?"
# sentence = "Do apples taste better than oranges in general?"


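# The "paraphrase: " prefix is the task prompt this checkpoint expects.
# The explicit " </s>" end-of-sequence token is needed with transformers 2.8.0;
# newer releases of the tokenizer append it automatically.
text = "paraphrase: " + sentence + " </s>"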


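max_len = 256  # maximum length, in tokens, of each generated paraphrase

# With pad_to_max_length=True and no max_length, the input is padded up to the model's maximum length.
encoding = tokenizer.encode_plus(text, pad_to_max_length=True, return_tensors="pt")
input_ids, attention_masks = encoding["input_ids"].to(device), encoding["attention_mask"].to(device)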


# Sample with top_k = 120 and top_p = 0.98, returning 10 candidate sequences.
# Despite the variable name, do_sample=True makes these sampled outputs, not beam-search results.
beam_outputs = model.generate(
    input_ids=input_ids, attention_mask=attention_masks,
    do_sample=True,
    max_length=max_len,
    top_k=120,
    top_p=0.98,
    early_stopping=True,
    num_return_sequences=10
)


print("\nOriginal Question ::")
print(sentence)
print("\n")
print("Paraphrased Questions :: ")
final_outputs = []
for beam_output in beam_outputs:
    sent = tokenizer.decode(beam_output, skip_special_tokens=True, clean_up_tokenization_spaces=True)
    # Keep a candidate only if it differs from the input (ignoring case) and has not been seen yet.
    if sent.lower() != sentence.lower() and sent not in final_outputs:
        final_outputs.append(sent)

for i, final_output in enumerate(final_outputs):
    print("{}: {}".format(i, final_output))
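
The script above passes top_k=120 and top_p=0.98 to model.generate. As a rough illustration of what these two settings do to a next-token distribution, here is a minimal pure-Python sketch. The function name top_k_top_p_filter and the toy vocabulary are made up for this example, and the code is a simplification, not the transformers implementation: it keeps the k most probable tokens, renormalises, and then keeps the smallest prefix whose cumulative probability reaches top_p.

```python
def top_k_top_p_filter(probs, top_k=0, top_p=1.0):
    """Return a renormalised {token: prob} dict after top-k, then top-p filtering.

    Hypothetical helper for illustration only; `probs` maps tokens to probabilities.
    """
    # Rank tokens from most to least probable.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)

    # Top-k: keep only the k most probable tokens (0 disables this filter).
    if top_k > 0:
        ranked = ranked[:top_k]

    # Renormalise so the cumulative sum below is over the surviving tokens.
    total = sum(p for _, p in ranked)
    ranked = [(token, p / total) for token, p in ranked]

    # Top-p (nucleus): keep the smallest prefix whose cumulative probability
    # reaches top_p, including the token that crosses the threshold.
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break

    # Renormalise once more so the kept tokens form a valid distribution.
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


# Toy next-token distribution (invented values).
toy = {"the": 0.5, "a": 0.3, "this": 0.15, "zebra": 0.05}
filtered = top_k_top_p_filter(toy, top_k=3, top_p=0.7)
print(filtered)
```

With loose settings such as those in the script (top_k=120, top_p=0.98), very little of the distribution is cut away, so the ten samples tend to be diverse; tighter values keep the samples closer to the most likely wording.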
