Two Ways to Implement Translation with transformers

clearlove100

Category: nlp. Tags: natural language processing, machine learning, neural networks

Article link: https://blog.csdn.net/clearlove100/article/details/120270211

1. Using pipeline

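(pipeline also accepts an optional device argument. It defaults to -1, which runs on the CPU; a non-negative value selects the GPU with that index.)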
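As a small illustration of the device convention described above, the sketch below picks a value for device: -1 when no GPU is usable, otherwise a GPU index. The helper names cuda_available and pick_device are hypothetical (they are not part of transformers), and the availability check assumes PyTorch is the backend.

```python
def cuda_available() -> bool:
    """Return True if PyTorch is installed and can see a CUDA GPU."""
    try:
        import torch
    except ImportError:
        return False
    return torch.cuda.is_available()


def pick_device(has_gpu: bool, gpu_index: int = 0) -> int:
    """Map GPU availability to the integer passed as pipeline(device=...)."""
    # -1 means CPU; any non-negative value is the index of the GPU to use.
    return gpu_index if has_gpu else -1


device = pick_device(cuda_available())
# The value could then be passed on, for example:
# pipeline("translation_zh_to_en", model=model, tokenizer=tokenizer, device=device)
```

On a machine without a GPU this resolves to -1, so the same script should run unchanged on CPU-only and GPU hosts.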

from transformers import (
    AutoTokenizer,
    AutoModelForSeq2SeqLM,
    pipeline,
)

text = "从时间上看，中国空间站的建造比国际空间站晚20多年。"

# Chinese -> English model, loaded from a local directory.
# (Use "Helsinki-NLP/opus-mt-zh-en" to download it from the Hugging Face Hub instead.)
tokenizer = AutoTokenizer.from_pretrained("./Helsinki-NLP/opus-mt-zh-en")
model = AutoModelForSeq2SeqLM.from_pretrained("./Helsinki-NLP/opus-mt-zh-en")

# English -> Chinese model, used to translate the result back.
tokenizer_back_translate = AutoTokenizer.from_pretrained("./Helsinki-NLP/opus-mt-en-zh")
model_back_translate = AutoModelForSeq2SeqLM.from_pretrained("./Helsinki-NLP/opus-mt-en-zh")

zh2en = pipeline("translation_zh_to_en", model=model, tokenizer=tokenizer)
en2zh = pipeline("translation_en_to_zh", model=model_back_translate, tokenizer=tokenizer_back_translate)

# text[:5] keeps only the first five characters ("从时间上看"); pass text to translate the whole sentence.
# Translate once and reuse the result for the back-translation.
translated = zh2en(text[:5])[0]['translation_text']
print("tran", translated)
print("tran_back", en2zh(translated, max_length=510)[0]['translation_text'])
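The round trip above (translate, then translate back) can be wrapped in a small helper. This is only a sketch: back_translate and the Translator alias are hypothetical names, and the helper assumes each translator is a callable that returns a list of dicts with a 'translation_text' key, as the pipeline calls above do.

```python
from typing import Callable, Dict, List

# A translator takes a string and returns [{"translation_text": ...}].
Translator = Callable[[str], List[Dict[str, str]]]


def back_translate(text: str, forward: Translator, backward: Translator) -> Dict[str, str]:
    """Translate text with `forward`, then translate that result back with `backward`."""
    translated = forward(text)[0]["translation_text"]
    restored = backward(translated)[0]["translation_text"]
    return {"source": text, "translated": translated, "back_translated": restored}
```

With the pipelines defined above, it would be called as back_translate(text, zh2en, en2zh). Taking the translators as arguments keeps the helper independent of any particular model pair.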

2. Step by step

# Reuses text, tokenizer, model, tokenizer_back_translate and model_back_translate from section 1.
# The tokenizer is called directly here; prepare_seq2seq_batch is deprecated in recent transformers releases.
batch = tokenizer([text], return_tensors='pt', padding=True, truncation=True, max_length=512)
# Generate the English translation and decode the token ids back to text
translation = model.generate(**batch)
result = tokenizer.batch_decode(translation, skip_special_tokens=True)
print("tran", result)
# Tokenize the English output for the back-translation model
batch_back_translate = tokenizer_back_translate(result, return_tensors='pt', padding=True, truncation=True, max_length=512)
# Generate the Chinese back-translation and decode it
translation_back_translate = model_back_translate.generate(**batch_back_translate)
result_back = tokenizer_back_translate.batch_decode(translation_back_translate, skip_special_tokens=True)
print("tran_back", result_back)
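The three steps used twice above (tokenize, generate, decode) can be folded into one function. The sketch below is an assumption-laden illustration rather than the author's code: translate_texts is a hypothetical name, and it calls the tokenizer directly instead of using prepare_seq2seq_batch.

```python
from typing import List


def translate_texts(texts: List[str], tokenizer, model, max_length: int = 512) -> List[str]:
    """Tokenize, generate, and decode a list of sentences in one call."""
    # Pad to the longest sentence and truncate anything beyond max_length tokens.
    batch = tokenizer(
        texts,
        return_tensors="pt",
        padding=True,
        truncation=True,
        max_length=max_length,
    )
    # Run the seq2seq model to produce output token ids.
    output_ids = model.generate(**batch)
    # Convert token ids back to strings, dropping padding and end-of-sequence tokens.
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

With the objects from section 1, a round trip would be translate_texts(translate_texts([text], tokenizer, model), tokenizer_back_translate, model_back_translate).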
