T5-related model usage
T5Tokenizer
- Loading from a pretrained model
from transformers import T5Tokenizer
tokenizer = T5Tokenizer.from_pretrained(model_params["MODEL"])
- encode
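# Encode a single source text as a batch of one.
# padding="max_length" replaces the deprecated pad_to_max_length=True.
source = self.tokenizer.batch_encode_plus(
    [source_text],
    max_length=self.source_len,
    truncation=True,
    padding="max_length",
    return_tensors="pt",
)
# Drop the batch dimension: [1, source_len] -> [source_len]
source_ids = source["input_ids"].squeeze()
source_mask = source["attention_mask"].squeeze()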
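To make the effect of truncation=True and padding="max_length" concrete, here is a minimal pure-Python sketch of what happens to one sequence of token ids. The helper pad_or_truncate and the example ids are hypothetical and not part of the transformers library. The pad id of 0 matches T5's usual pad token but is treated here as an assumption. The real tokenizer also takes care of the end-of-sequence token when truncating, which this sketch ignores.

```python
from typing import Dict, List

PAD_ID = 0  # assumption: T5's pad token id


def pad_or_truncate(
    token_ids: List[int], max_length: int, pad_id: int = PAD_ID
) -> Dict[str, List[int]]:
    """Hypothetical helper: force a sequence to exactly max_length ids."""
    # Like truncation=True: drop everything beyond max_length.
    ids = token_ids[:max_length]
    # Attention mask: 1 for real tokens, 0 for padding positions.
    mask = [1] * len(ids)
    # Like padding="max_length": right-pad with pad_id up to max_length.
    n_pad = max_length - len(ids)
    return {
        "input_ids": ids + [pad_id] * n_pad,
        "attention_mask": mask + [0] * n_pad,
    }


short = pad_or_truncate([37, 1712, 1], max_length=5)
long = pad_or_truncate([37, 1712, 19, 207, 5, 1], max_length=5)
print(short)
print(long)
```

Both results have exactly max_length entries, which is why the tensors from a fixed-length encode call can be stacked into batches without further padding.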
- decode
tokenizer.decode(g, skip_special_tokens=True,
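As a rough illustration of what skip_special_tokens=True does during decoding, the sketch below uses a toy vocabulary. VOCAB, SPECIAL_IDS and toy_decode are hypothetical names invented for this example. Ids 0 and 1 mimic T5's <pad> and </s> tokens. The real tokenizer works on SentencePiece pieces rather than whole words.

```python
from typing import Dict, List

# Hypothetical toy vocabulary; ids 0 and 1 mimic T5's <pad> and </s>.
VOCAB: Dict[int, str] = {0: "<pad>", 1: "</s>", 2: "hello", 3: "world"}
SPECIAL_IDS = {0, 1}


def toy_decode(ids: List[int], skip_special_tokens: bool = False) -> str:
    """Map ids back to text, optionally dropping special tokens first."""
    if skip_special_tokens:
        ids = [i for i in ids if i not in SPECIAL_IDS]
    return " ".join(VOCAB[i] for i in ids)


# Generated T5 sequences typically start with <pad> and end with </s>.
generated = [0, 2, 3, 1]
raw_text = toy_decode(generated)
clean_text = toy_decode(generated, skip_special_tokens=True)
print(raw_text)
print(clean_text)
```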