The attention mask is not set and cannot be inferred from input because pad token is same as eos token

The following warning appears when running inference with Llama 3:

The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results.

Llama 3's tokenizer uses the eos token as its pad token, so transformers cannot tell trailing padding apart from real eos tokens and asks for an explicit attention_mask. Here is my original code:

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "/home/base_models/llama3-70b-instruct"

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained(model_name, use_safetensors=True)
tokenizer = AutoTokenizer.from_pretrained(model_name)
prompt = "Given 17 categories: 'math, health, physics, business, biology, chemistry, computer science, economics, engineering, philosophy, other, history, geography, politics, psychology, culture, law', determine the following instruction pair belongs to which category: \
  just output the final category you choose with nothing else {\
    \"messages\": [{\"role\": \"user\", \"content\": \"Choose the best answer. Which of the following is a type of data visualization? Input: A) A graph\\nB) A heatmap\\nC) A soundtrack\"}, {\"role\": \"assistant\", \"content\": \"B) A heatmap\"}\
  }"
input_ids = tokenizer.encode(prompt, return_tensors='pt')  # tokenizer.encode returns only the input_ids
output_ids = model.generate(
    input_ids=input_ids,
    max_new_tokens=512,      # maximum number of new tokens to generate
    num_return_sequences=1,  # number of sequences to return
    no_repeat_ngram_size=2,  # block repeated n-grams of this size
    repetition_penalty=1.5,  # repetition penalty factor
    top_p=0.95,              # nucleus (top-p) sampling threshold
    temperature=0.7,         # controls randomness
    do_sample=True           # sample; False means greedy decoding
)
# Strip the prompt tokens so only the newly generated tokens are decoded
generated_ids = [out[len(inp):] for inp, out in zip(input_ids, output_ids)]
# Decode the generated text
output_text = tokenizer.decode(generated_ids[0], skip_special_tokens=True)
print(output_text)

The key is the line input_ids = tokenizer.encode(prompt, return_tensors='pt'). tokenizer.encode returns only the input_ids, whereas calling tokenizer(prompt, return_tensors='pt') returns both input_ids and attention_mask.
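You can verify the difference directly with the tokenizer loaded above:

enc = tokenizer(prompt, return_tensors='pt')
print(list(enc.keys()))   # ['input_ids', 'attention_mask']
ids = tokenizer.encode(prompt, return_tensors='pt')
print(type(ids))          # <class 'torch.Tensor'>: just the token IDs, no mask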

So change it to:

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "/home/base_models/llama3-70b-instruct"

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained(model_name, use_safetensors=True)
tokenizer = AutoTokenizer.from_pretrained(model_name)
prompt = "Given 17 categories: 'math, health, physics, business, biology, chemistry, computer science, economics, engineering, philosophy, other, history, geography, politics, psychology, culture, law', determine the following instruction pair belongs to which category: \
  just output the final category you choose with nothing else {\
    \"messages\": [{\"role\": \"user\", \"content\": \"Choose the best answer. Which of the following is a type of data visualization? Input: A) A graph\\nB) A heatmap\\nC) A soundtrack\"}, {\"role\": \"assistant\", \"content\": \"B) A heatmap\"}\
  }"
inputs = tokenizer(prompt, return_tensors='pt')  # returns both input_ids and attention_mask
output_ids = model.generate(
    **inputs,                # passes input_ids and attention_mask together
    max_new_tokens=512,      # maximum number of new tokens to generate
    num_return_sequences=1,  # number of sequences to return
    no_repeat_ngram_size=2,  # block repeated n-grams of this size
    repetition_penalty=1.5,  # repetition penalty factor
    top_p=0.95,              # nucleus (top-p) sampling threshold
    temperature=0.7,         # controls randomness
    do_sample=True           # sample; False means greedy decoding
)
# Strip the prompt tokens so only the newly generated tokens are decoded
generated_ids = [out[len(inp):] for inp, out in zip(inputs.input_ids, output_ids)]
# Decode the generated text
output_text = tokenizer.decode(generated_ids[0], skip_special_tokens=True)
print(output_text)

and the warning goes away.
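Alternatively, if you prefer to keep tokenizer.encode, you can build the attention mask yourself and pass it, along with an explicit pad_token_id, to generate. A minimal sketch, assuming a single unpadded prompt so every position is a real token:

attention_mask = torch.ones_like(input_ids)   # no padding, so attend to every position
output_ids = model.generate(
    input_ids=input_ids,
    attention_mask=attention_mask,
    pad_token_id=tokenizer.eos_token_id,      # an explicit pad id also silences the related pad-token warning
    max_new_tokens=512,
    do_sample=True,
    top_p=0.95,
    temperature=0.7,
)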
