【大模型】return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg) TypeError: not a string

【大模型】return _sentencepiece.SentencePieceProcessor_LoadFromFileself, arg TypeError: not a string

错误信息

运行大模型 Qwen1.5-14B-Chat 出现如下错误:

Traceback (most recent call last):
  File "/abc/llm/./transformers_low_bit_pipeline.py", line 44, in <module>
    tokenizer = LlamaTokenizer.from_pretrained(model_path, trust_remote_code=True)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/abc/.conda/envs/llm/lib/python3.11/site-packages/transformers/tokenization_utils_base.py", line 2048, in from_pretrained
    return cls._from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^^
  File "/abc/.conda/envs/llm/lib/python3.11/site-packages/transformers/tokenization_utils_base.py", line 2287, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/abc/.conda/envs/llm/lib/python3.11/site-packages/transformers/models/llama/tokenization_llama.py", line 182, in __init__
    self.sp_model = self.get_spm_processor(kwargs.pop("from_slow", False))
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/abc/.conda/envs/llm/lib/python3.11/site-packages/transformers/models/llama/tokenization_llama.py", line 209, in get_spm_processor
    tokenizer.Load(self.vocab_file)
  File "/abc/.conda/envs/llm/lib/python3.11/site-packages/sentencepiece/__init__.py", line 961, in Load
    return self.LoadFromFile(model_file)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/abc/.conda/envs/llm/lib/python3.11/site-packages/sentencepiece/__init__.py", line 316, in LoadFromFile
    return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: not a string

重点信息

这行代码:

    tokenizer = LlamaTokenizer.from_pretrained(model_path, trust_remote_code=True)

报错:

  File "/abc/.conda/envs/llm/lib/python3.11/site-packages/sentencepiece/__init__.py", line 316, in LoadFromFile
    return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: not a string

解决方法

代码:

    from transformers import LlamaTokenizer, TextGenerationPipeline

    ...
    
    tokenizer = LlamaTokenizer.from_pretrained(model_path, trust_remote_code=True)
    

修改为:

    from transformers import LlamaTokenizer, TextGenerationPipeline, AutoTokenizer

    ...
    
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

验证

再次运行模型,正常!!

  • 3
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 打赏
    打赏
  • 0
    评论
python web_demo.py Explicitly passing a `revision` is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision. Traceback (most recent call last): File "/home/nano/THUDM/ChatGLM-6B/web_demo.py", line 5, in <module> tokenizer = AutoTokenizer.from_pretrained("/home/nano/THUDM/chatglm-6b", trust_remote_code=True) File "/home/nano/.local/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 679, in from_pretrained return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs) File "/home/nano/.local/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1804, in from_pretrained return cls._from_pretrained( File "/home/nano/.local/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1958, in _from_pretrained tokenizer = cls(*init_inputs, **init_kwargs) File "/home/nano/.cache/huggingface/modules/transformers_modules/chatglm-6b/tokenization_chatglm.py", line 221, in __init__ self.sp_tokenizer = SPTokenizer(vocab_file, num_image_tokens=num_image_tokens) File "/home/nano/.cache/huggingface/modules/transformers_modules/chatglm-6b/tokenization_chatglm.py", line 64, in __init__ self.text_tokenizer = TextTokenizer(vocab_file) File "/home/nano/.cache/huggingface/modules/transformers_modules/chatglm-6b/tokenization_chatglm.py", line 22, in __init__ self.sp.Load(model_path) File "/home/nano/.local/lib/python3.10/site-packages/sentencepiece/__init__.py", line 905, in Load return self.LoadFromFile(model_file) File "/home/nano/.local/lib/python3.10/site-packages/sentencepiece/__init__.py", line 310, in LoadFromFile return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg) RuntimeError: Internal: src/sentencepiece_processor.cc(1101) [model_proto->ParseFromArray(serialized.data(), serialized.size())]什么错误
07-22

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

szZack

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值