Tip: Joking aside, downloading the model took 4 minutes 30 seconds, and getting it running took only another 30 seconds. Pretty simple and easy to use, right?
1. Platform Environment Preparation
The platform's PyTorch tutorials currently come in three flavors, for versions 1.9, 1.13.1, and 2.1; if you are running a different version, you can still follow along and adapt as needed.
Image: pytorch:v1.17_torch1.13.1_ubuntu20.04_py310
2. Environment Deployment
Both repositories below are ported with Cambricon's torch_gpu2mlu.py script, which rewrites CUDA-specific calls in the source tree into their MLU equivalents and writes the result to a *_mlu/ copy; that copy is then installed in editable mode.
2.1 transformers
git clone -b v4.38.2 https://githubfast.com/huggingface/transformers.git
python /torch/src/catch/tools/torch_gpu2mlu/torch_gpu2mlu.py -i transformers/
pip install -e ./transformers_mlu/
2.2 accelerate
git clone -b v0.27.2 https://githubfast.com/huggingface/accelerate.git
python /torch/src/catch/tools/torch_gpu2mlu/torch_gpu2mlu.py -i accelerate/
pip install -e ./accelerate_mlu/
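Before moving on, it can be worth a quick sanity check that the converted stack actually sees the MLU device. A minimal sketch, assuming the Cambricon torch_mlu plugin shipped with this image exposes torch.mlu.is_available() in the style of torch.cuda (the exact check may be named differently in older CATCH releases):
import torch
import torch_mlu  # Cambricon plugin; importing it registers the "mlu" device with PyTorch
# Assumption: this torch_mlu build mirrors the torch.cuda API here
print(torch.__version__)
print(torch.mlu.is_available())
# Run a tiny op on the device to confirm things work end to end
x = torch.ones(2, 2, device="mlu")
print((x + x).cpu())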
3. Model Download
git clone https://www.modelscope.cn/LLM-Research/Meta-Llama-3-8B-Instruct.git
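If git clone is slow or git-lfs is not set up, an alternative is the ModelScope Python SDK's snapshot_download helper (requires pip install modelscope first); a minimal sketch:
from modelscope import snapshot_download
# Downloads the weights into the local ModelScope cache and returns the directory;
# pass cache_dir=... if you want to control where they land
model_dir = snapshot_download("LLM-Research/Meta-Llama-3-8B-Instruct")
print(model_dir)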
4. Run It Directly
import transformers
import torch

# Change this to your own model path
model_id = "/workspace/volume/gpt/zhouguojun/llama3/Meta-Llama-3-8B-Instruct"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.float16},
    device="mlu",
)

messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]

# Render the chat messages into a single prompt string via the model's chat template
prompt = pipeline.tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

# Llama 3 marks the end of a turn with <|eot_id|>, so stop on either terminator
terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = pipeline(
    prompt,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)

# Print only the newly generated text, stripping the prompt prefix
print(outputs[0]["generated_text"][len(prompt):])
Just run it.
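If you would rather not go through the pipeline wrapper, the same run can be written against AutoTokenizer and AutoModelForCausalLM directly. A rough sketch under the same assumptions as above (the model path, the "mlu" device string registered by torch_mlu, and the MLU-ported transformers behaving like upstream 4.38):
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "/workspace/volume/gpt/zhouguojun/llama3/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16).to("mlu")

messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to("mlu")

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=[tokenizer.eos_token_id, tokenizer.convert_tokens_to_ids("<|eot_id|>")],
    pad_token_id=tokenizer.eos_token_id,  # silences the open-end generation warning seen in the log
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
# Decode only the tokens generated after the prompt
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))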
5. Results
System prompt: You are a pirate chatbot who always responds in pirate speak!
Question: Who are you?
Model response: Arrrr, me hearty! Me name be Captain Chatbot, the scurviest pirate to ever sail the Seven Seas! Me be a chatbot, but don't ye worry, I be as cunning as a barnacle on a sunken ship! Me purpose be to swab the decks of yer queries and respond with answers as sharp as me trusty cutlass! So hoist the colors, me matey, and let's set sail fer a swashbucklin' good time!
The full console output from the run:
Loading checkpoint shards: 100%|██████████| 4/4 [00:54<00:00, 13.71s/it]
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
/workspace/volume/gpt/zhouguojun/3rd/transformers4.38.2/transformers_mlu/src/transformers/pipelines/base.py:1015: UserWarning: MLU operators don't support 64-bit calculation. so the 64 bit data will be forcibly converted to 32-bit for calculation. (Triggered internally at /torch/catch/torch_mlu/csrc/aten/utils/tensor_util.cpp:159.)
return inputs.to(device)
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
[2024-4-19 10:27:6] [CNNL] [Warning]:[cnnlStridedSlice] is deprecated and will be removed in the future release, please use [cnnlStridedSlice_v2] instead.
[2024-4-19 10:27:6] [CNNL] [Warning]:[cnnlRandCreateGenerator_v2] will be deprecated.
Arrrr, shiver me timbers! Me name be Captain Chatbot, the scurviest pirate to ever sail the Seven Seas! Me be a swashbucklin' chatbot, ready to engage ye in a battle o' wits and words! So hoist the colors, me hearty, and let's set sail fer a treasure trove o' conversation!
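Device status during generation, as reported by cnmon (Cambricon's counterpart to nvidia-smi):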
+------------------------------------------------------------------------------+
| CNMON v5.10.22 Driver v5.10.22 |
+-------------------------------+----------------------+-----------------------+
| Card VF Name Firmware | Bus-Id | Util Ecc-Error |
| Fan Temp Pwr:Usage/Cap | Memory-Usage | Mode Compute-Mode |
|===============================+======================+=======================|
| 0 / MLU370-M8 v1.1.4 | 0000:49:00.0 | 77% 0 |
| 0% 23C 79 W/ 300 W | 17524 MiB/ 42396 MiB | FULL Default |
+-------------------------------+----------------------+-----------------------+
+------------------------------------------------------------------------------+
| Processes: |
| Card MI PID Command Line MLU Memory Usage |
|==============================================================================|
| 0 / 2798 python 17123 MiB |
+------------------------------------------------------------------------------+