Getting llama3-8B Running Quickly on the MLU370-M8

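This article describes how to deploy the transformers and accelerate libraries in a specific platform environment, download the Meta-Llama-3-8B-Instruct model, and run it on an MLU, and it shows how the model is used and what it produces.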

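Note (half in jest): downloading the model took 4 minutes 30 seconds, and getting it running took only 30 seconds. Simple, isn't it? Easy to use, isn't it?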


I. Platform Environment Preparation

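PyTorch tutorials are currently available for three versions: 1.9, 1.13.1, and 2.1. If you use a different version, you can still follow this article and adapt the steps as needed.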

Image used: pytorch:v1.17_torch1.13.1_ubuntu20.04_py310

II. Environment Deployment

1.transformers

git clone -b v4.38.2 https://githubfast.com/huggingface/transformers.git
python /torch/src/catch/tools/torch_gpu2mlu/torch_gpu2mlu.py -i transformers/
pip install -e ./transformers_mlu/

2.accelerate

git clone -b v0.27.2 https://githubfast.com/huggingface/accelerate.git
python /torch/src/catch/tools/torch_gpu2mlu/torch_gpu2mlu.py -i accelerate/
pip install -e ./accelerate_mlu/
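Both subsections above follow the same three steps: clone a tagged release, run the GPU-to-MLU conversion script on it, and install the converted copy in editable mode. A minimal sketch that builds those commands for any repository is shown here. The helper name `build_commands` and the constant `GPU2MLU` are hypothetical, and the `<repo>_mlu/` output directory is taken from the install commands above rather than from the converter's documentation.

```python
import shlex

# Path to the conversion script, as used in the steps above.
GPU2MLU = "/torch/src/catch/tools/torch_gpu2mlu/torch_gpu2mlu.py"


def build_commands(repo, tag):
    """Return the three commands (as argv lists) for one Hugging Face repo.

    Hypothetical helper: it only mirrors the commands listed above.
    """
    url = f"https://githubfast.com/huggingface/{repo}.git"
    return [
        ["git", "clone", "-b", tag, url],            # clone the tagged release
        ["python", GPU2MLU, "-i", f"{repo}/"],       # convert the source tree for MLU
        ["pip", "install", "-e", f"./{repo}_mlu/"],  # install the converted copy
    ]


if __name__ == "__main__":
    for repo, tag in [("transformers", "v4.38.2"), ("accelerate", "v0.27.2")]:
        for argv in build_commands(repo, tag):
            # Print only; to execute, pass argv to subprocess.run(argv, check=True).
            print(shlex.join(argv))
```

Printing the commands first makes it easy to compare them with the ones above before running anything.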

III. Model Download

git clone https://www.modelscope.cn/LLM-Research/Meta-Llama-3-8B-Instruct.git
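One thing worth checking after the clone is whether the weight files were actually downloaded. This assumes the repository stores its large files through Git LFS, which the article does not state. If Git LFS is not installed, a clone can leave small text stubs in place of the weights. The sketch below looks for such stubs; `find_lfs_pointers` is a hypothetical helper name.

```python
from pathlib import Path

# First bytes of a Git LFS pointer file.
LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"


def find_lfs_pointers(model_dir):
    """Return relative paths of files that are still Git LFS pointer stubs."""
    stubs = []
    for path in sorted(Path(model_dir).rglob("*")):
        # Skip directories and anything inside the .git folder.
        if ".git" in path.parts or not path.is_file():
            continue
        with path.open("rb") as fh:
            if fh.read(len(LFS_HEADER)) == LFS_HEADER:
                stubs.append(str(path.relative_to(model_dir)))
    return stubs
```

An empty list from `find_lfs_pointers("Meta-Llama-3-8B-Instruct")` suggests the weights are present. If any files are listed, running `git lfs pull` inside the cloned directory should fetch them.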

IV. Run Directly

import transformers
import torch
# Change this to your own model path
model_id = "/workspace/volume/gpt/zhouguojun/llama3/Meta-Llama-3-8B-Instruct"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.float16},
    device="mlu",
)

messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]

prompt = pipeline.tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = pipeline(
    prompt,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
print(outputs[0]["generated_text"][len(prompt):])

Then just run it.
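To see what `apply_chat_template` contributes in the script above, here is a hand-written approximation of the Llama 3 Instruct prompt layout. It is only a sketch: the authoritative template ships with the model's tokenizer configuration, and `format_llama3_prompt` is a hypothetical helper, not part of transformers. It also shows why `<|eot_id|>` is listed as a terminator, since that token closes each turn, and why the script slices the output with `len(prompt)`.

```python
def format_llama3_prompt(messages, add_generation_prompt=True):
    """Approximate the Llama 3 Instruct chat layout as a plain string."""
    parts = ["<|begin_of_text|>"]
    for message in messages:
        parts.append(
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content'].strip()}<|eot_id|>"
        )
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]
prompt = format_llama3_prompt(messages)

# The pipeline returns prompt + completion in "generated_text",
# so slicing off len(prompt) characters leaves only the model's reply.
generated_text = prompt + "Arrrr, me hearty!"  # stand-in for a real completion
print(generated_text[len(prompt):])
```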

V. Results

Instruction: You are a pirate chatbot who always responds in pirate speak!
Question: Who are you?
Model answer: Arrrr, me hearty! Me name be Captain Chatbot, the scurviest pirate to ever sail the Seven Seas! Me be a chatbot, but don't ye worry, I be as cunning as a barnacle on a sunken ship! Me purpose be to swab the decks of yer queries and respond with answers as sharp as me trusty cutlass! So hoist the colors, me matey, and let's set sail fer a swashbucklin' good time!
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:54<00:00, 13.71s/it]
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
/workspace/volume/gpt/zhouguojun/3rd/transformers4.38.2/transformers_mlu/src/transformers/pipelines/base.py:1015: UserWarning:  MLU operators don't support 64-bit calculation. so the 64 bit data will be forcibly converted to 32-bit for calculation.  (Triggered internally at /torch/catch/torch_mlu/csrc/aten/utils/tensor_util.cpp:159.)
  return inputs.to(device)
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
[2024-4-19 10:27:6] [CNNL] [Warning]:[cnnlStridedSlice] is deprecated and will be removed in the future release, please use [cnnlStridedSlice_v2] instead.
[2024-4-19 10:27:6] [CNNL] [Warning]:[cnnlRandCreateGenerator_v2] will be deprecated.
Arrrr, shiver me timbers! Me name be Captain Chatbot, the scurviest pirate to ever sail the Seven Seas! Me be a swashbucklin' chatbot, ready to engage ye in a battle o' wits and words! So hoist the colors, me hearty, and let's set sail fer a treasure trove o' conversation!
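The reply in this log is worded differently from the answer quoted earlier, which is what one would expect when the script sets `do_sample=True`. As a rough illustration of what `temperature=0.6` and `top_p=0.9` do, here is a pure-Python sketch of temperature scaling followed by nucleus (top-p) filtering. It is not the transformers implementation, and the function names and toy logits are made up for the example.

```python
import math
import random


def top_p_filter(logits, temperature=0.6, top_p=0.9):
    """Return {token_index: probability} for the tokens kept by top-p filtering."""
    scaled = [x / temperature for x in logits]  # lower temperature sharpens the distribution
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]  # numerically stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep the most likely tokens until their cumulative probability reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}  # renormalise over the kept tokens


def sample_token(logits, rng, temperature=0.6, top_p=0.9):
    """Draw one token index from the filtered distribution."""
    nucleus = top_p_filter(logits, temperature, top_p)
    return rng.choices(list(nucleus), weights=list(nucleus.values()))[0]


toy_logits = [2.0, 1.0, 0.0, -5.0]  # made-up scores for a 4-token vocabulary
print(top_p_filter(toy_logits))
print(sample_token(toy_logits, random.Random()))
```

Because the draw is random, two runs with the same prompt can pick different tokens and drift into different sentences.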
+------------------------------------------------------------------------------+
| CNMON v5.10.22                                               Driver v5.10.22 |
+-------------------------------+----------------------+-----------------------+
| Card  VF  Name       Firmware |               Bus-Id | Util        Ecc-Error |
| Fan   Temp      Pwr:Usage/Cap |         Memory-Usage | Mode     Compute-Mode |
|===============================+======================+=======================|
| 0     /   MLU370-M8    v1.1.4 |         0000:49:00.0 | 77%                 0 |
|  0%   23C         79 W/ 300 W | 17524 MiB/ 42396 MiB | FULL          Default |
+-------------------------------+----------------------+-----------------------+

+------------------------------------------------------------------------------+
| Processes:                                                                   |
|  Card  MI  PID     Command Line                             MLU Memory Usage |
|==============================================================================|
|  0     /   2798    python                                          17123 MiB |
+------------------------------------------------------------------------------+
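The 17123 MiB reported for the python process can be sanity-checked with some quick arithmetic. The sketch below assumes roughly 8.03 billion parameters for the 8B model, a figure that does not come from this article, and 2 bytes per parameter for `torch.float16`.

```python
PARAMS = 8.03e9        # assumption: approximate parameter count of the 8B model
BYTES_PER_PARAM = 2    # torch.float16 uses 2 bytes per value
MIB = 1024 ** 2

weights_mib = PARAMS * BYTES_PER_PARAM / MIB
observed_mib = 17123   # MLU memory usage of the python process in the table above
overhead_mib = observed_mib - weights_mib

print(f"weights alone: {weights_mib:.0f} MiB")
print(f"everything else: {overhead_mib:.0f} MiB")
```

Under these assumptions the weights account for most of the reported usage. The remainder is presumably taken up by the runtime, activations, and the key-value cache used during generation.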