Getting Started with Large Language Models: Writing a Hello World
Development environment: tested on a ModelScope GPU instance (36 hours of GPU time are free).
PS: you can skip conda and directly use the Python environment that ships with the instance.
- Install conda
1.1 Download the Miniconda installer: `wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh`
1.2 Run `bash Miniconda3-latest-Linux-x86_64.sh`. The installer asks for confirmation several times; press Enter or type yes. On success it prints a series of `no change /root/miniconda3/...` lines (conda, conda-env, activate, deactivate, conda.sh, conda.fish, Conda.psm1, conda-hook.ps1, conda.xsh, conda.csh) and finishes with "Thank you for installing Miniconda3!".
1.3 Set up the environment variables with `source ~/.bashrc`, then check that the install succeeded:
(base) root@dsw-637832-67644985ff-l2r6g:/mnt/workspace# conda --version
conda 24.7.1
1.4 Create a conda virtual environment: `conda create -n myenv python=3.10 -y`
1.5 Activate it: `conda activate myenv`
1.6 Confirm which environment is active: `conda env list`
1.7 List the installed packages: `pip list`
1.8 Add the Tsinghua mirror channel (Tsinghua Open Source Mirror, index of /anaconda/pkgs/main/): `conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/`
- With conda set up, install the dependencies
- Create a file named requirements.txt with the following contents:
```
transformers>=4.32.0,<4.38.0
accelerate
tiktoken
einops
transformers_stream_generator==0.0.4
scipy
vllm
```
- Run `pip install -r requirements.txt`, or install the packages one by one; a quick post-install check is sketched below.
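A minimal sanity check after installation, assuming the ModelScope GPU image provides PyTorch (if it doesn't, vllm pulls it in as a dependency):

```python
import torch
import transformers

# requirements.txt pins transformers to >=4.32.0,<4.38.0; verify what resolved
print(transformers.__version__)

# Confirm a GPU is visible before trying to load a 7B-parameter model
print(torch.cuda.is_available())
```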
- Download the model
modelscope download --model qwen/Qwen-7B-Chat
There are several ways to download the model.
This command downloads it to /home/admin/workspace/.cache/modelscope/hub/qwen/Qwen-7B-Chat; you can locate it with `find / -name Qwen-7B-Chat`.
Another way to download is via the ModelScope Python SDK, sketched below.
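A minimal sketch using the SDK's snapshot_download, assuming the modelscope package is available (it is preinstalled on ModelScope instances):

```python
from modelscope import snapshot_download

# Downloads the model (or reuses the cached copy) and returns the local path,
# so there is no need to hunt for it with `find`
model_dir = snapshot_download("qwen/Qwen-7B-Chat")
print(model_dir)
```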
Use `pip list` to see every package that has been installed.
4. Run `python` to enter the interactive interpreter and start writing the script:
```python
from modelscope import AutoModelForCausalLM, AutoTokenizer
from modelscope import GenerationConfig

# The path can point at the directory the model was downloaded to
tokenizer = AutoTokenizer.from_pretrained("./qwen/Qwen-7B-Chat", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("./qwen/Qwen-7B-Chat", device_map="auto", trust_remote_code=True).eval()

response, history = model.chat(tokenizer, "你好", history=None)  # "Hello"
print(response)

# You can keep asking questions; to have the model remember the conversation,
# pass history=history instead of history=None
response, history = model.chat(tokenizer, "你好", history=history)
print(response)

# history is not remembered indefinitely; past a certain limit, older turns are dropped
# "If I borrow 100,000 at 3% annual interest, compounded, what are principal plus interest after 3 years?"
response, history = model.chat(tokenizer, "如果借款10W,年利息3点,算复利,3年本息共计多少", history=history)
print(response)
```
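Since threading the returned history back into each call is what preserves context, a minimal multi-turn loop looks like this (a sketch; assumes tokenizer and model are loaded as above):

```python
# Minimal multi-turn chat loop; an empty input ends the session
history = None
while True:
    query = input("You: ")
    if not query:
        break
    response, history = model.chat(tokenizer, query, history=history)
    print("Qwen:", response)
```

For reference, the exact answer to the compound-interest question is 100000 × 1.03³ ≈ 109,272.70, which makes it easy to check the model's reply.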
Summary: the following steps are the generic, required core of running any large model:
```python
from modelscope import AutoModelForCausalLM, AutoTokenizer
from modelscope import GenerationConfig

# The path can point at the directory the model was downloaded to
tokenizer = AutoTokenizer.from_pretrained("./qwen/Qwen-7B-Chat", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("./qwen/Qwen-7B-Chat", device_map="auto", trust_remote_code=True).eval()
```
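A quick check that device_map="auto" actually placed the weights on the GPU (assumes the loading code above has run):

```python
# Expect cuda:0 on the GPU instance; "cpu" means the weights did not land on the GPU
print(next(model.parameters()).device)
```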
Inference
```python
from modelscope import AutoModelForCausalLM, AutoTokenizer
from modelscope import GenerationConfig

# The path can point at the directory the model was downloaded to
tokenizer = AutoTokenizer.from_pretrained("./qwen/Qwen-7B-Chat", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("./qwen/Qwen-7B-Chat", device_map="auto", trust_remote_code=True).eval()

device = "cuda"
prefix = "北京是中国的首都"  # "Beijing is the capital of China"

# Tokenize the prefix and move the input tensors to the GPU
model_inputs = tokenizer([prefix], return_tensors="pt").to(device)

# Continue the prefix; repetition_penalty > 1 discourages the model from looping
generated_ids = model.generate(
    model_inputs.input_ids,
    max_new_tokens=400,
    repetition_penalty=1.15
)

# Strip the prompt tokens so only the newly generated text remains
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
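GenerationConfig is imported in both snippets but never used. In Qwen's official quickstart it loads the model's recommended sampling settings; a minimal sketch, under the same path assumption as above:

```python
from modelscope import GenerationConfig

# Attach the model's shipped generation defaults (temperature, top_p, etc.)
# so that subsequent model.chat() / model.generate() calls pick them up
model.generation_config = GenerationConfig.from_pretrained("./qwen/Qwen-7B-Chat", trust_remote_code=True)
```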