llama.cpp: LLaMA model inference in C on FreeBSD

LLaMA (nicknamed "alpaca" in Chinese communities) is a family of large language models from Meta AI. It performs strongly across many natural language processing tasks and is an excellent text generation model.

llama.cpp is a package that runs LLaMA inference in C/C++, and it supports FreeBSD, Linux, and many other platforms.

GitHub - ggerganov/llama.cpp: LLM inference in C/C++

Building and installing from source

Download the source

git clone https://github.com/ggerganov/llama.cpp

Build

mkdir build
cd build
cmake ..
cmake --build . --config Release

The build only takes about 10-20 minutes. Quite fast!

FreeBSD does not ship with sudo by default, and copying the freshly built binaries into /usr/bin as root risks interfering with the base system, so instead we handle installation by adding an environment variable.

Create a file named env.sh with the following content:

export PATH=/home/skywalk/github/llama.cpp/build/bin:$PATH

Run source env.sh before each session.

The reason for not putting this in .cshrc or .bashrc is to avoid affecting the whole system, since other models may be installed later.
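A slightly fuller env.sh sketch that also warns if the build directory is missing (the path is the clone location assumed above; adjust to yours):

```shell
# env.sh - prepend the llama.cpp build output to PATH for this shell only
LLAMA_BIN="$HOME/github/llama.cpp/build/bin"   # adjust to your clone location
export PATH="$LLAMA_BIN:$PATH"

# quick sanity check: warn if the directory does not exist yet
if [ ! -d "$LLAMA_BIN" ]; then
    echo "warning: $LLAMA_BIN does not exist" >&2
fi
```

Because the change lives only in the current shell, forgetting to source the file simply means the llama binaries are not found, nothing system-wide is touched.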

Download model files

Chinese LLaMA models

Official repo: GitHub - ymcui/Chinese-LLaMA-Alpaca-2 (Chinese LLaMA-2 & Alpaca-2 LLMs with 64K long-context models)

The model files can be downloaded from Baidu Netdisk or Google Drive.

Download links:

The following are full models that can be used directly after download, with no merging steps required. Recommended for users with plenty of bandwidth.

Model name            | Type              | Size    | Download                | GGUF
Chinese-LLaMA-2-13B   | base model        | 24.7 GB | [Baidu] [Google] [🤗HF] | [🤗HF]
Chinese-LLaMA-2-7B    | base model        | 12.9 GB | [Baidu] [Google] [🤗HF] | [🤗HF]
Chinese-LLaMA-2-1.3B  | base model        | 2.4 GB  | [Baidu] [Google] [🤗HF] | [🤗HF]
Chinese-Alpaca-2-13B  | instruction model | 24.7 GB | [Baidu] [Google] [🤗HF] | [🤗HF]
Chinese-Alpaca-2-7B   | instruction model | 12.9 GB | [Baidu] [Google] [🤗HF] | [🤗HF]
Chinese-Alpaca-2-1.3B | instruction model | 2.4 GB  | [Baidu] [Google] [🤗HF] | [🤗HF]

P.S. Hugging Face models can also be fetched from a mirror site: HF-Mirror, a Hugging Face mirror.
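A sketch of pulling a model through the mirror, assuming huggingface-cli is installed; the repo id hfl/chinese-alpaca-2-1.3b is an assumption, so verify the exact id on the project page. HF-Mirror itself documents the HF_ENDPOINT variable:

```shell
# Point Hugging Face tooling at the mirror (variable documented by hf-mirror.com)
export HF_ENDPOINT=https://hf-mirror.com

# Repo id below is assumed; check the project page for the real one
if command -v huggingface-cli >/dev/null 2>&1; then
    huggingface-cli download hfl/chinese-alpaca-2-1.3b --local-dir "$HOME/work/model/chinesellama"
fi
```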

Installing with pkg on FreeBSD

Install with the command pkg install llama-cpp:

pkg install llama-cpp 
Updating FreeBSD repository catalogue...
pkg: No SRV record found for the repo 'FreeBSD'
Fetching meta.conf:   0%
FreeBSD repository is up to date.
All repositories are up to date.
The following 1 package(s) will be affected (of 0 checked):

New packages to be INSTALLED:
	llama-cpp: 3285

Number of packages to be installed: 1

The process will require 22 MiB more space.
3 MiB to be downloaded.

Proceed with this action? [y/N]: y
[1/1] Fetching llama-cpp-3285.pkg: 100%    3 MiB   3.1MB/s    00:01    
Checking integrity... done (0 conflicting)
[1/1] Installing llama-cpp-3285...

After pkg finishes, the executables are placed in /usr/local/bin:

root@fbhost:/usr/local/bin #  ls llama*
llama-baby-llama		llama-llava-cli
llama-batched			llama-lookahead
llama-batched-bench		llama-lookup
llama-bench			llama-lookup-create
llama-bench-matmult		llama-lookup-merge
llama-cli			llama-lookup-stats
llama-convert-llama2c-to-ggml	llama-parallel
llama-cvector-generator		llama-passkey
llama-embedding			llama-perplexity
llama-eval-callback		llama-quantize
llama-export-lora		llama-quantize-stats
llama-finetune			llama-retrieval
llama-gbnf-validator		llama-save-load-state
llama-gguf			llama-server
llama-gguf-split		llama-simple
llama-gritlm			llama-speculative
llama-imatrix			llama-tokenize
llama-infill			llama-train-text-from-scratch

 

Testing the Chinese-Alpaca-2-1.3B model

This model is on the small side, which makes downloading it from Baidu Netdisk more convenient.

After downloading, the local files look like this:

ls -l ~/work/model/chinesellama/
total 4935424
-rw-r--r--  1 skywalk  skywalk      339595  3月 24 20:30 chinesellama.tar.gz
-rw-r--r--  1 skywalk  skywalk         671  3月 24 20:06 config.json
-rw-r--r--  1 skywalk  skywalk         170  3月 24 20:06 generation_config.json
-rw-r--r--  1 skywalk  skywalk  2525058738  3月 24 21:01 pytorch_model.bin
-rw-r--r--  1 skywalk  skywalk         435  3月 24 20:08 special_tokens_map.json
-rw-r--r--  1 skywalk  skywalk         766  3月 24 20:08 tokenizer_config.json
-rw-r--r--  1 skywalk  skywalk      844403  3月 24 20:08 tokenizer.model

Convert the model

python convert.py ~/work/model/chinesellama/

The conversion reports the output file: Wrote /home/skywalk/work/model/chinesellama/ggml-model-f16.gguf

Run inference (in the pkg build shown above, the equivalent binary is named llama-cli rather than main):

main -m ~/work/model/chinesellama/ggml-model-f16.gguf  -p "Building a website can be done in 10 simple steps:\nStep 1:" -n 400 -e

My laptop with 8 GB of RAM promptly crashed, as expected.
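One way to make an f16 model fit in 8 GB is to quantize it first. A sketch, assuming the llama-quantize tool from the build above (older source trees name it quantize) and the standard q4_0 quantization type:

```shell
# Hypothetical quantization step: shrink the f16 GGUF to 4-bit weights
F16="$HOME/work/model/chinesellama/ggml-model-f16.gguf"
Q4="$HOME/work/model/chinesellama/ggml-model-q4_0.gguf"

if command -v llama-quantize >/dev/null 2>&1; then
    llama-quantize "$F16" "$Q4" q4_0
fi
```

A q4_0 file is roughly a quarter the size of the f16 original, which should leave plenty of headroom for a 1.3B model on an 8 GB machine.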

Still, the whole pipeline works end to end!
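The pkg file listing above also includes llama-server. A sketch of serving the converted model over HTTP (llama-server listens on 127.0.0.1:8080 by default; the -c context-size value here is just an example):

```shell
# Sketch: serve the GGUF model via llama.cpp's built-in HTTP server
MODEL="$HOME/work/model/chinesellama/ggml-model-f16.gguf"

if command -v llama-server >/dev/null 2>&1; then
    llama-server -m "$MODEL" -c 2048   # -c sets the context size
fi
```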

With llama.cpp, you can run Chinese LLaMA models on FreeBSD. Fantastic!

Bonus: llama2.c

llama2.c runs LLaMA 2 model inference in about 700 lines of pure C! Crucially, it is cross-platform, and it works just as well on FreeBSD.

Project page: https://github.com/karpathy/llama2.c

Download the code:

git clone https://github.com/karpathy/llama2.c

Enter the project directory and download a model:

cd llama2.c
wget https://karpathy.ai/llama2c/model.bin -P out

This is the stories15M model.

Compile and run:

gcc -O3 -o run run.c -lm
./run out/model.bin

Sample English story output:

Once upon a time, there was a big bookcase in a little girl's room. The little girl, named Lucy, loved to read. She would sit on the chair and read all day. One day, Lucy saw a scary monster in her room. The monster had big teeth and big eyes. Lucy was scared, but she wanted to find out who was scary.
Lucy thought and thought. Then, she had an idea. She would change her clothes and draw a face on the monster with a big crayon. The monster thought it was a funny picture. Lucy went back to her room and started to draw on the bookcase.
As Lucy drew, the monster from the book came to life! It was a funny looking monster that looked at Lucy's drawings. Lucy was not scared anymore. She laughed and played with her new friend. The monster and Lucy were happy friends forever.
achieved tok/s: 44.499106

You can download larger models, such as stories110M.bin:

https://huggingface.co/karpathy/tinyllamas/resolve/main/stories110M.bin

The available models:

model | dim | n_layers | n_heads | n_kv_heads | max context length | parameters | val loss | download
260K  | 64  | 5        | 8       | 4          | 512                | 260K       | 1.297    | stories260K
OG    | 288 | 6        | 6       | 6          | 256                | 15M        | 1.072    | stories15M.bin
42M   | 512 | 8        | 8       | 8          | 1024               | 42M        | 0.847    | stories42M.bin
110M  | 768 | 12       | 12      | 12         | 1024               | 110M       | 0.760    | stories110M.bin

In theory it can run inference on any LLaMA model, but the author notes that since inference is done in float32, models larger than 7B are not recommended: at 4 bytes per parameter, the weights of a 7B model alone already take about 28 GB of RAM.

Summary

llama.cpp and llama2.c make AI inference possible on FreeBSD. Wonderful!

Troubleshooting

Running make directly in the repository root fails:

make: "/usr/home/skywalk/github/llama.cpp/Makefile" line 627: Unknown modifier " For CUDA versions < 11.7 a target CUDA architecture must be explicitly provided via CUDA_DOCKER_ARCH"
make: "/usr/home/skywalk/github/llama.cpp/Makefile" line 627: Invalid line type
make: "/usr/home/skywalk/github/llama.cpp/Makefile" line 628: Invalid line type
make: "/usr/home/skywalk/github/llama.cpp/Makefile" line 629: Invalid line type
make: "/usr/home/skywalk/github/llama.cpp/Makefile" line 630: Invalid line type
make: "/usr/home/skywalk/github/llama.cpp/Makefile" line 631: Invalid line type
make: "/usr/home/skywalk/github/llama.cpp/Makefile" line 632: Invalid line type
make: "/usr/home/skywalk/github/llama.cpp/Makefile" line 833: Invalid line type
make: "/usr/home/skywalk/github/llama.cpp/Makefile" line 836: Invalid line type
make: Fatal errors encountered -- cannot continue

Switch to cmake instead (these errors come from FreeBSD's base BSD make trying to parse the llama.cpp Makefile, which uses GNU make syntax; installing gmake would also work):

mkdir build
cd build
cmake ..
cmake --build . --config Release
