llama.cpp Installation and Configuration Guide

杭聪帆Ambitious

Published 2024-09-13 21:39:33

Article link: https://blog.csdn.net/gitblog_09336/article/details/142222438

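llama.cpp is an open-source C/C++ library for running large language model (LLM) inference on a wide range of hardware with minimal setup and state-of-the-art performance. It supports several hardware-acceleration backends, including Apple Silicon, the AVX, AVX2 and AVX512 instruction sets on x86, and NVIDIA and AMD GPUs.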

The project is written mainly in C and C++.

First, clone the llama.cpp repository to your machine with Git:

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
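
The clone and build steps assume that a few command-line tools are already on PATH. The sketch below is one way to check for them up front. The helper name check_tools and the tool list (git, cmake, make, c++) are assumptions for illustration, not something the project prescribes; adjust the list to your platform.

```shell
#!/bin/sh
# Report which of the given tools are missing from PATH.
# Prints one "missing: <tool>" line per absent tool and returns
# non-zero if at least one tool is missing.
check_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            status=1
        fi
    done
    return "$status"
}

# Assumed prerequisites for this guide; the compiler may have a
# different name on your system (for example g++ or clang++).
check_tools git cmake make c++ || echo "install the missing tools before continuing"
```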

Configure and build the project with CMake:

mkdir build
cd build
cmake ..
make
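
As an alternative to the commands above, the same build can be driven entirely through CMake from the repository root, which also makes it easy to request a Release build and parallel jobs. This is a minimal sketch: the function names run and build_llama_cpp, the DRY_RUN switch and the default of 4 jobs are assumptions introduced here, while the cmake options themselves (-S, -B, --build, --config, -j) are standard CMake.

```shell
#!/bin/sh
# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Configure and build a llama.cpp checkout out of tree in Release mode.
build_llama_cpp() {
    src_dir=${1:-.}          # path to the llama.cpp checkout
    build_dir=${2:-build}    # directory that receives the build tree
    jobs=${3:-4}             # parallel build jobs (assumed default)
    run cmake -S "$src_dir" -B "$build_dir" -DCMAKE_BUILD_TYPE=Release &&
    run cmake --build "$build_dir" --config Release -j "$jobs"
}
```

Calling build_llama_cpp . build 8 from the repository root configures and builds with eight jobs; setting DRY_RUN=1 first prints the two cmake commands without running them, which is a cheap way to review them before a long build.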

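Once the build finishes, run the example program to verify that the installation works. The model path is only an example; point -m at a GGUF model file you have already downloaded: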

./bin/llama-cli -m ../models/llama-13b-v2/ggml-model-q4_0.gguf -p "Building a website can be done in 10 simple steps:\nStep 1:" -n 400 -e
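
A failed first run is often caused by a wrong binary path or a file that is not actually a GGUF model. The sketch below checks both before launching anything. It relies on the fact that GGUF files begin with the four ASCII bytes "GGUF"; the function names is_gguf and preflight are hypothetical helpers introduced for this example.

```shell
#!/bin/sh
# Return 0 if the file starts with the 4-byte GGUF magic.
is_gguf() {
    [ -r "$1" ] || return 1
    # Read the first 4 bytes and render them as characters.
    magic=$(dd if="$1" bs=4 count=1 2>/dev/null | od -An -c | tr -d ' \n')
    [ "$magic" = "GGUF" ]
}

# Check that the binary is executable and the model looks like GGUF.
# Prints "ok" on success, or a one-line reason on failure.
preflight() {
    bin=$1
    model=$2
    [ -x "$bin" ] || { echo "binary not found or not executable: $bin"; return 1; }
    is_gguf "$model" || { echo "not a readable GGUF file: $model"; return 1; }
    echo "ok"
}
```

For example, preflight ./bin/llama-cli ../models/llama-13b-v2/ggml-model-q4_0.gguf can be run from the build directory before the command above. The check only inspects the file header, so it cannot tell whether the model is complete or compatible with the installed version.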

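If you need the Python bindings, install the llama-cpp-python package in one of two ways. The first command below is a plain install; the pair after it builds the package with OpenBLAS acceleration by setting CMAKE_ARGS before running pip. If the package is already installed, add --force-reinstall --no-cache-dir to the pip command so that it is actually rebuilt: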

pip install llama-cpp-python

export CMAKE_ARGS="-DGGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS"
pip install llama-cpp-python
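
After either install, it is worth confirming that the module can be imported by the interpreter you intend to use, since pip may have installed into a different environment. The following is a minimal sketch: the function name check_llama_cpp_python and the PYTHON override variable are assumptions made for this example, and it only relies on the package exposing the llama_cpp module with a __version__ attribute.

```shell
#!/bin/sh
# Print the installed llama-cpp-python version, or a hint if the
# import fails. Set PYTHON to test a specific interpreter or virtualenv.
check_llama_cpp_python() {
    py=${PYTHON:-python3}
    if "$py" -c 'import llama_cpp; print(llama_cpp.__version__)' 2>/dev/null; then
        return 0
    fi
    echo "llama_cpp is not importable with $py"
    return 1
}
```

Running check_llama_cpp_python prints the version on success. A failure message usually means the package was installed for another interpreter, or that the native build step failed during pip install.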

With these steps complete, llama.cpp is installed and configured. You can now run large language model inference locally and tune or extend the setup as needed.
