A One-Article Guide to Customized AI Agents

AI Customization

I. Windows

1. Environment Requirements

(1). PC: Windows 10

(2). CPU: AMD Ryzen 5 7500F (6 cores / 12 threads)

(3). GPU: RTX 4070 Super (Task Manager shows 12 GB dedicated VRAM + 15.8 GB shared memory ≈ 27.8 GB total)

(4). RAM: 2 × 16 GB DDR5-6000 CL30

(5). VMware Workstation Pro + a CentOS 7 image installed

(6). CUDA driver, CUDA Toolkit, and cuDNN installed

(7). Conda + PyCharm installed (using Python 3.11)

(8). Java JDK (17/21) + Maven 3.9.9 + IDEA installed [optional: only needed to expose functionality as an API]

(9). MySQL 8 installed (initial username/password: root/123456)

(10). Jupyter installed via pip

(11). Visual Studio Code + NVM (a Node.js version manager) installed

(Critical). Keep a VPN connected throughout the whole process.

Create a Python 3.11 environment with Conda

conda create --name ai_fine_tuning_3_11 python=3.11
activate ai_fine_tuning_3_11
#conda env remove --name ai_fine_tuning_3_11

Verify the GPU driver's CUDA version

The driver reports CUDA 12.6, so download and install the matching CUDA Toolkit 12.6:

https://developer.nvidia.com/cuda-downloads
# nvcc -V   check that the toolkit installed correctly
# set cuda  check the CUDA environment variables
# If cuda_12.6.3_561.17_windows.exe is installed before Visual Studio 2022, it will not automatically add the CUDA extensions to Visual Studio.

Install cuDNN

https://developer.nvidia.com/cudnn

The driver supports CUDA 12.6, so install the PyTorch build for the closest available CUDA version, 12.4:

# https://pytorch.org/get-started/locally/#windows-prerequisites-2
conda install pytorch torchvision torchaudio pytorch-cuda=12.4 -c pytorch -c nvidia
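An optional quick sanity check that the newly installed PyTorch build can see both CUDA and cuDNN (a minimal sketch; run it inside the ai_fine_tuning_3_11 environment):

import torch

print(torch.cuda.is_available())        # True if the CUDA build of PyTorch sees the GPU
print(torch.version.cuda)               # CUDA version PyTorch was built against (expected 12.4)
print(torch.backends.cudnn.version())   # cuDNN version visible to PyTorch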

Download the Qwen2.5-7B-Instruct model

git clone https://www.modelscope.cn/Qwen/Qwen2.5-7B-Instruct.git
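Alternatively, the same snapshot can be fetched with the modelscope Python SDK (a minimal sketch, assuming the modelscope package is installed; it is added below with pip install modelscope -U, and the target directory here is just an example):

from modelscope import snapshot_download

# downloads the model files and returns the local path they were stored in
local_dir = snapshot_download('Qwen/Qwen2.5-7B-Instruct', cache_dir='C:/Users/admin/Desktop/ai_material')
print(local_dir)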

Download the model-training toolkit LLaMA-Factory, open it with PyCharm (Python 3.11), then run the following commands in the project directory:

# download it.zip from https://github.com/hiyouga/LLaMA-Factory.git
pip install -e ".[torch,metrics]"
pip install -r requirements.txt
pip install modelscope -U
pip install matplotlib -U
pip install hqq -U
pip install bitsandbytes -U
DS_BUILD_CPU_ADAM=1 pip install deepspeed

Download CMake (cmake-3.31.1-windows-x86_64.zip), extract it, and add its bin directory to the Path environment variable:

https://cmake.org/download/

Download Visual Studio (Community Edition). In the installer, select only the Desktop development with C++ workload. If you cancel the installation midway, the installer can be found again under the newly added (or recommended) items in the Start menu. Only the C++ build environment is needed here, so uncheck every installation item except the C++ workload (that workload automatically selects the Windows SDK and other related components; leave those checked).

https://visualstudio.microsoft.com/zh-hans/vs/

Download llama.cpp, the quantization tool for safetensors models

# download it.zip from https://github.com/ggerganov/llama.cpp.git
# Build the GGUF quantization tools following the steps in `Useful notes > Deploying a private model to Ollama and inference testing (Win10)`; convert.py (safetensors -> gguf) and its requirements.txt will be covered later.

Install Jupyter Notebook for browser-based development

# INSTALL
activate ai_fine_tuning_3_11
pip install jupyter -U
# RUN
jupyter notebook

2. Runtime Checks

(1)GPU

In the LLaMA-Factory project root, create a test.py with the following code and run it:

#!/usr/bin/python
# -*- coding: utf-8 -*-

import torch

if __name__ == '__main__':
    print(torch.cuda.current_device())
    print(torch.cuda.get_device_name(0))
    print(torch.__version__)

Handling a DLL-not-found error at Python runtime

# Download the dependency analysis tool: https://github.com/lucasg/Dependencies/tree/v1.11.1
# Unzip Dependencies_x64_Release.zip and run DependenciesGui.exe
# Point it at the directory from the error message, e.g. C:\Users\admin\anaconda3\envs\ai_fine_tuning_3_11\Lib\site-packages\torch\lib\

Identify the DLL that is actually missing, then download it following the steps below.

Example error: OSError: [WinError 126] The specified module could not be found. Error loading "C:\Users\admin\anaconda3\envs\ai_fine_tuning_3_11\Lib\site-packages\torch\lib\fbgemm.dll" or one of its dependencies.

In this case the file that is actually missing is libomp140.x86_64.dll.

Download the DLL for Windows:
https://www.dll-files.com/download/fdf9273b477d0107bc8f8c4bf4311173/mfc100u.dll.html?c=UVR6TnFWcDVOSCtCSm9VaDg1dzZaZz09

After finding and downloading the corresponding DLL (e.g. the 32-bit mfc100.dll), copy it to C:\Windows\System32.

Then press Win+R and run "regsvr32 mfc100.dll" to register the DLL with the system.

A successful run prints something like:

0
NVIDIA GeForce RTX 4070 SUPER
2.5.1
(2)Model

Test the model following the README in the official Qwen2.5-7B repository:

from modelscope import AutoModelForCausalLM, AutoTokenizer

# model_name = "qwen/Qwen2.5-7B-Instruct"
model_name = "C:\\Users\\admin\Desktop\\ai_material\\Qwen2___5-7B-Instruct"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Give me a short introduction to large language model."
messages = [
    {"role": "system", "content": "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)

Example output:

Loading checkpoint shards: 100%|██████████| 4/4 [00:06<00:00,  1.63s/it]
Some parameters are on the meta device because they were offloaded to the cpu.
A large language model (LLM) is a type of artificial intelligence model designed to understand and generate human-like text based on the input it receives. These models are typically trained on vast amounts of textual data from the internet, books, articles, and other sources, allowing them to learn complex patterns in language and context.

LLMs can perform a wide range of natural language processing tasks, including but not limited to:

1. **Text Generation**: Creating coherent paragraphs or entire documents.
2. **Translation**: Converting text from one language to another.
3. **Summarization**: Condensing long texts into shorter summaries.
4. **Question Answering**: Providing answers to questions based on given information.
5. **Dialogue Systems**: Engaging in conversations with users, understanding their queries, and responding appropriately.

These models are usually deep learning architectures, such as transformers, which allow them to process and understand the context of words within sentences and longer passages of text. The size of these models refers to the number of parameters they contain, with larger models generally having more parameters and thus being able to capture more nuanced and sophisticated language patterns.

Examples of large language models include those developed by companies like Google (e.g., BERT, T5), Microsoft (e.g., Turing-NLG), and Alibaba Cloud (e.g., Qwen).

Process finished with exit code 0
(3) LLaMA-Factory WebUI
cd C:\Users\admin\Desktop\ai_material\LLaMA-Factory-main
conda activate ai_fine_tuning_3_11
python src/webui.py


The WebUI should look like the screenshot below; the model path (top-right field) has to be selected manually.

[Screenshot: LLaMA-Factory WebUI with the model path selected]

Following the official tutorial, fine-tune a model that the current hardware is capable of handling.

The checkpoint path stores copies of the parameter state produced during fine-tuning, making it easy to resume training from earlier progress.

CLI action   Description
version      print version information
train        command-line training
chat         command-line inference/chat
export       model merging and export
api          start an API server for external calls
eval         evaluate on standard benchmarks such as MMLU
webchat      web front-end chat page (inference only)
webui        launch the LlamaBoard web UI, which includes sub-pages for visual training, prediction, chat, and model merging

Dataset formats (link): only alpaca-format and sharegpt-format datasets are currently supported; a minimal alpaca-style record is sketched below.
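A minimal sketch of one alpaca-format training record, written here as a Python literal (the field values are made up for illustration; on disk it would be one object inside a JSON array, which is then registered in LLaMA-Factory's data/dataset_info.json):

example_record = {
    "instruction": "Write a SELECT statement for the following request.",  # the task prompt
    "input": "List the names of all users older than 30.",                 # optional extra context
    "output": "SELECT name FROM users WHERE age > 30;",                    # the expected answer
}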

3. Building a Knowledge Base

AnythingLLM is used here (link).

Installation requires a VPN and downloads more than 2 GB of additional resources.

After installation, click Get started to enter the setup wizard.

(Connect it to the local model served by Ollama; the token limit is set to 4096 for now. Larger values are generally more accurate but take longer to parse.)

  • Embedding Preference (embedding model): keep the default AnythingLLM Embedder.

  • Vector Database Connection: keep the default LanceDB.

After confirming these settings, name your workspaces (e.g. test and ttt) and add knowledge to the model from web page links.

[Screenshot: adding a web link as a knowledge source]

The source web page used as knowledge:

[Screenshot: source web page content]

Create a new chat window and test querying the knowledge base.

[Screenshot: knowledge-base query results in a new chat]

Expose the knowledge base to external callers through the API (if authentication is enabled, log in with the PRIVATE_KEY at the top right of the api_docs page); a sketch of calling it from Python follows below.

[Screenshot: AnythingLLM API documentation page]
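A minimal sketch of querying a workspace through AnythingLLM's developer API with the requests library; the base URL, workspace slug, and API key are placeholders, and the exact endpoint and fields should be checked against your own instance's api_docs page:

import requests

API_KEY = "YOUR_ANYTHINGLLM_API_KEY"        # generated in AnythingLLM's settings
BASE_URL = "http://localhost:3001/api/v1"   # default local address; adjust to your install
WORKSPACE = "test"                          # the workspace slug created above

resp = requests.post(
    f"{BASE_URL}/workspace/{WORKSPACE}/chat",
    headers={"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"},
    json={"message": "What does the knowledge base say about ...?", "mode": "chat"},
    timeout=120,
)
resp.raise_for_status()
print(resp.json())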

Of course, this is only the simplest possible local knowledge base. Beyond it, AnythingLLM offers flexible configuration and management options, such as corpus-splitting parameters, the Chat mode, and the Embedding chunk size (which sets the granularity at which the embedding model processes documents).

4. AI Workflows Explained

At the core of LangChain is the concept of a "chain", a building block that lets you compose and orchestrate different components into complex, intelligent applications. Imagine you are a data scientist working on a cutting-edge project that involves processing and analyzing large amounts of unstructured data, such as customer reviews, social media posts, or even academic papers. Your goal is to extract insights and valuable information from this data, but the sheer volume and complexity of the task can be daunting. With LangChain chains, you can break this very complex task into smaller, manageable pieces and then link them together into a seamless end-to-end solution. It is like having a team of highly skilled assistants, each focused on one specific task, while you coordinate their efforts to build something truly remarkable.
Original article: https://blog.csdn.net/qkh1234567/article/details/140371297

(5-1)LLM Chain

In this example we first bring in the necessary imports from LangChain. We then initialize an OpenAI chat model and create a prompt template asking for the best name to describe a given product. Next, we combine the language model and the prompt template into an LLMChain. This chain can now be invoked with any product description, and it will generate a suitable company name from the input. For example, if we pass in "Queen Size Sheet Set" as the product, the chain might output "Royal Slumber Bedding Co." as the suggested company name. Simple, right? But don't be fooled by its simplicity: LLMChain is a useful tool for a wide range of applications, from content generation to data analysis and even code generation.

from langchain_openai import ChatOpenAI
from langchain.prompts import ChatPromptTemplate
from langchain.chains import LLMChain

# Initialize the language model
llm_model = "gpt-3.5-turbo"
llm = ChatOpenAI(temperature=0.9, model=llm_model)

# Create a prompt template
prompt = ChatPromptTemplate.from_template("What is the best name to describe a company that makes {product}?")

# Combine the LLM and prompt into an LLMChain
chain = LLMChain(llm=llm, prompt=prompt)

# Run the chain on some input data
product = "Queen Size Sheet Set"
chain_output = chain.invoke({"product": product})
print(chain_output)

# Output
# {'product': 'Queen Size Sheet Set', 'text': 'Royal Slumber Bedding Co.'}
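Note that LLMChain is the legacy interface; in recent LangChain releases the same composition is usually written with the LCEL pipe operator. A minimal sketch, reusing the llm and prompt objects defined above:

# LCEL-style equivalent of the LLMChain above (newer LangChain versions)
lcel_chain = prompt | llm
result = lcel_chain.invoke({"product": "Queen Size Sheet Set"})
print(result.content)  # the suggested company name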
(5-2) Sequential Chains

LLMChain is a great starting point, but sometimes a task requires a series of steps or operations to be executed in a specific order. This is where sequential chains come into play: they let us link multiple prompts together to build more complex and sophisticated workflows.

(5-2-1)SimpleSequentialChain

Let's start with SimpleSequentialChain, which is ideal when every step in the chain has a single input and a single output. Imagine you want a system that not only suggests a company name from a product, but also generates a short description for that company.

from langchain.chains import SimpleSequentialChain

# Initialize the language model
llm = ChatOpenAI(temperature=0.9, model=llm_model)

# Prompt template 1: Suggest a company name
first_prompt = ChatPromptTemplate.from_template("What is the best name to describe a company that makes {product}?")
chain_one = LLMChain(llm=llm, prompt=first_prompt)

# Prompt template 2: Generate a company description
second_prompt = ChatPromptTemplate.from_template("Write a 20-word description for the following company: {company_name}")
chain_two = LLMChain(llm=llm, prompt=second_prompt)

# Create the SimpleSequentialChain
overall_simple_chain = SimpleSequentialChain(chains=[chain_one, chain_two], verbose=True)

# Run the chain on some input data
product = "Queen Size Sheet Set"
chain_output = overall_simple_chain.invoke(product)
print(chain_output)

# Output 
# {'input': 'Queen Size Sheet Set',
# 'output': 'Regal Comfort Linens provides luxurious and stylish bedding options to ensure a comfortable and elegant sleep experience for customers.'}

In this example we first define two separate LLMChains: one that suggests a company name from a product, and another that generates a short description for a given company name. We then combine the two chains into a SimpleSequentialChain, specifying the order in which they run. When we invoke the chain with a product such as "Queen Size Sheet Set", it first generates a company name (e.g. "Royal Comfort Linens") and then feeds that name into the second chain, which outputs something like "Regal Comfort Linens provides luxurious and stylish bedding options to ensure a comfortable and elegant sleep experience for customers." The beauty of SimpleSequentialChain is that it breaks a complex task into smaller, manageable steps, each with a clearly defined input and output. This modular approach not only makes the code more readable and maintainable, it also allows for greater flexibility and extensibility as the project evolves.

(5-2-2)SequentialChain

While SimpleSequentialChain works well for straightforward tasks, sometimes a chain needs to handle multiple inputs and outputs at the same time. Enter SequentialChain, the more powerful and flexible sibling of its simpler counterpart.

from langchain.chains import SequentialChain

# Initialize the language model
llm = ChatOpenAI(temperature=0.9, model=llm_model)

# Prompt template 1: Translate review to English
first_prompt = ChatPromptTemplate.from_template("Translate the following review to English:\n\n{Review}")
chain_one = LLMChain(llm=llm, prompt=first_prompt, output_key="English_Review")

# Prompt template 2: Summarize the review in one sentence
second_prompt = ChatPromptTemplate.from_template("Can you summarize the following review in 1 sentence:\n\n{English_Review}")
chain_two = LLMChain(llm=llm, prompt=second_prompt, output_key="summary")

# Prompt template 3: Detect the language of the review
third_prompt = ChatPromptTemplate.from_template("What language is the following review:\n\n{Review}")
chain_three = LLMChain(llm=llm, prompt=third_prompt, output_key="language")

# Prompt template 4: Generate a follow-up response
fourth_prompt = ChatPromptTemplate.from_template("Write a follow-up response to the following summary in the specified language:\n\nSummary: {summary}\n\nLanguage: {language}")
chain_four = LLMChain(llm=llm, prompt=fourth_prompt, output_key="followup_message")

# Create the SequentialChain
overall_chain = SequentialChain(
    chains=[chain_one, chain_two, chain_three, chain_four],
    input_variables=["Review"],
    output_variables=["English_Review", "summary", "followup_message"],
    verbose=True,
)

# Run the chain on some input data
review = "Je trouve le goût médiocre. La mousse ne tient pas, c'est bizarre. J'achète les mêmes dans le commerce et le goût est bien meilleur...\nVieux lot ou contrefaçon !?"
chain_output = overall_chain.invoke(review)
print(chain_output)

# Output
# Entering new SequentialChain chain...
# Finished chain.
# {'English_Review': "I find the taste mediocre. The foam doesn't hold, it's "
#                    'weird. I buy the same ones in stores and the taste is much '
#                    'better... \n'
#                    'Old batch or counterfeit!?',
#  'Review': "Je trouve le goût médiocre. La mousse ne tient pas, c'est bizarre. "
#            "J'achète les mêmes dans le commerce et le goût est bien "
#            'meilleur...\n'
#            'Vieux lot ou contrefaçon !?',
#  'followup_message': "Je suis désolé(e) d'apprendre que vous avez trouvé le "
#                      "goût du produit médiocre. Il est possible qu'il s'agisse "
#                      "d'un lot ancien ou contrefait, comme vous l'avez "
#                      "suggéré. Il est important de s'assurer de la qualité des "
#                      'produits que nous consommons. Avez-vous envisagé de '
#                      'contacter le fabricant pour clarifier la situation ? '
#                      "J'espère que votre prochaine expérience d'achat sera "
#                      'plus satisfaisante. Merci de partager votre avis.',
#  'summary': 'The reviewer found the taste of the product mediocre and '
#             'different from what they usually buy in stores, suggesting that '
#             'it may be an old batch or counterfeit.'}

In this more advanced example we define four separate LLMChains, each with its own specific task:

1. Translate a product review from its original language into English.
2. Summarize the translated review in one sentence.
3. Detect the original language of the review.
4. Generate a follow-up response to the summary in the detected language.

The key difference here is that each chain can have multiple input and output variables, which must be specified explicitly with the output_key and input_variables/output_variables arguments. For example, the first chain takes the original Review as input and outputs English_Review. The second chain takes English_Review as input and outputs a one-sentence summary. The third chain uses the original review to detect the language, and finally the fourth chain combines the summary and the language to produce followup_message. We then combine the four chains into a SequentialChain, specifying the order in which they run as well as the input and output variables of the overall chain. When we invoke the chain with a product review such as "Je trouve le goût médiocre. La mousse ne tient pas, c'est bizarre. J'achète les mêmes dans le commerce et le goût est bien meilleur... Vieux lot ou contrefaçon !?", it walks through every step: translating the review into English, summarizing it, detecting the original language (French in this case), and finally generating a follow-up reply in French based on the summary. What makes SequentialChain useful is its ability to handle complex workflows with multiple inputs and outputs, letting you break even the most involved tasks into smaller, manageable components.

(5-3) Router Chains

Sometimes a task requires different approaches or specialized sub-chains depending on the input data. This is where router chains come in: they dynamically route each input to the appropriate sub-chain based on certain criteria.

(5-3-1)MultiPromptChain

A common use case for router chains is having multiple prompts, each specialized for a particular type of input or task. MultiPromptChain lets us define these specialized prompts and then dynamically route each input to the appropriate one based on its content.

from langchain.chains.router import MultiPromptChain
from langchain.chains.router.llm_router import LLMRouterChain, RouterOutputParser
from langchain.prompts import PromptTemplate

# Define specialized prompt templates
physics_template = """You are a very smart physics professor. \
You are great at answering questions about physics in a concise\
and easy to understand manner. \
When you don't know the answer to a question you admit\
that you don't know.

Here is a question:
{input}"""


math_template = """You are a very good mathematician. \
You are great at answering math questions. \
You are so good because you are able to break down \
hard problems into their component parts, 
answer the component parts, and then put them together\
to answer the broader question.

Here is a question:
{input}"""

history_template = """You are a very good historian. \
You have an excellent knowledge of and understanding of people,\
events and contexts from a range of historical periods. \
You have the ability to think, reflect, debate, discuss and \
evaluate the past. You have a respect for historical evidence\
and the ability to make use of it to support your explanations \
and judgements.

Here is a question:
{input}"""


computerscience_template = """ You are a successful computer scientist.\
You have a passion for creativity, collaboration,\
forward-thinking, confidence, strong problem-solving capabilities,\
understanding of theories and algorithms, and excellent communication \
skills. You are great at answering coding questions. \
You are so good because you know how to solve a problem by \
describing the solution in imperative steps \
that a machine can easily interpret and you know how to \
choose a solution that has a good balance between \
time complexity and space complexity. 

Here is a question:
{input}"""

# Create prompt info dictionaries
prompt_infos = [
    {
        "name": "physics",
        "description": "Good for answering questions about physics",
        "prompt_template": physics_template,
    },
    {
        "name": "math",
        "description": "Good for answering math questions",
        "prompt_template": math_template,
    },
    {
        "name": "History",
        "description": "Good for answering history questions",
        "prompt_template": history_template,
    },
    {
        "name": "computer science",
        "description": "Good for answering computer science questions",
        "prompt_template": computerscience_template,
    },
]

# Initialize the language model
llm = ChatOpenAI(temperature=0, model=llm_model)

# Create destination chains (LLMChains) for each prompt
destination_chains = {}
for p_info in prompt_infos:
    name = p_info["name"]
    prompt_template = p_info["prompt_template"]
    prompt = ChatPromptTemplate.from_template(template=prompt_template)
    chain = LLMChain(llm=llm, prompt=prompt)
    destination_chains[name] = chain

# Define a default chain for inputs that don't match any specialized prompt
default_prompt = ChatPromptTemplate.from_template("{input}")
default_chain = LLMChain(llm=llm, prompt=default_prompt)

MULTI_PROMPT_ROUTER_TEMPLATE = """Given a raw text input to a \
language model select the model prompt best suited for the input. \
You will be given the names of the available prompts and a \
description of what the prompt is best suited for. \
You may also revise the original input if you think that revising\
it will ultimately lead to a better response from the language model.

<< FORMATTING >>
Return a string snippet enclosed by triple backticks with a JSON object formatted to look like below:
```json
{{{{
    "destination": string \ name of the prompt to use or "default"
    "next_inputs": string \ a potentially modified version of the original input
}}}}
```

REMEMBER: "destination" MUST be one of the candidate prompt \
names specified below OR it can be "default" if the input is not \
well suited for any of the candidate prompts. \
REMEMBER: "next_inputs" can just be the original input \
if you don't think any modifications are needed.

<< CANDIDATE PROMPTS >>
{destinations}

REMEMBER: If the destination name is not there, use the original question as the input.

<< INPUT >>
{{input}}

<< OUTPUT (remember to include the ```json)>>"""

# Build the destinations string that the router prompt lists as candidates
destinations = [f"{p['name']}: {p['description']}" for p in prompt_infos]
destinations_str = "\n".join(destinations)

# Create the router prompt template
router_template = MULTI_PROMPT_ROUTER_TEMPLATE.format(destinations=destinations_str)
router_prompt = PromptTemplate(template=router_template, input_variables=["input"], output_parser=RouterOutputParser())
router_chain = LLMRouterChain.from_llm(llm, router_prompt)

# Create the MultiPromptChain
chain = MultiPromptChain(
    router_chain=router_chain,
    destination_chains=destination_chains,
    default_chain=default_chain,
    verbose=True,
)

# Run the chain on some input data
physics_question = "What is black body radiation?"
chain_output = chain.invoke(physics_question)
print(chain_output)
# Output 
# Entering new MultiPromptChain chain...
# physics: {'input': 'What is black body radiation?'}
# Finished chain.
# {'input': 'What is black body radiation?',
#  'text': "Black body radiation is the electromagnetic radiation emitted by a perfect absorber of radiation, known as a black body. A black body absorbs all incoming radiation and emits radiation across the entire electromagnetic spectrum. The spectrum of black body radiation is continuous and follows a specific distribution known as Planck's law. This phenomenon is important in understanding the behavior of objects at different temperatures and is a key concept in the field of thermal physics."}

math_question = "what is 2 + 2"
chain_output = chain.invoke(math_question )
print(chain_output)
# Output 
# Entering new MultiPromptChain chain...
# math: {'input': 'what is 2 + 2'}
# Finished chain.
# {'input': 'what is 2 + 2', 'text': 'The answer to 2 + 2 is 4.'}

biology_question = "Why does every cell in our body contain DNA?"
chain_output = chain.invoke(biology_question )
print(chain_output)
# Output
# Entering new MultiPromptChain chain...
# None: {'input': 'Why does every cell in our body contain DNA?'}
# Finished chain.
# {'input': 'Why does every cell in our body contain DNA?',
#  'text': 'Every cell in our body contains DNA because DNA carries the genetic information that determines the characteristics and functions of an organism. DNA contains the instructions for building and maintaining an organism, including the proteins that are essential for cell function and structure. This genetic information is passed down from parent to offspring and is essential for the growth, development, and functioning of all cells in the body. Having DNA in every cell ensures that the genetic information is preserved and can be used to carry out the necessary processes for life.'}

In this example we first define several specialized prompt templates, each designed to handle a particular type of input or task (physics questions, math questions, history questions, computer science questions). We then create a prompt-info dictionary for each template containing its name, a description, and the prompt template itself. Next, we initialize the language model and create a destination chain (an LLMChain) for each specialized prompt; these are the chains that get invoked when an input matches the corresponding prompt. We also define a default chain that is used for inputs that do not match any specialized prompt. At the heart of MultiPromptChain is the router chain, which decides which destination chain to use based on the input. We define a router prompt template that gives the router its instructions and output format, then build an LLMRouterChain from this template and our language model. Finally, we create the MultiPromptChain itself, passing in the router chain, the destination chains, and the default chain.

When we invoke this chain with an input such as "What is black body radiation?", the router chain analyzes the input and determines that it is a physics question. It then routes the input to the physics destination chain, which provides a detailed answer about black body radiation. If the question does not match any of the specialized prompts, for example "What is the role of DNA in a cell?", the router sends it to the default chain, which attempts a general answer from the pretrained LLM. MultiPromptChain lets us build highly specialized and efficient workflows by dynamically routing each input to the most appropriate sub-chain or prompt, ensuring that every input is handled by the component best suited to the task.

Advantages of router chains
Router chains offer several advantages that make them a useful tool in a machine learning and NLP toolkit:

1. Specialization: by routing inputs to specialized sub-chains, each task is handled by the component best suited for it, producing more accurate and relevant results.
2. Efficiency: instead of running every input through multiple chains or prompts, the router sends it straight to the appropriate destination, saving compute and improving overall performance.
3. Flexibility: router chains are easy to extend or modify by adding or removing destination chains or prompts, making them highly adaptable to changing requirements or new domains.
4. Modularity: each sub-chain or prompt can be developed and tested independently, promoting code reuse and maintainability.
5. Scalability: as a project grows in complexity, router chains help manage and orchestrate an increasing number of specialized components, keeping the system robust and efficient.
With router chains you can build sophisticated, intelligent applications that handle a wide variety of inputs and tasks while remaining highly specialized and efficient.

II. Linux

1. Starting an AutoDL Instance

Top up your balance first, then create an instance with:

  • a GPU with more than 20 GB of VRAM

  • at least 16 GB of system memory; more is recommended for large-scale training

  • storage: enough disk space for model files and datasets; at least 100 GB of free space is recommended

[Screenshot: AutoDL instance configuration]

Once the instance is created, open JupyterLab to get a terminal; the host information is shown below:

[Screenshot: JupyterLab terminal showing host information]

2. Download LLaMA-Factory

Enable AutoDL's built-in academic resource acceleration (its proxy), then download LLaMA-Factory:

source /etc/network_turbo 
git clone https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory/
conda create --name llamaf_311 python=3.11
conda activate llamaf_311
# If conda keeps telling you to run "conda init", open a new terminal, activate the env again, and cd back into the LLaMA-Factory directory
pip install -r requirements.txt

3. Install PyTorch Globally

conda install pytorch torchvision torchaudio pytorch-cuda=12.4 -c pytorch -c nvidia

4. Free Up Disk Space

(base) root@autodl-container-00274fbfbe-ec0fdc69:~/autodl-tmp# source ~/.bashrc
+--------------------------------------------------AutoDL--------------------------------------------------------+
Directory overview:
  /                 (system disk, normal speed): data survives instance shutdown; good for code; saved together with image snapshots.
  /root/autodl-tmp  (data disk, fast): data survives instance shutdown; good for IO-heavy data; NOT saved with image snapshots.
CPU: 18 cores
RAM: 60 GB
GPU: NVIDIA GeForce RTX 4090 D, 1
Storage:
  system disk /               : 52% 16G/30G
  data disk  /root/autodl-tmp :  1% 248M/50G
+----------------------------------------------------------------------------------------------------------------+
Notes:
1. The system disk is small; keep large data on the data disk or in file storage. Resetting the system does not affect the data disk or file storage.
2. To clean the system disk, see: https://www.autodl.com/docs/qa1/
3. For long-running commands, use a tool such as screen so the program survives SSH disconnects: https://www.autodl.com/docs/daemon/
(base) root@autodl-container-00274fbfbe-ec0fdc69:~/autodl-tmp#

The following two directories can be deleted outright without affecting the system, so remove them first:

# conda's package cache
du -sh /root/miniconda3/pkgs/ && rm -rf /root/miniconda3/pkgs/*
# JupyterLab's trash
du -sh /root/.local/share/Trash && rm -rf /root/.local/share/Trash

5. Verify the GPU from Python

(base) root@local:~# conda activate llamaf_311
(llamaf_311) root@local:~# python
Python 3.11.10 (main, Oct  3 2024, 07:29:13) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> print(torch.cuda.is_available())
True
>>> print(torch.cuda.get_device_name(0))
NVIDIA GeForce RTX 4090 D
>>> quit()
(llamaf_311) root@local:~# 

6. Install git-lfs

# step 1 (for Debian-based distros, e.g. Ubuntu): add the package repository
curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
# step 2: install git-lfs
apt-get install git-lfs
# step 3: check the version
git lfs --version
# register a Hugging Face account beforehand

7. Download the Model from Hugging Face

cd ~/autodl-tmp/
git lfs install
source /etc/network_turbo
git clone https://huggingface.co/Qwen/Qwen2.5-7B

[Screenshot: model download in progress]

Watch the download progress in real time (updated every second): $ watch -n 1 du -sh ~/autodl-tmp/Qwen2.5-7B/

Download log and resulting directory size:

(base) root@local:~/autodl-tmp# git clone https://huggingface.co/Qwen/Qwen2.5-7B
Cloning into 'Qwen2.5-7B'...
remote: Enumerating objects: 43, done.
remote: Counting objects: 100% (40/40), done.
remote: Compressing objects: 100% (40/40), done.
remote: Total 43 (delta 16), reused 0 (delta 0), pack-reused 3 (from 1)
Unpacking objects: 100% (43/43), 3.61 MiB | 876.00 KiB/s, done.

Filtering content: 100% (4/4), 14.18 GiB | 5.87 MiB/s, done.
(base) root@local:~/autodl-tmp# 

Verify the integrity of the model files:

md5sum path/to/model/file1

Specify the model path in the training script:

model_name_or_path = "path/to/model"

8. Download the Training Dataset

cd ~/autodl-tmp/
git lfs install
source /etc/network_turbo
# a training set for writing SELECT SQL statements from natural-language scenarios
git clone https://huggingface.co/datasets/tushkulange/query_text_to_sql

For a custom dataset, follow the alpaca or sharegpt format conventions documented in LLaMA-Factory.

Specify the dataset path in the training script (a sketch of registering the dataset in LLaMA-Factory's dataset_info.json follows below):

dataset = "path/to/dataset.json"

9. Direct Inference with the Base Model

Reference article (link); related article (link).

Before any fine-tuning, run direct inference with the base model to verify that it is usable and performs as expected.

Model loading: load the pretrained model and tokenizer with the transformers library.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "path/to/your/model"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

Input preparation: prepare the input text and tokenize it.

input_text = "Hello, how are you?"
inputs = tokenizer(input_text, return_tensors="pt")

Inference: pass the tokenized inputs to the model and decode the output.

outputs = model.generate(**inputs)
decoded_output = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(decoded_output)

Putting it together as a test script:

cd ~/autodl-tmp/

touch test_model.py

vi test_model.py

conda activate llamaf_311

python test_model.py

#!/usr/bin/python
# -*- coding: utf-8 -*-

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "/root/autodl-tmp/Qwen2.5-7B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

input_text = "Hello, how are you?"
inputs = tokenizer(input_text, return_tensors="pt")

outputs = model.generate(**inputs)
decoded_output = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(decoded_output)

Test output:

(base) root@local:~/autodl-tmp# python test_model.py
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████| 4/4 [00:02<00:00,  1.36it/s]
Setting `pad_token_id` to `eos_token_id`:None for open-end generation.
Hello, how are you? I'm sorry to hear that you're feeling unwell. Is there anything specific that's bothering you or any particular symptoms you're experiencing? I'm here to help in any way I can.
(base) root@local:~/autodl-tmp# 

10. Supervised Fine-Tuning with PEFT

Install peft:

pip install peft  # already satisfied if LLaMA-Factory's requirements.txt was installed earlier

Prepare the model and data:

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import get_peft_model, LoraConfig
from datasets import load_dataset

model_name = "path/to/your/model"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# /root/autodl-tmp/query_text_to_sql/train.jsonl
dataset = load_dataset('/root/autodl-tmp/query_text_to_sql')
print(dataset['train'])  # 7,000 examples in total
train_set = dataset['train'].select(range(6000))                         # first 6,000 for training
eval_set = dataset['train'].select(range(6000, len(dataset['train'])))   # the rest for evaluation
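The original post breaks off at this point. For completeness, here is a minimal, hedged sketch of how the LoRA wrapping could continue with the peft objects imported above; the rank, alpha, and target module names are assumptions that should be adjusted to the actual model and task:

# --- hedged continuation sketch (not from the original post) ---
lora_config = LoraConfig(
    task_type="CAUSAL_LM",               # causal language modeling
    r=8,                                 # LoRA rank (assumed)
    lora_alpha=16,                       # scaling factor (assumed)
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"], # attention projections commonly targeted in Qwen-style models (assumed)
)

model = get_peft_model(model, lora_config)   # wrap the base model with trainable LoRA adapters
model.print_trainable_parameters()           # only a small fraction of parameters should be trainable

After this, the tokenized train_set/eval_set would be fed to a transformers Trainer (or to LLaMA-Factory's own training entry point) for supervised fine-tuning in the usual way.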