DWC: Trying Out a Small Agent

Article link: https://blog.csdn.net/weixin_41804613/article/details/142604792

Preface

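This task seems to be mostly about code and hands-on practice.
The overall goal is to build a question-answering agent backed by Google search. (Reaching Google requires a proxy.)
As a prerequisite, you need to apply for API access on this page, so that the agent can be built in the later steps.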

Build Process

Building the Model

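Each LLM is wrapped in a few extra layers so that different models can be used through the same interface. The first layer is an abstract base class: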

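from typing import List


class BaseModel:
    """Common interface that every wrapped LLM implements."""

    def __init__(self, path: str = '') -> None:
        self.path = path  # where the model weights live

    def chat(self, prompt: str, history: List[dict]):
        # Overridden by subclasses: send the prompt and return the reply.
        pass

    def load_model(self):
        # Overridden by subclasses: load the tokenizer and the weights.
        pass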

Next comes an InternLM2 class, which adds one more layer around the large language model and implements the chat and load_model methods:

from typing import List, Tuple

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


class InternLM2Chat(BaseModel):
    def __init__(self, path: str = '') -> None:
        super().__init__(path)
        self.load_model()

    def load_model(self):
        print('================ Loading model ================')
        self.tokenizer = AutoTokenizer.from_pretrained(self.path, trust_remote_code=True)
        # Load in half precision, move to the GPU, and switch to inference mode.
        self.model = AutoModelForCausalLM.from_pretrained(self.path, torch_dtype=torch.float16, trust_remote_code=True).cuda().eval()
        print('================ Model loaded ================')

    def chat(self, prompt: str, history: List[dict], meta_instruction: str = '') -> Tuple[str, List[dict]]:
        # model.chat is provided by the model's own remote code (hence trust_remote_code=True).
        response, history = self.model.chat(self.tokenizer, prompt, history, temperature=0.1, meta_instruction=meta_instruction)
        return response, history

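This wrapper class exposes a single chat method that the agent calls repeatedly: it takes a prompt, passes it to the LLM, and returns the response together with the updated history.
Because the machine already had InternLM-20b on it, that is the model used here, which is somewhat larger than the one in the official tutorial.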
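To try out the interface without a GPU or any model weights, a stand-in subclass can be handy. The following is only a sketch: EchoChat is a hypothetical class that is not part of the tutorial, and the shape of its history entries is an assumption made for illustration. BaseModel is repeated so the snippet runs on its own.

```python
from typing import List, Tuple


class BaseModel:
    """Same interface as the base class in this post, repeated for a standalone run."""

    def __init__(self, path: str = '') -> None:
        self.path = path

    def chat(self, prompt: str, history: List[dict]):
        pass

    def load_model(self):
        pass


class EchoChat(BaseModel):
    """Hypothetical stand-in model: it has no weights and simply echoes the prompt."""

    def __init__(self, path: str = '') -> None:
        super().__init__(path)
        self.load_model()

    def load_model(self):
        # Nothing to load here; a real subclass would read weights from self.path.
        self.loaded = True

    def chat(self, prompt: str, history: List[dict], meta_instruction: str = '') -> Tuple[str, List[dict]]:
        response = f'echo: {prompt}'
        # Build a new history list instead of mutating the caller's list.
        new_history = history + [{'user': prompt, 'assistant': response}]
        return response, new_history


model = EchoChat()
reply, history = model.chat('hello', [])
reply2, history = model.chat('again', history)
print(reply, reply2, len(history))
```

The calling pattern is the same as for InternLM2Chat: pass the prompt and the running history, and get back the response and the updated history.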

Building the Tools

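The tools, such as Google search, are defined in tools.py. This file contains a Tools class, which holds each tool's description and its concrete implementation.
The code for the tool descriptions is as follows: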

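class Tools:
    def __init__(self) -> None:
        self.toolConfig = self._tools()

    def _tools(self):
        # Each entry describes one tool. The strings are kept in Chinese, as in the original.
        tools = [
            {
                'name_for_human': '谷歌搜索',  # "Google Search"
                'name_for_model': 'google_search',
                # "Google Search is a general-purpose search engine that can be used to access
                # the internet, look up encyclopedic knowledge, follow current news, and so on."
                'description_for_model': '谷歌搜索是一个通用搜索引擎，可用于访问互联网、查询百科知识、了解时事新闻等。',
                'parameters': [
                    {
                        'name': 'search_query',
                        'description': '搜索关键词或短语',  # "search keyword or phrase"
                        'required': True,
                        'schema': {'type': 'string'},
                    }
                ],
            }
        ]
        return tools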

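This appears to be a list that records a basic description of each callable tool. Each entry includes the fields name_for_human and name_for_model; for now, my understanding is that these are keys whose values get passed on to later steps.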
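One plausible way these fields get used is to fill a text template that becomes part of the system prompt, so that the model can see which tools exist and how to call them. The sketch below illustrates that idea only: the template wording, the render_tools and tool_names helpers, and the English example strings are assumptions, not code from the tutorial.

```python
import json
from typing import Dict, List

# Template wording is an assumption made for illustration.
TOOL_DESC = (
    '{name_for_model}: Call this tool to interact with the {name_for_human} API. '
    'What is the {name_for_human} API useful for? {description_for_model} '
    'Parameters: {parameters}'
)


def render_tools(tool_config: List[Dict]) -> str:
    """Turn the list of tool descriptions into text for the system prompt."""
    blocks = []
    for tool in tool_config:
        blocks.append(TOOL_DESC.format(
            name_for_model=tool['name_for_model'],
            name_for_human=tool['name_for_human'],
            description_for_model=tool['description_for_model'],
            # Keep non-ASCII characters readable instead of escaping them.
            parameters=json.dumps(tool['parameters'], ensure_ascii=False),
        ))
    return '\n\n'.join(blocks)


def tool_names(tool_config: List[Dict]) -> str:
    """Comma-separated list of the names the model is allowed to emit."""
    return ','.join(tool['name_for_model'] for tool in tool_config)


example_config = [
    {
        'name_for_human': 'Google Search',
        'name_for_model': 'google_search',
        'description_for_model': 'A general-purpose search engine.',
        'parameters': [
            {
                'name': 'search_query',
                'description': 'search keyword or phrase',
                'required': True,
                'schema': {'type': 'string'},
            }
        ],
    }
]

prompt_text = render_tools(example_config)
print(prompt_text)
print(tool_names(example_config))
```

Under this reading, name_for_model is the identifier the model writes when it picks a tool, while name_for_human only appears in the descriptive text.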

Building the Agent

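The agent consists of four functions: build_system_input, parse_latest_plugin_call, call_plugin, and text_completion.
Respectively, they build the system prompt, parse the tool and tool arguments chosen in the model's first response, call the chosen tool, and tie the two model calls together.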
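The parsing and dispatch steps could be sketched roughly as follows. This assumes the model writes its choice as "Action: <name>" followed by "Action Input: <JSON>", which is a common ReAct convention rather than something the post specifies. The function names match the ones listed above, but the bodies, the tools dictionary, and fake_google_search are illustrative stand-ins, not the tutorial's implementation.

```python
import json
from typing import Callable, Dict, Tuple


def parse_latest_plugin_call(text: str) -> Tuple[str, str]:
    """Return (tool_name, raw_json_args) from the last Action block, or ('', '')."""
    i = text.rfind('\nAction:')
    j = text.rfind('\nAction Input:')
    k = text.rfind('\nObservation:')
    if 0 <= i < j:
        if k < j:
            # No Observation after the last Action Input: read to the end of the text.
            k = len(text)
        name = text[i + len('\nAction:'):j].strip()
        args = text[j + len('\nAction Input:'):k].strip()
        return name, args
    return '', ''


def call_plugin(tools: Dict[str, Callable[..., str]], name: str, raw_args: str) -> str:
    """Look up the tool by name and call it with the parsed JSON arguments."""
    if name not in tools:
        return f'Unknown tool: {name}'
    try:
        kwargs = json.loads(raw_args)
    except json.JSONDecodeError:
        return f'Could not parse arguments: {raw_args}'
    if not isinstance(kwargs, dict):
        return f'Arguments must be a JSON object: {raw_args}'
    return tools[name](**kwargs)


def fake_google_search(search_query: str) -> str:
    # Stand-in for a real search call, so the sketch runs offline.
    return f'(stub result for {search_query})'


tools = {'google_search': fake_google_search}
model_output = (
    'Thought: I should search for this.'
    '\nAction: google_search'
    '\nAction Input: {"search_query": "InternLM2"}'
)
name, raw_args = parse_latest_plugin_call(model_output)
observation = call_plugin(tools, name, raw_args)
print(name, raw_args, observation)
```

Returning an error string instead of raising keeps the loop alive, so the model can see what went wrong in the next turn; whether that is desirable depends on the agent.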
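Overall, the agent follows the ReAct pattern.
For each user question that requires a tool, the LLM is called twice: the first call interprets the question and selects a tool and its arguments, and the second call combines the tool's result with the user's question to produce the answer. Together, these two calls make up the ReAct structure.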
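The two-call flow can be illustrated with a scripted stand-in for the model, so that it runs without a GPU or network access. Everything here is a sketch under assumptions: ScriptedModel, answer, the "Action / Action Input / Observation" text format, and the way the second prompt is assembled are all hypothetical and may differ from the tutorial's text_completion.

```python
import json
import re
from typing import Callable, Dict, List

# Assumed output format: "Action: <tool>" on one line, "Action Input: <JSON>" on the next.
ACTION_RE = re.compile(r'Action:\s*(\w+)\s*\nAction Input:\s*(\{.*\})', re.DOTALL)


class ScriptedModel:
    """Hypothetical stand-in for an LLM: returns canned replies in order."""

    def __init__(self, replies: List[str]) -> None:
        self.replies = list(replies)
        self.prompts: List[str] = []

    def __call__(self, prompt: str) -> str:
        self.prompts.append(prompt)  # record what the model was asked
        return self.replies.pop(0)


def answer(question: str,
           chat: Callable[[str], str],
           tools: Dict[str, Callable[..., str]]) -> str:
    # First call: the model decides whether a tool is needed.
    first = chat(question)
    match = ACTION_RE.search(first)
    if match is None:
        return first  # no tool requested, so one call is enough
    name, raw_args = match.group(1), match.group(2)
    observation = tools[name](**json.loads(raw_args))
    # Second call: give the model its own plan plus the tool result.
    second_prompt = f'{question}\n{first}\nObservation: {observation}\n'
    return chat(second_prompt)


model = ScriptedModel([
    'Thought: I need to search.'
    '\nAction: google_search'
    '\nAction Input: {"search_query": "capital of France"}',
    'Final Answer: Paris',
])
tools = {'google_search': lambda search_query: 'Paris is the capital of France.'}
result = answer('What is the capital of France?', model, tools)
print(result)
```

Recording the prompts in the stand-in makes it easy to check that the tool result really reached the second call.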