[Project Experience] Xiaozhi AI MCP Study Notes

Original article · last modified 2025-11-15 21:05:52

License: CC 4.0 BY-SA

Tags: #ArtificialIntelligence #Learning #Notes #ESP32

First published: 2025-07-21 16:16:36

Theory

1. What is MCP

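MCP (Model Context Protocol) is an open protocol that standardizes how an LLM calls external tools. It turns an LLM from a conversational, generative AI into one that can invoke third-party tools. In the official analogy, MCP is a USB-C port: anything that implements the interface can plug into the AI and extend what it can do.
[Figure: MCP]
In essence, MCP unifies the interface through which AI calls third-party functionality. With the help of an AI agent, the LLM can use third-party services to handle the user's requests.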

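The figure above introduces several MCP concepts:
MCP server: the third party that provides a service. It implements the MCP server role, which means describing the functions it offers to the MCP client in the format the MCP protocol specifies.
MCP client: the bridge between an MCP server and the LLM. It manages a one-to-one connection with an MCP server.
MCP hosts: usually the AI application. Typically the AI agent implements the MCP client functionality and itself acts as the MCP host.
Beyond these, there is the concept of MCP tools: each function a third party exposes is generally called a tool, as the code later in this note shows.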

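Here is a screenshot from a video by the uploader 隔壁的程序员老王. It clearly shows how a third-party MCP service gets called between the user's question and the AI's answer. Original video: 10分钟讲清楚 Prompt, Agent, MCP 是什么
[Figure: video screenshot of the flow from user question to MCP call to AI response]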

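2. Xiaozhi AI MCP server

Back to Xiaozhi AI: the source code provided by 虾哥 implements an MCP server. As of July 20, 2025 the feature is still in internal beta; see the Xiaozhi official website for usage instructions.

I won't repeat the MCP message format here. 菜鸟教程 and /docs/mcp-protocol.md in the xiaozhi-esp32 source both cover it in detail. This note focuses only on MCP's core logic.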
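
To make the rest of the walkthrough concrete, here is a minimal sketch of what a tools/call request looks like on the wire, based on the JSON-RPC 2.0 envelope and the field names defined by MCP. BuildToolCallRequest is a hypothetical helper written for this note, not a function from xiaozhi-esp32, and it assumes the tool name needs no JSON escaping.

```cpp
#include <string>

// Hypothetical helper (not part of xiaozhi-esp32): builds a JSON-RPC 2.0
// "tools/call" request. `arguments_json` must already be a valid JSON object,
// for example {"volume":50}. The tool name is inserted without escaping.
std::string BuildToolCallRequest(int id, const std::string& tool_name,
                                 const std::string& arguments_json) {
    std::string payload = "{\"jsonrpc\":\"2.0\",\"id\":" + std::to_string(id);
    payload += ",\"method\":\"tools/call\",\"params\":{\"name\":\"" + tool_name + "\"";
    payload += ",\"arguments\":" + arguments_json + "}}";
    return payload;
}
```

Calling it with id 1, the tool name self.audio_speaker.set_volume and the arguments {"volume":50} yields a single-line JSON object whose method is tools/call and whose params carry the name and arguments.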

Let's look at the MCP server source. The key class is McpServer, which registers tools, parses incoming messages, and invokes tools.

class McpServer {
public:
    static McpServer& GetInstance() {
        static McpServer instance;
        return instance;
    }

    void AddCommonTools();
    void AddTool(McpTool* tool);
    void AddTool(const std::string& name, const std::string& description, const PropertyList& properties, std::function<ReturnValue(const PropertyList&)> callback);
    void ParseMessage(const cJSON* json);
    void ParseMessage(const std::string& message);

private:
    McpServer();
    ~McpServer();

    void ParseCapabilities(const cJSON* capabilities);

    void ReplyResult(int id, const std::string& result);
    void ReplyError(int id, const std::string& message);

    void GetToolsList(int id, const std::string& cursor);
    void DoToolCall(int id, const std::string& tool_name, const cJSON* tool_arguments, int stack_size);

    std::vector<McpTool*> tools_;
    std::thread tool_call_thread_;
};

AddCommonTools()
This method registers the built-in tools. It is called from Application::Start().

    // Add MCP common tools before initializing the protocol
#if CONFIG_IOT_PROTOCOL_MCP
    McpServer::GetInstance().AddCommonTools();
#endif

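There is nothing mysterious about the implementation: it calls AddTool to push each tool's name, description, parameters, and callback into the tools_ vector.
For example, the tool that sets the volume: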

    AddTool("self.audio_speaker.set_volume", 
        "Set the volume of the audio speaker. If the current volume is unknown, you must call `self.get_device_status` tool first and then call this tool.",
        PropertyList({
            Property("volume", kPropertyTypeInteger, 0, 100)
        }), 
        [&board](const PropertyList& properties) -> ReturnValue {
            auto codec = board.GetAudioCodec();
            codec->SetOutputVolume(properties["volume"].value<int>());
            return true;
        });
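
The Property("volume", kPropertyTypeInteger, 0, 100) argument above declares an integer with an allowed range. The article does not show the Property class itself, so the following is only a sketch of how such a range check could work. IntProperty and its members are hypothetical names, and throwing std::invalid_argument is an assumption chosen to fit the try/catch seen later in DoToolCall.

```cpp
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for an integer property with an allowed range.
class IntProperty {
public:
    IntProperty(std::string name, int min_value, int max_value)
        : name_(std::move(name)), min_(min_value), max_(max_value), value_(min_value) {}

    // Rejects out-of-range values instead of silently clamping them.
    void set_value(int v) {
        if (v < min_ || v > max_) {
            throw std::invalid_argument(name_ + " out of range: " + std::to_string(v));
        }
        value_ = v;
    }

    int value() const { return value_; }
    const std::string& name() const { return name_; }

private:
    std::string name_;
    int min_;
    int max_;
    int value_;
};
```

Rejecting a bad value up front means the tool callback can trust its arguments, and the caller gets an error message it can pass back to the LLM.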

void McpServer::AddTool(McpTool* tool) {
    // Prevent adding duplicate tools
    if (std::find_if(tools_.begin(), tools_.end(), [tool](const McpTool* t) { return t->name() == tool->name(); }) != tools_.end()) {
        ESP_LOGW(TAG, "Tool %s already added", tool->name().c_str());
        return;
    }

    ESP_LOGI(TAG, "Add tool: %s", tool->name().c_str());
    tools_.push_back(tool);
}

void McpServer::AddTool(const std::string& name, const std::string& description, const PropertyList& properties, std::function<ReturnValue(const PropertyList&)> callback) {
    AddTool(new McpTool(name, description, properties, callback));
}
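
The duplicate check in AddTool can be tried outside the firmware with a simplified registry. This is a sketch under stated assumptions: SimpleTool and ToolRegistry are hypothetical types, the callback takes a single int rather than a PropertyList, and std::unique_ptr replaces the raw new used above so the registry owns its tools.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical simplified tool: a name plus a callback taking one int.
struct SimpleTool {
    std::string name;
    std::function<bool(int)> callback;
};

class ToolRegistry {
public:
    // Returns false (and keeps the existing entry) when the name is already taken.
    bool Add(std::string name, std::function<bool(int)> callback) {
        if (Find(name) != nullptr) {
            return false;
        }
        tools_.push_back(std::make_unique<SimpleTool>(
            SimpleTool{std::move(name), std::move(callback)}));
        return true;
    }

    // Linear search by name; returns nullptr when no tool matches.
    const SimpleTool* Find(const std::string& name) const {
        auto it = std::find_if(tools_.begin(), tools_.end(),
                               [&name](const std::unique_ptr<SimpleTool>& t) {
                                   return t->name == name;
                               });
        return it == tools_.end() ? nullptr : it->get();
    }

    std::size_t size() const { return tools_.size(); }

private:
    std::vector<std::unique_ptr<SimpleTool>> tools_;
};
```

A linear search is reasonable here because a device registers only a small number of tools.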

ParseMessage()
Parses the received JSON and validates its format. If the method is a tool call, it invokes DoToolCall to run the corresponding tool.

void McpServer::ParseMessage(const cJSON* json) {
    // Check JSONRPC version
    auto version = cJSON_GetObjectItem(json, "jsonrpc");
    if (version == nullptr || !cJSON_IsString(version) || strcmp(version->valuestring, "2.0") != 0) {
        ESP_LOGE(TAG, "Invalid JSONRPC version: %s", version ? version->valuestring : "null");
        return;
    }
    
    // Check method
    auto method = cJSON_GetObjectItem(json, "method");
    if (method == nullptr || !cJSON_IsString(method)) {
        ESP_LOGE(TAG, "Missing method");
        return;
    }
    
    ...
    
    if (method_str == "tools/call") {
    	...
        DoToolCall(id_int, std::string(tool_name->valuestring), tool_arguments, stack_size ? stack_size->valueint : DEFAULT_TOOLCALL_STACK_SIZE);
    } else {
        ESP_LOGE(TAG, "Method not implemented: %s", method_str.c_str());
        ReplyError(id_int, "Method not implemented: " + method_str);
    }
}
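
The order of checks in ParseMessage (version, then method, then dispatch) can be summarized in a small sketch. Request, Action, and Classify are hypothetical names; the real code works on cJSON objects and handles more than the two methods quoted in this note.

```cpp
#include <string>

// Hypothetical, already-parsed view of an incoming JSON-RPC message.
struct Request {
    std::string jsonrpc;
    std::string method;
    int id;
};

enum class Action { kReject, kListTools, kCallTool, kNotImplemented };

// Mirrors the excerpt: invalid envelopes are dropped, known methods are
// dispatched, and anything else is answered with an error reply.
Action Classify(const Request& req) {
    if (req.jsonrpc != "2.0" || req.method.empty()) {
        return Action::kReject;          // logged and dropped, no reply sent
    }
    if (req.method == "tools/list") {
        return Action::kListTools;       // handled by GetToolsList
    }
    if (req.method == "tools/call") {
        return Action::kCallTool;        // handled by DoToolCall
    }
    return Action::kNotImplemented;      // handled by ReplyError
}
```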

DoToolCall()
Looks up the tool, creates a new thread, and calls the tool's callback (the function implemented by the third party) on that thread.

void McpServer::DoToolCall(int id, const std::string& tool_name, const cJSON* tool_arguments, int stack_size) {
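    // Look up the tool by tool_name in tools_
    auto tool_iter = std::find_if(tools_.begin(), tools_.end(),
                                 [&tool_name](const McpTool* tool) {
                                     return tool->name() == tool_name;
                                 });
    if (tool_iter == tools_.end()) {
        // Unknown tool: reply with an error instead of dereferencing end()
        ESP_LOGE(TAG, "tools/call: Unknown tool: %s", tool_name.c_str());
        ReplyError(id, "Unknown tool: " + tool_name);
        return;
    }

    // Parse the arguments for the callback
    PropertyList arguments = (*tool_iter)->properties();
    try {
        for (auto& argument : arguments) {
            ...
        }
    } catch (const std::exception& e) {
        // Invalid or missing argument: report it and skip the call
        ESP_LOGE(TAG, "tools/call: %s", e.what());
        ReplyError(id, e.what());
        return;
    }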

    // Start a task to receive data with stack size
    esp_pthread_cfg_t cfg = esp_pthread_get_default_config();
    cfg.thread_name = "tool_call";
    cfg.stack_size = stack_size;
    cfg.prio = 1;
    esp_pthread_set_cfg(&cfg);

    // Use a thread to call the tool to avoid blocking the main thread
    tool_call_thread_ = std::thread([this, id, tool_iter, arguments = std::move(arguments)]() {
        try {
            ReplyResult(id, (*tool_iter)->Call(arguments));
        } catch (const std::exception& e) {
            ESP_LOGE(TAG, "tools/call: %s", e.what());
            ReplyError(id, e.what());
        }
    });
    tool_call_thread_.detach();
}

ReplyResult() and ReplyError() wrap the result in a JSON-RPC response and send it out through the protocol (MQTT or WebSocket).

void McpServer::ReplyResult(int id, const std::string& result) {
    std::string payload = "{\"jsonrpc\":\"2.0\",\"id\":";
    payload += std::to_string(id) + ",\"result\":";
    payload += result;
    payload += "}";
    Application::GetInstance().SendMcpMessage(payload);
}

void Application::SendMcpMessage(const std::string& payload) {
    Schedule([this, payload]() {
        if (protocol_) {
            protocol_->SendMcpMessage(payload);
        }
    });
}
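
The article shows ReplyResult but not the body of ReplyError. As a hedged illustration only, the sketch below builds both kinds of payload following the JSON-RPC 2.0 response layout, where an error object carries a code and a message. BuildResultPayload, BuildErrorPayload, and EscapeJson are hypothetical names, and the actual ReplyError in xiaozhi-esp32 may format its payload differently.

```cpp
#include <string>

// Minimal escaping for a JSON string literal: quotes, backslashes, and
// the common whitespace escapes. Other control characters are not handled.
std::string EscapeJson(const std::string& s) {
    std::string out;
    for (char c : s) {
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\n': out += "\\n";  break;
            case '\r': out += "\\r";  break;
            case '\t': out += "\\t";  break;
            default:   out += c;      break;
        }
    }
    return out;
}

// `result_json` must already be valid JSON (the tool's serialized return value).
std::string BuildResultPayload(int id, const std::string& result_json) {
    return "{\"jsonrpc\":\"2.0\",\"id\":" + std::to_string(id) +
           ",\"result\":" + result_json + "}";
}

// Error layout per JSON-RPC 2.0: {"code": <int>, "message": <string>}.
std::string BuildErrorPayload(int id, int code, const std::string& message) {
    return "{\"jsonrpc\":\"2.0\",\"id\":" + std::to_string(id) +
           ",\"error\":{\"code\":" + std::to_string(code) +
           ",\"message\":\"" + EscapeJson(message) + "\"}}";
}
```

Escaping the message matters because error text often contains quotes, for example around a tool name, and an unescaped quote would make the whole payload invalid JSON.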

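From the source we now know the MCP server's core logic. Put simply, the server stores its functions in tools_, the MCP client later initiates a call, and the server parses the JSON and invokes the matching function. As for how the client learns which tools exist, ParseMessage() gives a hint: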

if (method_str == "tools/list") {
    std::string cursor_str = "";
    if (params != nullptr) {
        auto cursor = cJSON_GetObjectItem(params, "cursor");
        if (cJSON_IsString(cursor)) {
            cursor_str = std::string(cursor->valuestring);
        }
    }
    GetToolsList(id_int, cursor_str);
}

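I haven't read the client implementation yet, but it is easy to guess that the client sends a request to fetch the tool list, which is how it learns which tools exist.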
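
The cursor parameter suggests that the tool list can be returned in pages. The body of GetToolsList is not shown in this note, so the following is a sketch of one way cursor-based paging could work, using the tools and nextCursor fields that MCP defines for a tools/list result. BuildToolsPage and ToolsPage are hypothetical, and using the tool name as the cursor and a byte budget as the page limit are assumptions.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct ToolsPage {
    std::string json;         // {"tools":[...]} plus "nextCursor" when truncated
    std::string next_cursor;  // empty when this is the last page
};

// `tools` holds (name, already-serialized JSON) pairs. `cursor` is the name of
// the first tool to emit, or empty for the first page. `max_bytes` caps the
// size of the array body.
ToolsPage BuildToolsPage(const std::vector<std::pair<std::string, std::string>>& tools,
                         const std::string& cursor, std::size_t max_bytes) {
    ToolsPage page;
    std::string body;
    bool started = cursor.empty();
    for (const auto& [name, json] : tools) {
        if (!started) {
            if (name != cursor) continue;  // skip tools sent on earlier pages
            started = true;
        }
        // Stop before exceeding the budget, but always emit at least one tool.
        if (!body.empty() && body.size() + 1 + json.size() > max_bytes) {
            page.next_cursor = name;
            break;
        }
        if (!body.empty()) body += ",";
        body += json;
    }
    page.json = "{\"tools\":[" + body + "]";
    if (!page.next_cursor.empty()) {
        page.json += ",\"nextCursor\":\"" + page.next_cursor + "\"";
    }
    page.json += "}";
    return page;
}
```

A client would keep calling tools/list with the returned cursor until the response no longer contains nextCursor.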

Now we can register a tool of our own to control a peripheral.

Practice

Goal: control LED brightness by voice
First, initialize the RGB LED

// Configure the LEDC timer (requires #include "driver/ledc.h")
ledc_timer_config_t ledc_timer = {
    .speed_mode = LEDC_LOW_SPEED_MODE,
    .duty_resolution = LEDC_TIMER_8_BIT,  // 8-bit duty, range 0..255
    .timer_num = LEDC_TIMER_0,
    .freq_hz = 5000,                      // 5 kHz PWM frequency
};
ESP_ERROR_CHECK(ledc_timer_config(&ledc_timer));

// Configure the LEDC channel and bind it to the timer above
ledc_channel_config_t ledc_channel = {
    .gpio_num = GPIO_NUM_1,
    .speed_mode = LEDC_LOW_SPEED_MODE,
    .channel = LEDC_CHANNEL_0,
    .timer_sel = LEDC_TIMER_0,
    .duty = 0,
};
ESP_ERROR_CHECK(ledc_channel_config(&ledc_channel));

Then add the tool

AddTool("self.my_led.set_brightness",
    "Set the brightness of the blue LED. The brightness is a percentage value from 0 to 100.",
    PropertyList({
        Property("brightness", kPropertyTypeInteger, 0, 100)
    }),
    [](const PropertyList& properties) -> ReturnValue {
        uint32_t brightness = static_cast<uint32_t>(properties["brightness"].value<int>());

        ESP_LOGI(TAG, "my led set brightness %lu", brightness);
        if (brightness > 100) {
            brightness = 100;
        }
        uint32_t duty = (brightness * 255) / 100; // Convert to 8-bit duty cycle
        ESP_ERROR_CHECK(ledc_set_duty(LEDC_LOW_SPEED_MODE, LEDC_CHANNEL_0, duty));
        ESP_ERROR_CHECK(ledc_update_duty(LEDC_LOW_SPEED_MODE, LEDC_CHANNEL_0));
        return true;
    });
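
The percentage-to-duty conversion inside the callback can be pulled out into a pure function, which makes it easy to check off-device. BrightnessToDuty is a hypothetical helper; the resolution_bits parameter is an addition that generalizes the 8-bit case used above.

```cpp
#include <cstdint>

// Maps a 0..100 percentage onto a duty value for a timer with
// `resolution_bits` bits of resolution. Percentages above 100 are clamped.
// Intended for 1 <= resolution_bits <= 20, where the product fits in 32 bits.
constexpr uint32_t BrightnessToDuty(uint32_t percent, uint32_t resolution_bits = 8) {
    const uint32_t max_duty = (1u << resolution_bits) - 1;
    const uint32_t clamped = percent > 100 ? 100 : percent;
    return clamped * max_duty / 100;  // integer division rounds down
}
```

With the default 8-bit resolution, 0 maps to 0, 50 maps to 127, and 100 maps to 255, matching the (brightness * 255) / 100 expression in the callback.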