An In-Depth Evaluation of Claude 3.7 Sonnet's Hybrid Reasoning on a Cloud Platform (Part 2)

Latest recommended article published on 2025-05-18 20:17:36

佛州小李哥

Article link: https://blog.csdn.net/m0_66628975/article/details/145916922

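In Part 1 of this series, we used a code example to compare the API responses of Claude 3.7 Sonnet with reasoning enabled and disabled, which helped clarify the difference between Claude 3.7 Sonnet's standard mode and its extended thinking mode. In this part, we use another code example to show how Claude combines reasoning with tools: it decides whether a tool is needed and then uses that tool to solve the problem.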

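Code example: reasoning with additional tools

This example shows how Claude 3.7 Sonnet uses its reasoning ability to decide whether a tool is needed, and how it then uses the tool to answer the user. The code below illustrates how the model thinks before solving the problem and decides whether to call a tool.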

import boto3
import json
from botocore.exceptions import ClientError

def tool_use_with_reasoning():
    """
    Demonstrates how to use Claude 3.7 Sonnet with tools and reasoning enabled.
    Shows how the model thinks through a problem before using tools.
    """
    # Create the Amazon Bedrock runtime client
    client = boto3.client("bedrock-runtime", region_name="us-east-1")
    
    # Specify the model ID for Claude 3.7 Sonnet
    model_id = "us.anthropic.claude-3-7-sonnet-20250219-v1:0"
    
    # Example prompt that requires tools
    prompt = "I need to calculate the compound interest on an investment of $5,000 with an annual interest rate of 6.5% compounded monthly for 8 years."
    
    # Define a calculator tool
    tools = [
        {
            "toolSpec": {
                "name": "calculator",
                "description": "Evaluate mathematical expressions and return the result.",
                "inputSchema": {
                    "json": {
                        "type": "object",
                        "properties": {
                            "expression": {
                                "type": "string",
                                "description": "The mathematical expression to evaluate."
                            }
                        },
                        "required": ["expression"]
                    }
                }
            }
        }
    ]
    
    try:
        # Send initial request with tools and reasoning enabled
        print(f"PROMPT: {prompt}\n")
        print("Sending request with tools and reasoning enabled...")
        
        response = client.converse(
            modelId=model_id,
            messages=[
                {
                    "role": "user",
                    "content": [{"text": prompt}]
                }
            ],
            inferenceConfig={"maxTokens": 8000},  # Must be higher than budget_tokens
            toolConfig={"tools": tools},  # Define available tools
            additionalModelRequestFields={
                "thinking": {
                    "type": "enabled",
                    "budget_tokens": 4000  # Allocate tokens for thinking
                }
            }
        )
        
        # Check if the model wants to use a tool
        if response["stopReason"] == "tool_use":
            content_blocks = response["output"]["message"]["content"]
            thinking = None
            tool_use = None
            
            # Extract thinking content and remove it from content_blocks
            filtered_content_blocks = []
            for block in content_blocks:
                if "reasoningContent" in block:
                    thinking = block["reasoningContent"]["reasoningText"]["text"]
                else:
                    filtered_content_blocks.append(block)
            
            # Now find the tool use in the filtered blocks
            for block in filtered_content_blocks:
                if "toolUse" in block:
                    tool_use = block["toolUse"]
            
            # Display the thinking process
            if thinking:
                print("--- THINKING PROCESS ---")
                print(thinking)
                print()
            
            # Handle the tool use request
            if tool_use:
                tool_name = tool_use.get("name")
                tool_input = tool_use.get("input", {})
                tool_id = tool_use.get("toolUseId")
                
                print("--- TOOL REQUEST ---")
                print(f"Tool: {tool_name}")
                print(f"Input: {json.dumps(tool_input, indent=2)}")
                
                # Execute the calculator tool
                if tool_name == "calculator":
                    expression = tool_input.get("expression", "")
                    try:
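                        # Python uses ** for exponentiation (^ is bitwise XOR),
                        # so replace ^ with ** before evaluating.
                        # NOTE: eval() executes arbitrary code. It is acceptable
                        # for a demo, but avoid it for model-generated input in production.
                        expression = expression.replace("^", "**")
                        result = str(eval(expression))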
                        print(f"Result: {result}")
                    except Exception as e:
                        result = f"Error: {str(e)}"
                        print(f"Result: {result}")
                else:
                    # If we don't recognize the tool, return an error
                    result = "Error: Unknown tool"
                    print(f"Result: {result}")
                
                # Send follow-up with tool result
                print("\nSending follow-up with tool result...")
                
                follow_up_response = client.converse(
                    modelId=model_id,
                    messages=[
                        {
                            "role": "user",
                            "content": [{"text": prompt}]
                        },
                        {
                            "role": "assistant",
                            "content": filtered_content_blocks  # Use filtered blocks without thinking
                        },
                        {
                            "role": "user",
                            "content": [
                                {
                                    "toolResult": {  # Provide tool result in the correct format
                                        "toolUseId": tool_id,
                                        "content": [{"text": result}]
                                    }
                                }
                            ]
                        }
                    ],
                    inferenceConfig={"maxTokens": 8000},
                    toolConfig={"tools": tools}  # Must include toolConfig in follow-up
                )
                
                # Extract and display the final response
                final_text = None
                for block in follow_up_response["output"]["message"]["content"]:
                    if "text" in block:
                        final_text = block["text"]
                        break
                
                print("\n--- FINAL RESPONSE ---")
                print(final_text)
            else:
                print("--- NO TOOL USE REQUESTED ---")
                print("The model responded with a stop reason of 'tool_use' but no tool use block was found.")
        else:
            # If the model didn't request a tool, display the direct response
            content_blocks = response["output"]["message"]["content"]
            thinking = None
            response_text = None
            
            for block in content_blocks:
                if "reasoningContent" in block:
                    thinking = block["reasoningContent"]["reasoningText"]["text"]
                elif "text" in block:
                    response_text = block["text"]
            
            if thinking:
                print("--- THINKING PROCESS ---")
                print(thinking)
                print()
            
            print("--- RESPONSE (No tool used) ---")
            print(response_text)
    
    except (ClientError, Exception) as e:
        print(f"ERROR: Can't invoke Claude 3.7 Sonnet. Reason: {e}")
        raise SystemExit(1)  # same effect as exit(1), without relying on the site module

if __name__ == "__main__":
    tool_use_with_reasoning()
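
The calculator branch in the example passes model-generated text to eval(). As a minimal sketch of a stricter alternative, assuming the tool only needs basic arithmetic, the expression can be parsed with Python's ast module and limited to a whitelist of operators. The name safe_calculate is hypothetical and not part of the original example.

```python
import ast
import operator

# Arithmetic operators this sketch allows; anything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_calculate(expression: str):
    """Evaluate a plain arithmetic expression without calling eval()."""
    # Same convention as the example above: treat ^ as exponentiation.
    tree = ast.parse(expression.replace("^", "**"), mode="eval")

    def _evaluate(node):
        # Numeric literals only (bool is excluded on purpose).
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_evaluate(node.operand))
        # Names, calls, attribute access, etc. are all refused.
        raise ValueError(f"Unsupported expression element: {type(node).__name__}")

    return _evaluate(tree.body)


if __name__ == "__main__":
    print(safe_calculate("5000 * (1 + 0.065/12)^(12*8)"))
```

This sketch rejects function calls and names, but it does not limit the size of exponents, so a production version would likely need extra bounds checks.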

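Output analysis: reasoning with additional tools

I ran a test in which the prompt was a request for a compound interest calculation:

I need to calculate the compound interest on an investment of $5,000 with an annual interest rate of 6.5% compounded monthly for 8 years.

Claude 3.7 Sonnet first thought through the problem and then used the calculator tool to perform the calculation.

Thinking process: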

To calculate compound interest, I'll use the formula: A = P(1 + r/n)^(nt)

Where:
- A = final amount
- P = principal (initial investment)
- r = annual interest rate (as a decimal)
- n = number of times interest is compounded per year
- t = time in years

Given information:
- Principal (P) = $5,000
- Annual interest rate (r) = 6.5% = 0.065
- Compounding frequency (n) = 12 (monthly)
- Time (t) = 8 years

I need to calculate: A = 5000 * (1 + 0.065/12)^(12*8)

I'll use the calculator function to evaluate this expression

Tool request:

Tool: calculator
Input: {
"expression": "5000 * (1 + 0.065/12)^(12*8)"
}
Result: 8398.344845047079

Final response:

For your investment of $5,000 with an annual interest rate of 6.5% compounded monthly over 8 years:

- Initial investment: $5,000
- Final value: $8,398.34
- Compound interest earned: $3,398.34

The calculation used the compound interest formula: P(1 + r/n)^(nt), where:
- P = Principal ($5,000)
- r = Annual interest rate (6.5% or 0.065)
- n = Number of times interest is compounded per year (12 for monthly compounding)
- t = Time in years (8 years)
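
The figures in this response can be checked independently of the model. The short sketch below applies the same formula, A = P(1 + r/n)^(nt), to the values from the prompt. The function name compound_amount is only illustrative.

```python
def compound_amount(principal, annual_rate, periods_per_year, years):
    """Return A = P * (1 + r/n) ** (n * t)."""
    return principal * (1 + annual_rate / periods_per_year) ** (periods_per_year * years)


# Values taken from the prompt: $5,000 at 6.5%, compounded monthly for 8 years.
amount = compound_amount(5000, 0.065, 12, 8)
interest = amount - 5000

print(f"Final value: ${amount:,.2f}")
print(f"Compound interest earned: ${interest:,.2f}")
```

Rounded to cents, the result should agree with the final value and interest that Claude reported above.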

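Analysis

The experiment above shows how Claude 3.7 Sonnet's reasoning ability works together with tool calls:

Structured problem solving – Claude first breaks down the problem, identifies the formula and variables required, and then determines that it needs a calculation tool.
Correct tool selection – It identifies the calculator tool as the best choice for evaluating the mathematical expression.
Formula conversion – Claude correctly converts the compound interest formula A = P(1 + r/n)^(nt) into an expression that can be evaluated.
Complete response – After receiving the result, Claude gives a detailed answer that includes the calculation steps and the final result.
Full tool-use workflow – The run shows the complete lifecycle: reasoning → tool request → result handling → final answer.
Intelligent judgment – Claude does not call the tool blindly. It reasons first, decides that the tool is necessary, uses it to solve the problem, and finally explains the result.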
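
The "result handling" step of that lifecycle comes down to assembling a three-message history for the second request. The sketch below isolates that step as a pure function on plain dictionaries, following the message shapes used in the example code. The helper name build_follow_up_messages is an assumption made for illustration.

```python
def build_follow_up_messages(prompt, assistant_blocks, tool_use_id, result_text):
    """Assemble the message history sent after a tool has been run."""
    # Mirror the example code: drop reasoningContent blocks, because the
    # follow-up request there does not re-enable thinking.
    replayed_blocks = [b for b in assistant_blocks if "reasoningContent" not in b]
    return [
        {"role": "user", "content": [{"text": prompt}]},
        {"role": "assistant", "content": replayed_blocks},
        {
            "role": "user",
            "content": [
                {
                    "toolResult": {
                        "toolUseId": tool_use_id,
                        "content": [{"text": result_text}],
                    }
                }
            ],
        },
    ]


# Hand-written sample blocks, not real model output.
sample_blocks = [
    {"reasoningContent": {"reasoningText": {"text": "I should use the calculator."}}},
    {"toolUse": {"toolUseId": "tool-1", "name": "calculator", "input": {"expression": "1+1"}}},
]
messages = build_follow_up_messages("What is 1+1?", sample_blocks, "tool-1", "2")
print([m["role"] for m in messages])
```

Keeping this step separate from the network call makes it easy to unit test without access to Amazon Bedrock.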

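API implementation details worth noting

When using Claude 3.7 Sonnet's reasoning capability, keep the following technical details in mind:

Reasoning and parameter settings:
- Reasoning mode is not compatible with adjusting parameters such as temperature, top_p, and top_k, and tool use cannot be forced. When comparing standard mode with reasoning mode, I used the default parameters to keep the comparison fair.
Budget token settings:
- The reasoning budget is allocated through the budget_tokens parameter.
- The minimum is 1,024 tokens, but 4,000+ tokens are recommended for complex problems.
Maximum token settings:
- maxTokens must be greater than budget_tokens; a maxTokens of at least twice budget_tokens is generally recommended.
Follow-up requests with tool results:
- When sending tool results in a follow-up request that does not enable thinking, as in this example, the reasoning content (reasoningContent blocks) must be filtered out to avoid validation errors.
Consistent tool configuration in follow-up requests:
- When sending tool results back to the model, the request must include the same toolConfig.
Python exponent operator:
- Claude uses ^ as the exponent operator, whereas Python uses ** (in Python, ^ is bitwise XOR); the code handles this conversion automatically.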
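
The token rules above can be checked before a request is sent. The following sketch validates the two constraints stated in this list and returns the keyword arguments used by the converse call in the example. The function name build_thinking_kwargs and the constant name are hypothetical.

```python
MIN_THINKING_BUDGET = 1024  # minimum budget stated in this article


def build_thinking_kwargs(budget_tokens, max_tokens):
    """Validate the token settings and return keyword arguments for converse()."""
    if budget_tokens < MIN_THINKING_BUDGET:
        raise ValueError(f"budget_tokens must be at least {MIN_THINKING_BUDGET}")
    if max_tokens <= budget_tokens:
        raise ValueError("maxTokens must be greater than budget_tokens")
    return {
        "inferenceConfig": {"maxTokens": max_tokens},
        "additionalModelRequestFields": {
            "thinking": {"type": "enabled", "budget_tokens": budget_tokens}
        },
    }


# Same values as the example code: a 4,000-token budget inside an 8,000-token limit.
kwargs = build_thinking_kwargs(4000, 8000)
print(kwargs)
# Intended use (requires AWS credentials, so it is not executed here):
# client.converse(modelId=model_id, messages=messages, toolConfig={"tools": tools}, **kwargs)
```

Failing early with a clear message is usually easier to debug than a validation error returned by the service.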

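Application scenarios for extended reasoning

Claude 3.7 Sonnet's hybrid reasoning opens up new possibilities in several application scenarios:

Educational tools – Show students how to approach complex problems through a step-by-step reasoning process.
Research assistants – Break down complex research questions and turn them into logical steps.
Math and science problem solving – Handle multi-step calculations while providing a transparent reasoning process.
Decision transparency – Explain the reasoning behind the AI's recommendations or conclusions to increase trust.
Complex planning – Generate detailed plans and clearly explain the rationale for each step.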

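Best practices when using Claude 3.7

To get the most out of Claude 3.7 Sonnet's reasoning capability, follow these best practices:

1. Adjust the budget to the complexity of the problem:

Complex problems need a reasoning budget of 6,000+ tokens, while simple problems can use a lower budget.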
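
One simple way to apply this is a lookup table that maps a rough complexity label to a budget. In the sketch below, the 1,024, 4,000, and 6,000 figures come from this article, while the tier names and the helper pick_token_settings are assumptions chosen for illustration.

```python
# Illustrative tiers only: the numbers come from the article, the labels do not.
THINKING_BUDGETS = {"simple": 1024, "moderate": 4000, "complex": 6000}


def pick_token_settings(complexity):
    """Return (budget_tokens, max_tokens) for a rough complexity label."""
    if complexity not in THINKING_BUDGETS:
        raise ValueError(f"Unknown complexity label: {complexity!r}")
    budget = THINKING_BUDGETS[complexity]
    # Follow the article's guideline of maxTokens being about twice the budget.
    return budget, budget * 2


for label in THINKING_BUDGETS:
    print(label, pick_token_settings(label))
```

How a request gets its label is left open here; in practice it might be a fixed setting for each feature of the application.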

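2. Explicitly ask for step-by-step thinking:

Phrases such as "Think step by step" or "Show your work" in the prompt can guide the model to provide more detailed reasoning.

3. Balance performance and cost:

Detailed reasoning increases token consumption and response time, so enable it only when in-depth reasoning is really needed.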
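
To make this trade-off measurable, the token counts and latency of each call can be logged and compared between requests with and without reasoning. The sketch below assumes the response dictionary carries the usage and metrics fields described for the Converse API (inputTokens, outputTokens, totalTokens, latencyMs). The helper name summarize_usage is hypothetical.

```python
def summarize_usage(response):
    """Pull token counts and latency out of a Converse-style response dict."""
    usage = response.get("usage", {})
    metrics = response.get("metrics", {})
    return {
        "input_tokens": usage.get("inputTokens"),
        "output_tokens": usage.get("outputTokens"),
        "total_tokens": usage.get("totalTokens"),
        "latency_ms": metrics.get("latencyMs"),
    }


# Hand-written sample values, not measurements from a real call.
sample_response = {
    "usage": {"inputTokens": 120, "outputTokens": 900, "totalTokens": 1020},
    "metrics": {"latencyMs": 8500},
}
print(summarize_usage(sample_response))
```

Missing fields come back as None rather than raising, so the helper can be called on any response without extra checks.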

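4. Review the reasoning process:

Reviewing Claude's reasoning helps uncover logical gaps or errors that may not be visible in the final answer.

5. Write robust code:

When using reasoning and tools in production, plan exception handling in advance for varying response structures and potentially erroneous replies, so that the code stays stable.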
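
As one possible starting point, the response parsing can be written defensively so that a missing key or an unexpected block type does not raise. The sketch below is an assumption about how such a parser might look, not part of the original example. It collects every text and tool-use block, and sets aside anything it does not recognize, including reasoning blocks that carry no readable reasoningText.

```python
def parse_content_blocks(response):
    """Sort the content blocks of a Converse-style response by type."""
    message = response.get("output", {}).get("message", {})
    parsed = {"thinking": [], "text": [], "tool_uses": [], "unknown": []}
    for block in message.get("content", []):
        if "reasoningContent" in block:
            reasoning_text = block["reasoningContent"].get("reasoningText")
            if reasoning_text:
                parsed["thinking"].append(reasoning_text.get("text", ""))
            else:
                # Reasoning without readable text: keep it, but do not index into it.
                parsed["unknown"].append(block)
        elif "toolUse" in block:
            parsed["tool_uses"].append(block["toolUse"])
        elif "text" in block:
            parsed["text"].append(block["text"])
        else:
            parsed["unknown"].append(block)
    return parsed


# Hand-written sample response, not real model output.
sample = {
    "output": {
        "message": {
            "content": [
                {"reasoningContent": {"reasoningText": {"text": "Use the formula."}}},
                {"text": "Let me calculate that."},
                {"toolUse": {"toolUseId": "t1", "name": "calculator", "input": {}}},
                {"reasoningContent": {}},
            ]
        }
    }
}
result = parse_content_blocks(sample)
print(result["thinking"], result["text"], len(result["tool_uses"]), len(result["unknown"]))
```

Unlike the example code, which keeps only the last tool-use block and the first text block, this version keeps all of them, so the caller can decide how to handle multiple tool requests.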

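Conclusion

Claude 3.7 Sonnet's hybrid reasoning is an important step forward for transparency in AI reasoning. By exposing its step-by-step reasoning, Claude helps developers and users better understand, verify, and trust its output. The experiments in this series demonstrated reasoning used on its own and in combination with tools. Whether evaluating a complex mathematical expression or laying out a transparent chain of logic, Claude 3.7 Sonnet can provide clearer and more trustworthy answers.

As you explore Claude 3.7 Sonnet on Amazon Bedrock, consider how its reasoning ability could enhance the AI applications you build, enabling deeper insight into content, stronger explanations, and a more transparent problem-solving process. Have you tried Claude 3.7 Sonnet's reasoning? What interesting use cases have you found? Feel free to share your experience in the comments!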