Gradio in Depth 8 - Chatbot: Multimodal Chatbots Powered by LLMs (6) - Building UIs That Display Thinking and Citations

This chapter's table of contents:

  1. "Chatbot: Multimodal Chatbots Powered by LLMs (1) - Quickly Creating a Chatbot with gr.ChatInterface()"
  2. "Chatbot: Multimodal Chatbots Powered by LLMs (2) - Combining gr.ChatInterface with Popular LLM Libraries and APIs"
  3. "Chatbot: Multimodal Chatbots Powered by LLMs (3) - The Chatbot and ChatMessage Components"
  4. "Chatbot: Multimodal Chatbots Powered by LLMs (4) - Creating a Custom Chatbot with Blocks"
  5. "Chatbot: Multimodal Chatbots Powered by LLMs (5) - Special Events of Chatbot"
  6. "Chatbot: Multimodal Chatbots Powered by LLMs (6) - Building UIs That Display Thinking and Citations"

Summary

This installment shows how to use the Chatbot component's metadata feature to build chat UIs that display a model's intermediate thinking (demonstrated with Gemini 2.0 Flash Thinking) and its source citations (demonstrated with Claude's citations API).

8. Chatbot: Multimodal Chatbots Powered by LLMs

This chapter covers how to create chatbots with Gradio. Chatbots are a popular application of large language models (LLMs), and with Gradio we can easily build an LLM demo and share it with other users, or experiment on our own through an intuitive chat interface. The main topics are: quickly creating a Chatbot with gr.ChatInterface, combining it with popular LLM libraries and APIs, using Agents and Tools, creating a Chatbot with Blocks, the Chatbot component's special events, and building UIs that display thinking and citations.

8.6 Building UIs That Display Thinking and Citations

Gradio's Chatbot component can natively display intermediate thoughts and tool usage (see the earlier coverage of metadata), which makes it well suited to building UIs for LLM agents and chain-of-thought (CoT) demos. This section shows how to use gr.Chatbot and gr.ChatInterface to display thinking and tool usage, covering two cases: a UI that shows the model's thinking and a UI that shows citations. For building UIs around LLM agents, see the following chapters.

8.6.1 Building with an LLM That Displays Its Thinking

With metadata, Gradio's Chatbot component can natively display a reasoning LLM's intermediate thoughts, making it ideal for interfaces that show how an AI model "thinks" while generating a response. Below we build a chatbot that displays Gemini's thinking process in real time.
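
The mechanism is simple: an assistant message whose metadata dict contains a "title" is rendered as a collapsible section. Here is a minimal sketch (the message contents are invented purely for illustration):

import gradio as gr
from gradio import ChatMessage

with gr.Blocks() as demo:
    chatbot = gr.Chatbot(type="messages", value=[
        ChatMessage(role="user", content="What is 17 * 23?"),
        # A "title" in metadata makes this message render as a collapsible thought
        ChatMessage(role="assistant",
                    content="17 * 23 = 17 * 20 + 17 * 3 = 340 + 51 = 391",
                    metadata={"title": "🧠 Thinking"}),
        ChatMessage(role="assistant", content="17 * 23 = 391"),
    ])

demo.launch()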

Let's create a complete chatbot that streams both its thoughts and its answer in real time. We will use Google's Gemini API to access the Gemini 2.0 Flash Thinking model, and Gradio to build the user interface. Start by importing the libraries and configuring the Gemini client with your API key:

import os
import gradio as gr
from gradio import ChatMessage
from typing import Iterator
import google.generativeai as genai

# get the Gemini API key from the environment variable
GEMINI_API_KEY = os.getenv("GEMINI_API_KEY")
genai.configure(api_key=GEMINI_API_KEY)

# we will be using the Gemini 2.0 Flash model with Thinking capabilities
model = genai.GenerativeModel("gemini-2.0-flash-thinking-exp-1219")

Then define a helper that formats the chat history for Gemini, plus the streaming function that processes the model's output:

def format_chat_history(messages: list) -> list:
    """
    Formats the chat history into a structure Gemini can understand
    """
    formatted_history = []
    for message in messages:
        # Skip thinking messages (messages with metadata)
        if not (message.get("role") == "assistant" and "metadata" in message):
            formatted_history.append({
                "role": "user" if message.get("role") == "user" else "assistant",
                "parts": [message.get("content", "")]
            })
    return formatted_history

def stream_gemini_response(user_message: str, messages: list) -> Iterator[list]:
    """
    Streams thoughts and response with conversation history support.
    """
    try:
        print(f"\n=== New Request ===")
        print(f"User message: {user_message}")
        
        # Format chat history for Gemini
        chat_history = format_chat_history(messages)
        
        # Initialize Gemini chat
        chat = model.start_chat(history=chat_history)
        response = chat.send_message(user_message, stream=True)
        
        # Initialize buffers and flags
        thought_buffer = ""
        response_buffer = ""
        thinking_complete = False
        
        # Add initial thinking message
        messages.append(ChatMessage(role="assistant",content="",
                metadata={"title": "⚙️ Thinking: *The thoughts produced by the model are experimental"}))
        
        for chunk in response:
            parts = chunk.candidates[0].content.parts
            current_chunk = parts[0].text
            
            if len(parts) == 2 and not thinking_complete:
                # Complete thought and start response
                thought_buffer += current_chunk
                print(f"\n=== Complete Thought ===\n{thought_buffer}")
                
                messages[-1] = ChatMessage(role="assistant", content=thought_buffer,
                    metadata={"title": "⚙️ Thinking: *The thoughts produced by the model are experimental"})
                yield messages
                
                # Start response
                response_buffer = parts[1].text
                print(f"\n=== Starting Response ===\n{response_buffer}")
                
                messages.append(ChatMessage(role="assistant", content=response_buffer))
                thinking_complete = True
                
            elif thinking_complete:
                # Stream response
                response_buffer += current_chunk
                print(f"\n=== Response Chunk ===\n{current_chunk}")
                
                messages[-1] = ChatMessage(role="assistant", content=response_buffer)
                
            else:
                # Stream thinking
                thought_buffer += current_chunk
                print(f"\n=== Thinking Chunk ===\n{current_chunk}")
                
                messages[-1] = ChatMessage(role="assistant", content=thought_buffer,
                    metadata={"title": "⚙️ Thinking: *The thoughts produced by the model are experimental"})
            
            yield messages
            
        print(f"\n=== Final Response ===\n{response_buffer}")
                
    except Exception as e:
        print(f"\n=== Error ===\n{str(e)}")
        messages.append(ChatMessage(role="assistant",
                content=f"I apologize, but I encountered an error: {str(e)}"))
        yield messages

def user_message(msg: str, history: list) -> tuple[str, list]:
    """Adds user message to chat history"""
    history.append(ChatMessage(role="user", content=msg))
    return "", history

Finally, create the Gradio interface:

# Create the Gradio interface
with gr.Blocks(theme=gr.themes.Citrus(), fill_height=True) as demo:
    gr.Markdown("# Chat with Gemini 2.0 Flash and See its Thoughts 💭")

    chatbot = gr.Chatbot(type="messages", label="Gemini2.0 'Thinking' Chatbot",
        render_markdown=True, scale=1,
        avatar_images=(None,"https://lh3.googleusercontent.com/oxz0sUBF0iYoN4VvhqWTmux-cxfD1rxuYkuFEfm1SFaseXEsjjE4Je_C_V3UQPuJ87sImQK3HfQ3RXiaRnQetjaZbjJJUkiPL5jFJ1WRl5FKJZYibUA=w214-h214-n-nu")
    )

    with gr.Row(equal_height=True):
        input_box = gr.Textbox(lines=1, label="Chat Message",
            placeholder="Type your message here...", scale=4)
        clear_button = gr.Button("Clear Chat", scale=1)

    # Set up event handlers
    msg_store = gr.State("")  # Store for preserving user message
    
    input_box.submit(lambda msg: (msg, msg, ""),  # Store message and clear input
        inputs=[input_box], outputs=[msg_store, input_box, input_box], queue=False
    ).then(user_message,  # Add user message to chat
        inputs=[msg_store, chatbot], outputs=[input_box, chatbot], queue=False
    ).then(stream_gemini_response,  # Generate and stream response
        inputs=[msg_store, chatbot], outputs=chatbot)

    clear_button.click(lambda: ([], "", ""),
        outputs=[chatbot, input_box, msg_store], queue=False)

# Launch the interface
if __name__ == "__main__":
    demo.launch(debug=True)

The submit handler chains three steps: it stashes the message in msg_store while clearing the input box, appends the user message to the chat history, then streams the model's thoughts and answer into the chatbot. This creates a chatbot with the following features:

  • Displays the model's thinking process in a collapsible section;
  • Streams the thoughts and the final response in real time;
  • Maintains a clean chat history.

[Screenshot: the running demo]
We now have a chatbot that not only responds to users in real time but also shows its thinking process, creating a more transparent and engaging interaction. See the complete Gemini 2.0 Flash Thinking demo at: ysharma/Gemini2-Flash-Thinking

8.6.2 Building with Citations

Gradio chatbots can display citations drawn from an LLM's responses, which makes them ideal for interfaces that surface source documents and references. This subsection shows how to build a chatbot with Anthropic's Claude API (with the citations feature enabled) and a Gradio user interface, displaying both the response and the citations that support it.

First, import the libraries and define a default document to cite from. Note that in this demo the Anthropic API key is entered in the interface at runtime rather than read from an ANTHROPIC_API_KEY environment variable; a sketch of pre-filling it from the environment follows the code block below.

import gradio as gr
import anthropic
import base64
from typing import List, Dict, Any

# Default document content
DEFAULT_DOC = "The grass is pink and soil is green. The sky is red while the sun looks blue."
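
If you would rather not enter the key each time, you could pre-fill the API key textbox from the environment. This is a convenience sketch, not part of the original demo; the DEFAULT_API_KEY name is our own:

import os

# Hypothetical convenience: read a default key from the environment so the
# api_key textbox can be pre-filled (pass value=DEFAULT_API_KEY when creating it)
DEFAULT_API_KEY = os.getenv("ANTHROPIC_API_KEY", "")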

Next, set up the message formatting functions that handle the documents:

def read_pdf_as_base64(file_path: str) -> str:
    """Read a PDF file and return its base64 encoded content."""
    with open(file_path, 'rb') as file:
        return base64.b64encode(file.read()).decode('utf-8')
        
def user_message(user_input: str, history: list, enable_citations: bool,
    doc_type: str, text_content: str, pdf_file: str, api_key: str) -> tuple:
    # Logging 
    print("\n----------- User Message -------------")
    print(f"User Input: {user_input}")
    print(f"Citations Enabled: {enable_citations}")
    print(f"Document Type: {doc_type}")

    history.append({"role": "user", "content": user_input})
    return "", history

def format_message_history(history: list, enable_citations: bool,
    doc_type: str, text_content: str, pdf_files: str) -> List[Dict]:
    """Convert Gradio chat history to Anthropic message format."""
    formatted_messages = []

    # Add previous messages
    for msg in history[:-1]:
        if msg["role"] == "user":
            formatted_messages.append({"role": "user", "content": msg["content"]})
        elif msg["role"] == "assistant":
            if "metadata" not in msg or msg["metadata"] is None:
                formatted_messages.append({"role": "assistant",
                    "content": msg["content"]})

    # Prepare the latest message
    latest_message = {"role": "user", "content": []}

    # Add documents if citations are enabled
    if enable_citations:
        # Handle plain text input
        if doc_type in ["plain_text", "combined"] and text_content.strip():
            latest_message["content"].append({"type": "document",
                "source": {"type": "text","media_type": "text/plain",
                    "data": text_content.strip()},
                "title": "User Text Document",
                "citations": {"enabled": True}
            })

        # Handle PDF input
        if doc_type in ["pdf", "combined"] and pdf_files:
            # Handle pdf_files as a list
            if isinstance(pdf_files, str):
                pdf_files = [pdf_files]  # Convert single path to list
            
            # Add each PDF as a separate document
            for i, pdf_file in enumerate(pdf_files):
                try:
                    pdf_base64 = read_pdf_as_base64(pdf_file)
                    latest_message["content"].append({"type": "document",
                        "source": {"type": "base64", "media_type": "application/pdf",
                            "data": pdf_base64},
                        "title": f"User PDF Document {i+1}",
                        "citations": {"enabled": True}
                    })
                except Exception as e:
                    print(f"Error processing PDF {i+1}: {str(e)}")
                    continue

        # If no documents were added and citations are enabled, use default document
        if not latest_message["content"]:
            latest_message["content"].append({"type": "document",
                "source": {"type": "text", "media_type": "text/plain",
                    "data": DEFAULT_DOC},
                "title": "Sample Document",
                "citations": {"enabled": True}
            })

    # Add the user's question
    latest_message["content"].append({"type": "text", "text": history[-1]["content"]})
    formatted_messages.append(latest_message)
    return formatted_messages
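
To make the structure concrete, here is roughly what format_message_history produces for a single-turn chat with citations enabled and one plain-text document (the field values are invented for illustration, not output from a real run):

# Illustrative structure only; values are examples
example_payload = [
    {
        "role": "user",
        "content": [
            {
                "type": "document",
                "source": {"type": "text", "media_type": "text/plain",
                           "data": "The grass is pink and soil is green."},
                "title": "User Text Document",
                "citations": {"enabled": True},
            },
            {"type": "text", "text": "What color is the grass?"},
        ],
    },
]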

Then, create the bot response handler that processes the citations:

def bot_response(history: list, enable_citations: bool, doc_type: str,
    text_content: str, pdf_file: str, api_key: str) -> List[Dict[str, Any]]:
    try:
        if not api_key:
            history.append({"role": "assistant",
                "content": "Please provide your Anthropic API key to continue."})
            return history

        # Initialize client with provided API key
        client = anthropic.Anthropic(api_key=api_key)
        messages = format_message_history(history, enable_citations, doc_type, text_content, pdf_file)

        response = client.messages.create(model="claude-3-5-sonnet-20241022", max_tokens=1024, messages=messages)

        # Initialize main response and citations
        main_response = ""
        citations = []

        # Process each content block
        for block in response.content:
            if block.type == "text":
                main_response += block.text
                if enable_citations and hasattr(block, 'citations') and block.citations:
                    for citation in block.citations:
                        if citation.cited_text not in citations:
                            citations.append(citation.cited_text)

        # Add main response
        history.append({"role": "assistant", "content": main_response})

        # Add citations if any were found and citations are enabled
        if enable_citations and citations:
            history.append({"role": "assistant",
                "content": "\n".join([f"• {cite}" for cite in citations]),
                "metadata": {"title": "📚 Citations"}})

        return history

    except Exception as e:
        print(f"Error in bot_response: {str(e)}")
        error_message = str(e)
        if "401" in error_message:
            error_message = "Invalid API key. Please check your Anthropic API key and try again."
        history.append({"role": "assistant",
            "content": f"I apologize, but I encountered an error: {error_message}"})
        return history
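
The interface code below wires up an update_document_inputs callback that the trimmed listing does not include. Based on how it is connected (inputs: the citations checkbox and document type; outputs: the radio and the two document inputs), a minimal sketch might look like the following; the original implementation may differ:

def update_document_inputs(enable_citations: bool, doc_type: str):
    """Show or hide the document inputs based on the current settings (assumed behavior)."""
    return (
        # Hide the document-type radio entirely when citations are off
        gr.update(visible=enable_citations),
        # The text box applies to plain_text and combined modes
        gr.update(visible=enable_citations and doc_type in ["plain_text", "combined"]),
        # The PDF upload applies to pdf and combined modes
        gr.update(visible=enable_citations and doc_type in ["pdf", "combined"]),
    )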

Finally, create the Gradio interface:

with gr.Blocks(theme="ocean", fill_height=True) as demo:
    gr.Markdown("# Chat with Anthropic Claude's Citations")

    with gr.Row(scale=1):
        with gr.Column(scale=4):
            chatbot = gr.Chatbot(type="messages", bubble_full_width=False,
                show_label=False, scale=1)

            msg = gr.Textbox(placeholder="Enter your message here...",
                show_label=False, container=False)

        with gr.Column(scale=1):
            api_key = gr.Textbox(type="password", label="Anthropic API Key",
                placeholder="Enter your API key", info="Your API key will not be stored",
                interactive=True)

            enable_citations = gr.Checkbox(label="Enable Citations",
                value=True, info="Toggle citation functionality")

            doc_type_radio = gr.Radio(choices=["plain_text", "pdf", "combined"],
                value="plain_text", label="Document Type",
                info="Choose the type of document(s) to reference")

            text_input = gr.Textbox(label="Document Content", placeholder=f"Enter your document text here.\nDefault text will be picked if citations are enabled and you don't provide the documents. Default document is --{DEFAULT_DOC}", lines=10, info="Enter the text you want to reference")

            pdf_input = gr.File(label="Upload PDF", file_count="multiple",
                file_types=[".pdf"], type="filepath", visible=False)

    clear = gr.ClearButton([msg, chatbot, text_input, pdf_input])

    # Update input visibility based on settings
    enable_citations.change(update_document_inputs,
        inputs=[enable_citations, doc_type_radio],
        outputs=[doc_type_radio, text_input, pdf_input])

    doc_type_radio.change(update_document_inputs,
        inputs=[enable_citations, doc_type_radio],
        outputs=[doc_type_radio, text_input, pdf_input])

    # Handle message submission
    msg.submit(user_message,
        [msg, chatbot, enable_citations, doc_type_radio, text_input, pdf_input, api_key],
        [msg, chatbot], queue=False
    ).then(bot_response,
        [chatbot, enable_citations, doc_type_radio, text_input, pdf_input, api_key],
        chatbot)

if __name__ == "__main__":
    demo.launch(debug=True)

Note: the full source is long and has been trimmed here to show only the main logic (a minimal sketch of the omitted update_document_inputs helper is given above the interface code). This creates a chatbot with the following features:

  • Supports plain-text and PDF documents for Claude's citations;
  • Uses the metadata feature to display citations in a collapsible section;
  • Shows source quotations taken directly from the given documents.

[Screenshot: the running demo]
The citations feature pairs particularly well with the Gradio chatbot's metadata feature: by putting citations in collapsible sections, we keep the chat interface clean while keeping the source documents easy to reach. We now have a chatbot that not only responds to users but also shows its sources, creating a more transparent and trustworthy interaction. For the complete citations code, see: ysharma/anthropic-citations-with-gradio-metadata-key

References

  1. Gradio - Guides - Chatbots