The Complete Guide to Gradio, Chapter 8 - Chatbot: Multimodal Chatbots Powered by LLMs (6): Building UIs that Display Thinking and Citations
This chapter consists of the following parts:
- "Chatbot: Multimodal Chatbots Powered by LLMs (1): Quickly Creating a Chatbot with gr.ChatInterface()"
- "Chatbot: Multimodal Chatbots Powered by LLMs (2): Combining gr.ChatInterface with Popular LLM Libraries and APIs"
- "Chatbot: Multimodal Chatbots Powered by LLMs (3): The Chatbot and ChatMessage Components"
- "Chatbot: Multimodal Chatbots Powered by LLMs (4): Creating Custom Chatbots with Blocks"
- "Chatbot: Multimodal Chatbots Powered by LLMs (5): Special Events of Chatbot"
- "Chatbot: Multimodal Chatbots Powered by LLMs (6): Building UIs that Display Thinking and Citations"
Abstract
This part shows how to use Gradio to build chatbot UIs that display a model's intermediate thinking and its citations.
8. Chatbot: Multimodal Chatbots Powered by LLMs
This chapter explains how to create chatbots with Gradio. Chatbots are a popular application of large language models (LLMs); with Gradio, we can easily build an LLM demo and share it with other users, or experiment on our own through an intuitive chat interface. The chapter covers quickly creating a Chatbot with gr.ChatInterface, combining it with popular LLM libraries and APIs, using Agents and Tools, creating chatbots with Blocks, the Chatbot component's special events, and building UIs that display thinking and citations.
8.6 Building UIs that Display Thinking and Citations
Gradio's Chatbot component can natively display intermediate thoughts and tool usage (see the usage of metadata), which makes it ideal for building UIs for LLM agents and chain-of-thought (CoT) demos. This section shows how to use gr.Chatbot and gr.ChatInterface to display thinking and tool usage, covering two cases: building a UI that displays thinking, and building a UI with citations. For building UIs around LLM agents, see the following chapters.
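The mechanism behind this is the metadata field of chat messages: when an assistant message carries a metadata dict with a "title" key, the Chatbot renders that message as a collapsible section. A minimal self-contained sketch (the message contents are invented purely for illustration):
import gradio as gr
from gradio import ChatMessage

with gr.Blocks() as demo:
    chatbot = gr.Chatbot(
        type="messages",
        value=[
            # A message with metadata["title"] renders as a collapsible "thought" section
            ChatMessage(role="assistant",
                        content="The user is asking about the weather, so I should look it up.",
                        metadata={"title": "🧠 Thinking"}),
            # A plain assistant message renders as a normal chat bubble
            ChatMessage(role="assistant", content="It is sunny outside today."),
        ],
    )

demo.launch()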
8.6.1 Building with an LLM that Displays Its Thinking
With metadata, Gradio's Chatbot component can natively display the intermediate thoughts of a reasoning LLM, which makes it ideal for user interfaces that show how an AI model "thinks" while generating a response. Below we build a chatbot that displays Gemini AI's thought process in real time.
Let's create a complete chatbot that streams both its thoughts and its response in real time. We will use Google's Gemini API to access the Gemini 2.0 Flash Thinking model and build the user interface with Gradio. Start by importing the libraries and configuring the Gemini client with an API key:
import os
import gradio as gr
from gradio import ChatMessage
from typing import Iterator
import google.generativeai as genai
# get Gemini API Key from the environment variable
GEMINI_API_KEY = os.getenv("GEMINI_API_KEY")
genai.configure(api_key=GEMINI_API_KEY)
# we will be using the Gemini 2.0 Flash model with Thinking capabilities
model = genai.GenerativeModel("gemini-2.0-flash-thinking-exp-1219")
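Before wiring up the UI, you can optionally sanity-check the client with a single non-streaming call (this snippet is an optional check, not part of the final app; the prompt is arbitrary):
# Optional sanity check: one non-streaming request
reply = model.generate_content("In one sentence, why is the sky blue?")
print(reply.text)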
Next, define the functions that format the chat history and stream the model's output:
def format_chat_history(messages: list) -> list:
    """
    Formats the chat history into a structure Gemini can understand
    """
    formatted_history = []
    for message in messages:
        # Skip thinking messages (messages with metadata)
        if not (message.get("role") == "assistant" and "metadata" in message):
            formatted_history.append({
                "role": "user" if message.get("role") == "user" else "assistant",
                "parts": [message.get("content", "")]
            })
    return formatted_history

def stream_gemini_response(user_message: str, messages: list) -> Iterator[list]:
    """
    Streams thoughts and response with conversation history support.
    """
    try:
        print(f"\n=== New Request ===")
        print(f"User message: {user_message}")
        # Format chat history for Gemini
        chat_history = format_chat_history(messages)
        # Initialize Gemini chat
        chat = model.start_chat(history=chat_history)
        response = chat.send_message(user_message, stream=True)
        # Initialize buffers and flags
        thought_buffer = ""
        response_buffer = ""
        thinking_complete = False
        # Add initial thinking message
        messages.append(ChatMessage(
            role="assistant", content="",
            metadata={"title": "⚙️ Thinking: *The thoughts produced by the model are experimental"}))
        for chunk in response:
            parts = chunk.candidates[0].content.parts
            current_chunk = parts[0].text
            if len(parts) == 2 and not thinking_complete:
                # Complete thought and start response
                thought_buffer += current_chunk
                print(f"\n=== Complete Thought ===\n{thought_buffer}")
                messages[-1] = ChatMessage(
                    role="assistant", content=thought_buffer,
                    metadata={"title": "⚙️ Thinking: *The thoughts produced by the model are experimental"})
                yield messages
                # Start response
                response_buffer = parts[1].text
                print(f"\n=== Starting Response ===\n{response_buffer}")
                messages.append(ChatMessage(role="assistant", content=response_buffer))
                thinking_complete = True
            elif thinking_complete:
                # Stream response
                response_buffer += current_chunk
                print(f"\n=== Response Chunk ===\n{current_chunk}")
                messages[-1] = ChatMessage(role="assistant", content=response_buffer)
            else:
                # Stream thinking
                thought_buffer += current_chunk
                print(f"\n=== Thinking Chunk ===\n{current_chunk}")
                messages[-1] = ChatMessage(
                    role="assistant", content=thought_buffer,
                    metadata={"title": "⚙️ Thinking: *The thoughts produced by the model are experimental"})
            yield messages
        print(f"\n=== Final Response ===\n{response_buffer}")
    except Exception as e:
        print(f"\n=== Error ===\n{str(e)}")
        messages.append(ChatMessage(
            role="assistant",
            content=f"I apologize, but I encountered an error: {str(e)}"))
        yield messages

def user_message(msg: str, history: list) -> tuple[str, list]:
    """Adds user message to chat history"""
    history.append(ChatMessage(role="user", content=msg))
    return "", history
Finally, create the Gradio interface:
# Create the Gradio interface
with gr.Blocks(theme=gr.themes.Citrus(), fill_height=True) as demo:
    gr.Markdown("# Chat with Gemini 2.0 Flash and See its Thoughts 💭")
    chatbot = gr.Chatbot(
        type="messages", label="Gemini2.0 'Thinking' Chatbot",
        render_markdown=True, scale=1,
        avatar_images=(None, "https://lh3.googleusercontent.com/oxz0sUBF0iYoN4VvhqWTmux-cxfD1rxuYkuFEfm1SFaseXEsjjE4Je_C_V3UQPuJ87sImQK3HfQ3RXiaRnQetjaZbjJJUkiPL5jFJ1WRl5FKJZYibUA=w214-h214-n-nu")
    )
    with gr.Row(equal_height=True):
        input_box = gr.Textbox(lines=1, label="Chat Message",
                               placeholder="Type your message here...", scale=4)
        clear_button = gr.Button("Clear Chat", scale=1)
    # Set up event handlers
    msg_store = gr.State("")  # Store for preserving user message
    input_box.submit(
        lambda msg: (msg, msg, ""),  # Store message and clear input
        inputs=[input_box], outputs=[msg_store, input_box, input_box], queue=False
    ).then(
        user_message,  # Add user message to chat
        inputs=[msg_store, chatbot], outputs=[input_box, chatbot], queue=False
    ).then(
        stream_gemini_response,  # Generate and stream response
        inputs=[msg_store, chatbot], outputs=chatbot
    )
    clear_button.click(lambda: ([], "", ""),
                       outputs=[chatbot, input_box, msg_store], queue=False)

# Launch the interface
if __name__ == "__main__":
    demo.launch(debug=True)
This creates a chatbot that:
- displays the model's thought process in a collapsible section;
- streams the thoughts and the final response in real time;
- maintains a clean chat history.
A screenshot of the running demo:
We now have a chatbot that not only responds to users in real time but also shows its thought process, creating a more transparent and engaging interaction. See the complete Gemini 2.0 Flash Thinking demo at ysharma/Gemini2-Flash-Thinking.
8.6.2 Building with Citations
Gradio chatbots can display citations from an LLM's response, which makes them ideal for interfaces that surface source documents and references. This subsection shows how to build a chatbot with Anthropic's Claude API (with citations enabled) and a Gradio user interface; it displays both the response and the citations that support it.
First, import the libraries and define the default document content (the Anthropic client itself is initialized later from the API key entered in the UI):
import gradio as gr
import anthropic
import base64
from typing import List, Dict, Any
# Default document content
DEFAULT_DOC = "The grass is pink and soil is green. The sky is red while the sun looks blue."
Then, define the message-formatting functions that handle documents:
def read_pdf_as_base64(file_path: str) -> str:
    """Read a PDF file and return its base64 encoded content."""
    with open(file_path, 'rb') as file:
        return base64.b64encode(file.read()).decode('utf-8')

def user_message(user_input: str, history: list, enable_citations: bool,
                 doc_type: str, text_content: str, pdf_file: str, api_key: str) -> tuple:
    # Logging
    print("\n----------- User Message -------------")
    print(f"User Input: {user_input}")
    print(f"Citations Enabled: {enable_citations}")
    print(f"Document Type: {doc_type}")
    history.append({"role": "user", "content": user_input})
    return "", history

def format_message_history(history: list, enable_citations: bool,
                           doc_type: str, text_content: str, pdf_files: str) -> List[Dict]:
    """Convert Gradio chat history to Anthropic message format."""
    formatted_messages = []
    # Add previous messages
    for msg in history[:-1]:
        if msg["role"] == "user":
            formatted_messages.append({"role": "user", "content": msg["content"]})
        elif msg["role"] == "assistant":
            if "metadata" not in msg or msg["metadata"] is None:
                formatted_messages.append({"role": "assistant",
                                           "content": msg["content"]})
    # Prepare the latest message
    latest_message = {"role": "user", "content": []}
    # Add documents if citations are enabled
    if enable_citations:
        # Handle plain text input
        if doc_type in ["plain_text", "combined"] and text_content.strip():
            latest_message["content"].append({
                "type": "document",
                "source": {"type": "text", "media_type": "text/plain",
                           "data": text_content.strip()},
                "title": "User Text Document",
                "citations": {"enabled": True}
            })
        # Handle PDF input
        if doc_type in ["pdf", "combined"] and pdf_files:
            # Handle pdf_files as a list
            if isinstance(pdf_files, str):
                pdf_files = [pdf_files]  # Convert single path to list
            # Add each PDF as a separate document
            for i, pdf_file in enumerate(pdf_files):
                try:
                    pdf_base64 = read_pdf_as_base64(pdf_file)
                    latest_message["content"].append({
                        "type": "document",
                        "source": {"type": "base64", "media_type": "application/pdf",
                                   "data": pdf_base64},
                        "title": f"User PDF Document {i+1}",
                        "citations": {"enabled": True}
                    })
                except Exception as e:
                    print(f"Error processing PDF {i+1}: {str(e)}")
                    continue
        # If no documents were added and citations are enabled, use default document
        if not latest_message["content"]:
            latest_message["content"].append({
                "type": "document",
                "source": {"type": "text", "media_type": "text/plain",
                           "data": DEFAULT_DOC},
                "title": "Sample Document",
                "citations": {"enabled": True}
            })
    # Add the user's question
    latest_message["content"].append({"type": "text", "text": history[-1]["content"]})
    formatted_messages.append(latest_message)
    return formatted_messages
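For reference, with citations enabled and a plain-text document, the final message handed to Claude has roughly this shape (the values below are placeholders, not output from a real run):
# Illustrative structure only; this is what format_message_history builds
latest_message = {
    "role": "user",
    "content": [
        {"type": "document",
         "source": {"type": "text", "media_type": "text/plain",
                    "data": "The grass is pink and soil is green. ..."},
         "title": "User Text Document",
         "citations": {"enabled": True}},
        {"type": "text", "text": "What color is the grass?"},
    ],
}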
Next, create the bot response handler that processes citations:
def bot_response(history: list, enable_citations: bool, doc_type: str,
                 text_content: str, pdf_file: str, api_key: str) -> List[Dict[str, Any]]:
    try:
        if not api_key:
            history.append({"role": "assistant",
                            "content": "Please provide your Anthropic API key to continue."})
            return history
        # Initialize client with provided API key
        client = anthropic.Anthropic(api_key=api_key)
        messages = format_message_history(history, enable_citations, doc_type,
                                          text_content, pdf_file)
        response = client.messages.create(model="claude-3-5-sonnet-20241022",
                                          max_tokens=1024, messages=messages)
        # Initialize main response and citations
        main_response = ""
        citations = []
        # Process each content block
        for block in response.content:
            if block.type == "text":
                main_response += block.text
                if enable_citations and hasattr(block, 'citations') and block.citations:
                    for citation in block.citations:
                        if citation.cited_text not in citations:
                            citations.append(citation.cited_text)
        # Add main response
        history.append({"role": "assistant", "content": main_response})
        # Add citations if any were found and citations are enabled
        if enable_citations and citations:
            history.append({"role": "assistant",
                            "content": "\n".join([f"• {cite}" for cite in citations]),
                            "metadata": {"title": "📚 Citations"}})
        return history
    except Exception as e:
        print(f"Error in bot_response: {str(e)}")
        error_message = str(e)
        if "401" in error_message:
            error_message = "Invalid API key. Please check your Anthropic API key and try again."
        history.append({"role": "assistant",
                        "content": f"I apologize, but I encountered an error: {error_message}"})
        return history
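The interface below wires two visibility events to a helper called update_document_inputs, which is omitted from the trimmed source. A plausible minimal sketch (the exact logic is an assumption; it only toggles which document controls are visible based on the checkbox and radio values):
# Hypothetical reconstruction of the omitted helper
def update_document_inputs(enable_citations: bool, doc_type: str):
    text_visible = enable_citations and doc_type in ["plain_text", "combined"]
    pdf_visible = enable_citations and doc_type in ["pdf", "combined"]
    return (gr.Radio(visible=enable_citations),  # doc_type_radio
            gr.Textbox(visible=text_visible),    # text_input
            gr.File(visible=pdf_visible))        # pdf_input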
Finally, create the Gradio interface:
with gr.Blocks(theme="ocean", fill_height=True) as demo:
    gr.Markdown("# Chat with Anthropic Claude's Citations")
    with gr.Row(scale=1):
        with gr.Column(scale=4):
            chatbot = gr.Chatbot(type="messages", bubble_full_width=False,
                                 show_label=False, scale=1)
            msg = gr.Textbox(placeholder="Enter your message here...",
                             show_label=False, container=False)
        with gr.Column(scale=1):
            api_key = gr.Textbox(type="password", label="Anthropic API Key",
                                 placeholder="Enter your API key",
                                 info="Your API key will not be stored",
                                 interactive=True)
            enable_citations = gr.Checkbox(label="Enable Citations", value=True,
                                           info="Toggle citation functionality")
            doc_type_radio = gr.Radio(choices=["plain_text", "pdf", "combined"],
                                      value="plain_text", label="Document Type",
                                      info="Choose the type of document(s) to reference")
            text_input = gr.Textbox(
                label="Document Content",
                placeholder=f"Enter your document text here.\nDefault text will be picked if citations are enabled and you don't provide the documents. Default document is --{DEFAULT_DOC}",
                lines=10, info="Enter the text you want to reference")
            pdf_input = gr.File(label="Upload PDF", file_count="multiple",
                                file_types=[".pdf"], type="filepath", visible=False)
            clear = gr.ClearButton([msg, chatbot, text_input, pdf_input])
    # Update input visibility based on settings
    enable_citations.change(update_document_inputs,
                            inputs=[enable_citations, doc_type_radio],
                            outputs=[doc_type_radio, text_input, pdf_input])
    doc_type_radio.change(update_document_inputs,
                          inputs=[enable_citations, doc_type_radio],
                          outputs=[doc_type_radio, text_input, pdf_input])
    # Handle message submission
    msg.submit(
        user_message,
        [msg, chatbot, enable_citations, doc_type_radio, text_input, pdf_input, api_key],
        [msg, chatbot], queue=False
    ).then(
        bot_response,
        [chatbot, enable_citations, doc_type_radio, text_input, pdf_input, api_key],
        chatbot
    )

if __name__ == "__main__":
    demo.launch(debug=True)
Note: the full source is long and has been trimmed here to the main logic (the omitted visibility helper is sketched above). This creates a chatbot that:
- supports plain-text and PDF documents for Claude's citations;
- uses the metadata feature to display citations in collapsible sections;
- shows source quotations drawn directly from the given documents.
A screenshot of the running demo:
The citations feature works especially well with the Gradio chatbot's metadata feature: the collapsible sections keep the chat interface clean while keeping the source documents within easy reach. We now have a chatbot that not only responds to users but also shows its sources, creating a more transparent and trustworthy interaction. For the complete citations code, see ysharma/anthropic-citations-with-gradio-metadata-key.