AI Image Generation and Editing with Google Imagen

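Image generation and editing have become an important area of AI. Google Imagen is a generative model that turns text prompts into high-quality visual assets. Combined with LangChain, it lets developers implement the following features: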

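Text-to-image generation: create a new image from a text prompt alone.
Image editing: edit an uploaded or generated image with a text prompt.
Image captioning: get a text description of an image.
Visual question answering: ask questions about an image and get answers.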

Core Principles

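Google Imagen is a deep learning model that interprets a text prompt and generates a matching image. Developers can use it to turn a user's ideas into visual assets quickly.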

Code Walkthrough

Text-to-Image Generation

Generating an image is straightforward: pass a text prompt to the VertexAIImageGeneratorChat model. The complete example follows:

import base64
import io

from langchain_core.messages import HumanMessage
from langchain_google_vertexai.vision_models import VertexAIImageGeneratorChat
from PIL import Image

# Create the image generator
generator = VertexAIImageGeneratorChat()

# Generate an image from a text prompt
messages = [HumanMessage(content=["a cat at the beach"])]
response = generator.invoke(messages)

# Take the first content part of the response and extract its base64 image string
generated_image = response.content[0]
img_base64 = generated_image["image_url"]["url"].split(",")[-1]

# Decode the base64 string into a PIL image
img = Image.open(io.BytesIO(base64.decodebytes(bytes(img_base64, "utf-8"))))

# Display the image
img.show()
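
The decoding step above is needed for every response, so it can be convenient to pull it into a small helper. The following is a minimal sketch using only the standard library. The function names data_url_to_bytes and save_image_part are hypothetical, and the sketch assumes the content part has the same ["image_url"]["url"] shape that the example above reads from.

```python
import base64
from pathlib import Path


def data_url_to_bytes(url: str) -> bytes:
    """Decode the base64 payload of a data URL (or of a bare base64 string)."""
    # Everything after the last comma is treated as the payload,
    # mirroring the split(",")[-1] step in the example above.
    payload = url.split(",")[-1]
    return base64.b64decode(payload)


def save_image_part(part: dict, path: str) -> Path:
    """Write the image carried by one response content part to disk."""
    image_bytes = data_url_to_bytes(part["image_url"]["url"])
    out = Path(path)
    out.write_bytes(image_bytes)
    return out
```

With the response from the example above, a call such as save_image_part(response.content[0], "cat.png") would write the image to a file instead of opening a viewer. The bytes returned by data_url_to_bytes can also be passed to Image.open(io.BytesIO(...)) as before.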

Image Editing

Editing the generated image is just as simple. The code below passes the image and a new text prompt to the editor model:

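import base64
import io

from langchain_core.messages import HumanMessage
from langchain_google_vertexai.vision_models import VertexAIImageEditorChat
from PIL import Image

# Create the image editor
editor = VertexAIImageEditorChat()

# Edit the generated image; generated_image is the content part
# returned by the text-to-image example above
messages = [HumanMessage(content=[generated_image, "a dog at the beach"])]
editor_response = editor.invoke(messages)

# Extract the base64 string of the edited image
edited_img_base64 = editor_response.content[0]["image_url"]["url"].split(",")[-1]

# Decode the base64 string into a PIL image
edited_img = Image.open(io.BytesIO(base64.decodebytes(bytes(edited_img_base64, "utf-8"))))

# Display the edited image
edited_img.show()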
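
The feature list mentions editing uploaded images as well as generated ones. One way to prepare a local file is to wrap it in the same kind of data-URL content part that the generator returns. This is only a sketch under that assumption: the helper name image_file_to_part is hypothetical, the "type" key follows the common LangChain message-content convention rather than anything shown in this article, and the source does not confirm which image formats the editor accepts.

```python
import base64
import mimetypes
from pathlib import Path


def image_file_to_part(path: str) -> dict:
    """Wrap a local image file as an image_url content part (assumed shape)."""
    # Guess the MIME type from the file extension; falling back to PNG
    # is an assumption made for this sketch.
    mime, _ = mimetypes.guess_type(path)
    if mime is None:
        mime = "image/png"
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime};base64,{encoded}"},
    }
```

The resulting dictionary could then take the place of generated_image in the message, for example HumanMessage(content=[image_file_to_part("photo.png"), "a dog at the beach"]).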