AI Learning: Image Generation chat3.5

Z !

Last modified 2023-12-05 16:40:16

First published 2023-12-05 16:11:13

Article link: https://blog.csdn.net/weixin_42954448/article/details/134808701

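Introduction

The Images API provides three ways to work with images:

Create an image from scratch from a text prompt (DALL·E 3 and DALL·E 2).
Create an edited version of an existing image by having the model replace some regions of it according to a new text prompt (DALL·E 2 only).
Create variations of an existing image (DALL·E 2 only).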

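Image generation

The image generation endpoint creates an original image from a text prompt. With DALL·E 3, the image size can be 1024x1024, 1024x1792, or 1792x1024 pixels.
By default, images are generated at standard quality. With DALL·E 3 you can set quality: "hd" for enhanced detail. Square, standard-quality images are the fastest to generate.
With DALL·E 3 you can request 1 image at a time (more by issuing parallel requests). With DALL·E 2 you can request up to 10 images at a time through the n parameter.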

from openai import OpenAI

client = OpenAI(
    api_key="sk-##########################",  # placeholder; substitute your own key
)

# Usage
response = client.images.generate(
    model="dall-e-3",
    prompt="a white siamese cat",
    size="1024x1024",
    quality="standard",
    n=1,
)

image_url = response.data[0].url
print(response)
ImagesResponse(created=1701761542, data=[Image(b64_json=None, revised_prompt="A well groomed Siamese cat with a dominantly white coat is standing elegantly. Its blue almond-shaped eyes are piercing and sharp, while the distinctive color points on its ears, face, paws, and tail possess a contrasting darker hue. It has a sleek, short, and finely textured coat, a muscular, slim body and the cat exudes a royal touch in its overall demeanor, reflecting the breed's noble history.", url='https://oaidalleapiprodscus.blob.core.windows.net/private/org-Nyt7Lgiuy9Eeae2x3OXf2x7y/user-465jKD0bH6jMGgQxlfgS7JoH/img-xoMl87aBUv3hDUkpI4w4Vtkn.png?st=2023-12-05T06%3A32%3A22Z&se=2023-12-05T08%3A32%3A22Z&sp=r&sv=2021-08-06&sr=b&rscd=inline&rsct=image/png&skoid=6aaadede-4fb3-4698-a8f6-684d7786b067&sktid=a48cca56-e6da-484e-a814-9c849652bcb3&skt=2023-12-04T12%3A48%3A54Z&ske=2023-12-05T12%3A48%3A54Z&sks=b&skv=2021-08-06&sig=vD0H61U46QuP2oH60N%2B%2B74tQbsTjemASF4YrUNNa9Xg%3D')])
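
Since DALL·E 3 returns one image per request, getting several images means sending several requests in parallel, as noted above. The following is a minimal sketch of one way to do that with a thread pool. The helper names `generate_many` and `dalle3_url` are hypothetical and not part of the OpenAI SDK, and `dalle3_url` assumes the `OPENAI_API_KEY` environment variable is set.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def generate_many(generate_one: Callable[[str], str], prompt: str, count: int) -> list[str]:
    """Call generate_one(prompt) `count` times in parallel and collect the results.

    `count` must be at least 1. Results come back in submission order.
    """
    with ThreadPoolExecutor(max_workers=count) as pool:
        futures = [pool.submit(generate_one, prompt) for _ in range(count)]
        return [future.result() for future in futures]


def dalle3_url(prompt: str) -> str:
    """Send one DALL·E 3 request (n=1) and return the image URL."""
    # Imported here so generate_many can be used without the SDK installed.
    from openai import OpenAI

    client = OpenAI()  # assumption: the key is read from OPENAI_API_KEY
    response = client.images.generate(
        model="dall-e-3", prompt=prompt, size="1024x1024", n=1
    )
    return response.data[0].url
```

A call such as `generate_many(dalle3_url, "a white siamese cat", 3)` would then issue three requests. Each request is billed and rate-limited separately, so a small `count` is probably the sensible starting point.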

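Request parameters

Parameter	Type	Required	Description
`prompt`	string	Required	A text description of the desired image. The maximum length is 1000 characters for dall-e-2 and 4000 characters for dall-e-3.
`model`	string	Optional	The model to use for image generation. Defaults to dall-e-2.
`n`	integer or null	Optional	The number of images to generate. Must be between 1 and 10. For dall-e-3, only n=1 is supported. Defaults to 1.
`quality`	string	Optional	The quality of the generated image. hd creates images with finer detail and greater overall consistency. Supported only for dall-e-3. Defaults to standard.
`response_format`	string or null	Optional	The format in which the generated images are returned. Must be one of url or b64_json. Defaults to url.
`size`	string or null	Optional	The size of the generated images. For dall-e-2, must be one of 256x256, 512x512, or 1024x1024. For dall-e-3, must be one of 1024x1024, 1792x1024, or 1024x1792. Defaults to 1024x1024.
`style`	string or null	Optional	The style of the generated images. Must be one of vivid or natural. vivid makes the model lean toward hyper-real and dramatic images. natural makes the model produce more natural, less hyper-real images. Supported only for dall-e-3. Defaults to vivid.
`user`	string	Optional	A unique identifier representing your end user, which helps OpenAI monitor and detect abuse.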
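
The per-model limits in the table above are easy to mix up, so it can help to check arguments locally before sending a request. Below is a minimal sketch of such a check. `LIMITS` and `check_generate_params` are hypothetical names, and the values are simply transcribed from the table rather than fetched from the API.

```python
# Limits transcribed from the request-parameter table above.
LIMITS = {
    "dall-e-2": {
        "max_prompt": 1000,
        "max_n": 10,
        "sizes": {"256x256", "512x512", "1024x1024"},
    },
    "dall-e-3": {
        "max_prompt": 4000,
        "max_n": 1,
        "sizes": {"1024x1024", "1792x1024", "1024x1792"},
    },
}


def check_generate_params(prompt, model="dall-e-2", n=1, size="1024x1024",
                          quality=None, style=None):
    """Return a list of problems found; an empty list means the arguments look valid."""
    if model not in LIMITS:
        return [f"unknown model: {model}"]
    limits = LIMITS[model]
    problems = []
    if len(prompt) > limits["max_prompt"]:
        problems.append(f"prompt exceeds {limits['max_prompt']} characters")
    if not 1 <= n <= limits["max_n"]:
        problems.append(f"n must be between 1 and {limits['max_n']} for {model}")
    if size not in limits["sizes"]:
        problems.append(f"size {size} is not supported by {model}")
    if model != "dall-e-3" and (quality is not None or style is not None):
        problems.append("quality and style are only supported by dall-e-3")
    if quality not in (None, "standard", "hd"):
        problems.append("quality must be standard or hd")
    if style not in (None, "vivid", "natural"):
        problems.append("style must be vivid or natural")
    return problems
```

For example, `check_generate_params("a white siamese cat", model="dall-e-3", n=2)` reports one problem, because dall-e-3 accepts only n=1.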

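Response fields

Field	Type	Description
`b64_json`	string	The Base64-encoded JSON of the generated image, present if response_format is b64_json.
`url`	string	The URL of the generated image, present if response_format is url (the default).
`revised_prompt`	string	The prompt that was actually used to generate the image, present if the original prompt was revised.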
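
When `response_format` is b64_json, the image arrives as Base64 text rather than a URL, so it has to be decoded before it can be saved. A minimal sketch of that step follows. `save_b64_image` is a hypothetical helper name, and only the Python standard library is used.

```python
import base64
from pathlib import Path


def save_b64_image(b64_json: str, path: str) -> int:
    """Decode a Base64 image payload, write it to `path`, and return the byte count."""
    data = base64.b64decode(b64_json)
    Path(path).write_bytes(data)
    return len(data)
```

With a request made using `response_format="b64_json"`, the call would look like `save_b64_image(response.data[0].b64_json, "cat.png")`.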

Image variations

Check the size and format of the image first: it must be a PNG and smaller than 4 MB.

import os
from PIL import Image


def get_image_size_in_mb(image_path):
    """Return the file size in MB, or None if it cannot be read."""
    try:
        size_in_bytes = os.path.getsize(image_path)
        return size_in_bytes / (1024 * 1024)
    except OSError as e:
        print(f"Could not get image size: {e}")
        return None


def is_png(image_path):
    """Return True if Pillow identifies the file as a PNG."""
    try:
        with Image.open(image_path) as img:
            return img.format == "PNG"
    except Exception as e:
        print(f"Could not open image {image_path}: {e}")
        return False


image_path = "image.png"
image_size_in_mb = get_image_size_in_mb(image_path)

if image_size_in_mb is not None:
    print(f"Image size: {image_size_in_mb:.2f} MB")

if is_png(image_path):
    print("The image is a PNG")
else:
    print("The image is not a PNG")

# Create one 512x512 variation; `client` is the OpenAI client created earlier.
with open(image_path, "rb") as image_file:
    response = client.images.create_variation(
        image=image_file,
        n=1,
        size="512x512",
    )
image_url = response.data[0].url
print(response)

ImagesResponse(created=1701763066, data=[Image(b64_json=None, revised_prompt=None, url='https://oaidalleapiprodscus.blob.core.windows.net/private/org-Nyt7Lgiuy9Eeae2x3OXf2x7y/user-465jKD0bH6jMGgQxlfgS7JoH/img-JYnrvgWjz2s97ucJjdyYjhky.png?st=2023-12-05T06%3A57%3A46Z&se=2023-12-05T08%3A57%3A46Z&sp=r&sv=2021-08-06&sr=b&rscd=inline&rsct=image/png&skoid=6aaadede-4fb3-4698-a8f6-684d7786b067&sktid=a48cca56-e6da-484e-a814-9c849652bcb3&skt=2023-12-04T23%3A07%3A58Z&ske=2023-12-05T23%3A07%3A58Z&sks=b&skv=2021-08-06&sig=9oXioWW4VOliOHnk8wN7NDF7Cu9PiRKGcVTJHvIxiyI%3D')])

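Request parameters

Parameter	Type	Required	Default	Description
`image`	file	Required		The image to use as the basis for the variations. Must be a valid PNG file, smaller than 4 MB, and square.
`model`	string	Optional	dall-e-2	The model to use for image generation. Only dall-e-2 is currently supported.
`n`	integer or null	Optional	1	The number of images to generate. Must be between 1 and 10. For dall-e-3, only n=1 is supported.
`response_format`	string or null	Optional	url	The format in which the generated images are returned. Must be one of url or b64_json.
`size`	string or null	Optional	1024x1024	The size of the generated images. Must be one of 256x256, 512x512, or 1024x1024.
`user`	string	Optional		A unique identifier representing your end user, which helps OpenAI monitor and detect abuse.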
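
The `image` row above adds a third requirement, that the image be square, on top of the format and size limits. The sketch below checks all three using only the standard library, by reading the width and height from the PNG header. `check_upload` is a hypothetical helper. It inspects only the first 24 bytes, so it does not prove the rest of the file is a well-formed PNG.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MAX_BYTES = 4 * 1024 * 1024  # the 4 MB limit from the table above


def check_upload(data: bytes) -> list[str]:
    """Return a list of problems with a candidate upload; empty means it passes."""
    problems = []
    if len(data) >= MAX_BYTES:
        problems.append("file is not smaller than 4 MB")
    # A PNG starts with an 8-byte signature followed by the IHDR chunk:
    # 4-byte length, 4-byte type, then big-endian width and height.
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        problems.append("file is not a PNG")
        return problems
    width, height = struct.unpack(">II", data[16:24])
    if width != height:
        problems.append(f"image is {width}x{height}, not square")
    return problems
```

It could be called as `check_upload(open("image.png", "rb").read())` before the file is passed to `create_variation`.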

Image editing

# `client` is the OpenAI client created earlier.
response = client.images.edit(
    image=open("otter.png", "rb"),
    mask=open("mask.png", "rb"),
    prompt="A cute baby sea otter wearing a beret",
    n=2,
    size="1024x1024",
)
print(response.data[0].url)

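Request parameters

Parameter	Type	Required	Default	Description
`image`	file	Required		The image to edit. Must be a valid PNG file, smaller than 4 MB, and square. If `mask` is not provided, the image must have transparency, which is then used as the mask.
`prompt`	string	Required		A text description of the desired image. The maximum length is 1000 characters.
`mask`	file	Optional		A second image whose fully transparent areas (for example, where alpha is zero) indicate where the image should be edited. Must be a valid PNG file, smaller than 4 MB, and have the same dimensions as the image.
`model`	string	Optional	dall-e-2	The model to use for image generation. Only dall-e-2 is currently supported.
`n`	integer or null	Optional	1	The number of images to generate. Must be between 1 and 10.
`response_format`	string or null	Optional	url	The format in which the generated images are returned. Must be one of url or b64_json.
`size`	string or null	Optional	1024x1024	The size of the generated images. Must be one of 256x256, 512x512, or 1024x1024.
`user`	string	Optional		A unique identifier representing your end user, which helps OpenAI monitor and detect abuse.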
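
The `mask` row above describes a PNG whose fully transparent pixels mark the region to edit. As an illustration of what such a file contains, here is a minimal sketch that builds a square RGBA mask with one transparent rectangle using only the standard library. `build_mask_png` is a hypothetical helper. In practice an image library such as Pillow would probably be the more convenient tool.

```python
import struct
import zlib


def _chunk(kind: bytes, data: bytes) -> bytes:
    """Serialize one PNG chunk: length, type, data, then CRC over type and data."""
    crc = zlib.crc32(kind + data)
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def build_mask_png(size: int, box: tuple[int, int, int, int]) -> bytes:
    """Return a size x size RGBA PNG, opaque black except for a transparent box.

    box is (left, top, right, bottom) in pixels, with right and bottom exclusive.
    Assumes 0 <= left <= right <= size and 0 <= top <= bottom <= size.
    """
    left, top, right, bottom = box
    opaque = b"\x00\x00\x00\xff"  # alpha 255: keep this pixel
    clear = b"\x00\x00\x00\x00"   # alpha 0: let the model repaint this pixel
    # Every scanline starts with filter byte 0 (no filtering).
    plain_row = b"\x00" + opaque * size
    edit_row = b"\x00" + opaque * left + clear * (right - left) + opaque * (size - right)
    rows = [edit_row if top <= y < bottom else plain_row for y in range(size)]
    # IHDR: width, height, bit depth 8, color type 6 (RGBA), default methods.
    ihdr = struct.pack(">IIBBBBB", size, size, 8, 6, 0, 0, 0)
    return (
        b"\x89PNG\r\n\x1a\n"
        + _chunk(b"IHDR", ihdr)
        + _chunk(b"IDAT", zlib.compress(b"".join(rows)))
        + _chunk(b"IEND", b"")
    )
```

For a 1024x1024 source image, `open("mask.png", "wb").write(build_mask_png(1024, (256, 256, 768, 768)))` would write a mask that exposes the central quarter for editing. The mask must match the dimensions of the image it accompanies.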