Edit Anything

丶张豪哥

Tags: deep learning, machine learning, computer vision

Original link: https://github.com/sail-sg/EditAnything

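Edit Anything is a project that lets users edit and generate image content with text guidance. Built on ControlNet, BLIP2, Stable Diffusion, and related models, it supports object-level and part-level editing. Users can supply a text prompt and a segmentation mask to change specific parts of an image, enabling fine-grained image editing and creation.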

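This is an ongoing project that aims to edit and generate anything in an image, powered by Segment Anything, ControlNet, BLIP2, Stable Diffusion, and others.

A fun project. Contributions and suggestions of any kind are very welcome!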

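News

2023/04/12 - An initial version of text-guided editing is available in sam2groundingdino_edit.py (object-level) and sam2vlpart_edit.py (part-level).

2023/04/10 - An initial version of edit-anything is available in sam2edit.py.

2023/04/10 - We converted the pretrained model to diffusers style; sam2image_diffuser.py loads the pretrained model automatically. You can now easily combine our pretrained model with different base models!

2023/04/09 - We released a pretrained ControlNet model, based on Stable Diffusion, that generates images conditioned on SAM segmentation.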

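Features

Highlights:

A pretrained ControlNet conditioned on SAM masks generates images with fine-grained control.
Category-agnostic SAM masks enable more forms of editing and generation.
BLIP2 text generation enables control without any text guidance from the user.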
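To make the last highlight concrete, here is a minimal sketch of how an automatically generated caption could stand in for, or be combined with, a human prompt. The function name, its parameters, and the comma-joining rule are assumptions for illustration; the project's scripts may build the prompt differently.

```python
def build_text_instruction(human_prompt: str, blip2_caption: str) -> str:
    """Combine an optional human prompt with an auto-generated caption.

    Both parameter names are hypothetical. The caption is assumed to come
    from a BLIP2 captioning step that has already run.
    """
    parts = [p.strip() for p in (human_prompt, blip2_caption) if p and p.strip()]
    # Drop exact duplicates so "bench" + "bench" does not become "bench, bench".
    unique_parts = list(dict.fromkeys(parts))
    return ", ".join(unique_parts)


# No human prompt: the caption alone becomes the instruction.
print(build_text_instruction("", "a large white and red ferry"))
# Human prompt plus caption: both contribute to the instruction.
print(build_text_instruction("sunny day", "a large white and red ferry"))
```

With an empty human prompt the caption is used unchanged, which is what "control without text guidance" amounts to in this sketch.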

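Edit Specific Things by Text-Grounding and Segment-Anything

Editing by text-guided part mask

Text grounding: "dog head"

Human prompt: "cute dog"

Text grounding: "cat eye"

Human prompt: "a cute small humanoid cat"

Editing by text-guided object mask

Text grounding: "bench"

Human prompt: "bench"

Edit Anything by Segment-Anything

Human prompt: "splendid sunset sky, red brick wall"

Human prompt: "chairs by the lake, sunny day, spring"

An initial version of edit-anything. (We will add more controls on masks very soon.)

Generate Anything by Segment-Anything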

BLIP2 Prompt: "a large white and red ferry"

(1: input image; 2: segmentation mask; 3-8: generated images.)

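BLIP2 Prompt: "a cloudy sky"

BLIP2 Prompt: "a black drone flying in the blue sky"

The human prompt and the BLIP2-generated prompt together form the text instruction.
The SAM model segments the input image and produces segmentation masks without category labels.
The segmentation masks and the text instruction guide the image generation.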
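The second step above yields a set of separate masks, while a ControlNet condition is a single image. The sketch below shows one plausible way to merge category-free masks into one label map. It assumes the input uses the dictionary layout returned by `SamAutomaticMaskGenerator.generate()` from the segment-anything package (a boolean `segmentation` grid plus an `area`); the function name and the largest-first painting order are illustrative assumptions, not necessarily what the project's scripts do. Plain Python loops are used for readability; real code would use NumPy.

```python
def masks_to_label_map(masks, height, width):
    """Merge category-free masks into one integer label map.

    `masks` is a list of dicts, each with a boolean 2D "segmentation" grid
    and an integer "area". Label 0 means "not covered by any mask".
    """
    label_map = [[0] * width for _ in range(height)]
    # Paint large masks first so smaller masks drawn later stay visible on top.
    ordered = sorted(masks, key=lambda m: m["area"], reverse=True)
    for label, mask in enumerate(ordered, start=1):
        seg = mask["segmentation"]
        for y in range(height):
            for x in range(width):
                if seg[y][x]:
                    label_map[y][x] = label
    return label_map


# Toy 2x4 example with two overlapping masks.
big = {"segmentation": [[True, True, True, False],
                        [True, True, True, False]], "area": 6}
small = {"segmentation": [[False, False, True, True],
                          [False, False, False, False]], "area": 2}
for row in masks_to_label_map([small, big], height=2, width=4):
    print(row)
```

Each label is only an index, not a category, so the map tells the generator where region boundaries are without saying what the regions contain. The text instruction supplies that part.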

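Note: Because faces are blurred in the SAM dataset for privacy protection, faces in the generated images are blurred as well. We are training new models on unblurred images to solve this problem.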

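Ongoing

Conditional generation trained with 85k samples from the SAM dataset.
Training with more images from LAION and SAM.
Interactive control over different masks for image editing.
Using Grounding DINO for category-related automatic editing.
ChatGPT-guided image editing.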

Setup

Create the environment

    conda env create -f environment.yaml
    conda activate control

Install BLIP2 and SAM

Put these models in the models folder.

pip install git+https://github.com/huggingface/transformers.git

pip install git+https://github.com/facebookresearch/segment-anything.git

# For text-guided editing
pip install git+https://github.com/openai/CLIP.git

pip install git+https://github.com/facebookresearch/detectron2.git

pip install git+https://github.com/IDEA-Research/GroundingDINO.git

Download the pretrained models

# Segment-anything ViT-H SAM model. 
cd models/
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth

# BLIP2 model will be auto downloaded.

# Part Grounding Swin-Base Model.
wget https://github.com/Cheems-Seminar/segment-anything-and-name-it/releases/download/v1.0/swinbase_part_0a0000.pth

# Grounding DINO Model.
wget https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha2/groundingdino_swinb_cogcoor.pth

# Get the edit-anything-ckpt-v0-1.ckpt pretrained model from huggingface.
# No need to download this if you are using sam2image_diffuser.py! But please install safetensors for reading the ckpt.
https://huggingface.co/shgao/edit-anything-v0-1
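Before running a demo, it can help to confirm that the downloads landed where the scripts expect them. The following is a small sketch, assuming the three wget targets are saved under their original file names in the models folder. The helper name is hypothetical. The BLIP2 weights (downloaded automatically) and the edit-anything checkpoint (not needed with sam2image_diffuser.py) are deliberately left out.

```python
from pathlib import Path

# File names taken from the wget commands in this section.
EXPECTED_CHECKPOINTS = [
    "sam_vit_h_4b8939.pth",
    "swinbase_part_0a0000.pth",
    "groundingdino_swinb_cogcoor.pth",
]


def missing_checkpoints(models_dir="models"):
    """Return the expected checkpoint files that are absent from models_dir."""
    root = Path(models_dir)
    return [name for name in EXPECTED_CHECKPOINTS if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoints()
    if missing:
        print("Missing checkpoints:", ", ".join(missing))
    else:
        print("All expected checkpoints found.")
```

Run it from the repository root so that the relative models path resolves correctly.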

Run the demo

python sam2image_diffuser.py
# or 
python sam2image.py
# or 
python sam2edit.py
# or
python sam2vlpart_edit.py
# or
python sam2groundingdino_edit.py

If you have a GUI available for running the gradio demo, set "use_gradio = True" in these files.
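As a rough illustration of what such a switch typically does, the sketch below gates a web UI behind a module-level flag and otherwise runs once from the command line. Only the flag name comes from the instruction above. The `process` function is a hypothetical stand-in for the real image pipeline, and the actual scripts may wire their interface differently.

```python
use_gradio = False  # set to True only on a machine that can open the web UI


def process(prompt: str) -> str:
    """Hypothetical placeholder for the real image pipeline."""
    return f"would generate an image for: {prompt}"


if use_gradio:
    # Imported lazily so the script still runs where gradio is not installed.
    import gradio as gr

    gr.Interface(fn=process, inputs="text", outputs="text").launch()
else:
    # Headless path: run once with a fixed prompt and print the result.
    print(process("chairs by the lake, sunny day, spring"))
```

Keeping the gradio import inside the flagged branch means a headless server does not need the package at all.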

Training

Generate the training dataset with dataset_build.py.
Transfer the Stable Diffusion model to a ControlNet-initialized model with tool_add_control_sd21.py.
Train the model with sam_train_sd21.py.

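Acknowledgements

This project is based on:

Segment Anything, ControlNet, BLIP2, MDT, Stable Diffusion, Large-scale Unsupervised Semantic Segmentation, Grounded Segment Anything: From Objects to Parts, Grounded-Segment-Anything

Thanks to these amazing projects!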
