Note to self: extending images (outpainting) in Stable Diffusion via ControlNet

I found that my earlier Outpainting script still didn't give very good results, so I went looking for a better way to extend images and came across a ControlNet-based approach that works noticeably better. This is a quick note on how to use ControlNet in Stable Diffusion to extend a picture.

Step 1. Download control_v11p_sd15_inpaint_fp16.safetensors and put it in the \models\ControlNet folder under the SD installation directory.
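If you prefer to script the download, here is a minimal sketch using huggingface_hub. The repo id is an assumption (the fp16 ControlNet 1.1 weights are commonly mirrored under comfyanonymous/ControlNet-v1-1_fp16_safetensors); adjust local_dir to wherever your WebUI lives.

```python
from huggingface_hub import hf_hub_download

# Assumed mirror repo for the fp16 ControlNet 1.1 weights -- verify before use.
hf_hub_download(
    repo_id="comfyanonymous/ControlNet-v1-1_fp16_safetensors",
    filename="control_v11p_sd15_inpaint_fp16.safetensors",
    local_dir=r"stable-diffusion-webui\models\ControlNet",  # your SD install path
)
```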
Step 2. In the SD WebUI, open the ControlNet panel and check Enable. Select Inpaint, then confirm that the preprocessor is inpaint_only and the model is the file you just downloaded; change them if not.

Step 3. The three sliders below can be left at their defaults. Set Control Mode to "ControlNet is more important" and Resize Mode to "Resize and Fill".
Step 4. In the generation size fields above, enter the dimensions for the aspect ratio you want (the rightmost button at the bottom right of the image sends the loaded image's dimensions straight to the size fields). If at all possible, also describe the background in the prompt. Then click Generate.
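If you would rather run this workflow outside the WebUI, the sketch below shows the same idea (inpaint-style outpainting with the ControlNet 1.1 inpaint model) using the diffusers library. File names, the target size, and the prompt are placeholders; the set-masked-pixels-to--1 convention for the control image follows the diffusers ControlNet inpaint example.

```python
import numpy as np
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetInpaintPipeline

def pad_for_outpaint(img, target_w, target_h):
    """Center the original on a larger canvas; the new border is what gets painted."""
    canvas = Image.new("RGB", (target_w, target_h), (127, 127, 127))
    x, y = (target_w - img.width) // 2, (target_h - img.height) // 2
    canvas.paste(img, (x, y))
    mask = Image.new("L", (target_w, target_h), 255)  # white = regenerate
    mask.paste(Image.new("L", img.size, 0), (x, y))   # black = keep original
    return canvas, mask

def make_inpaint_condition(image, mask):
    """Build the ControlNet inpaint control tensor: masked pixels set to -1."""
    arr = np.array(image.convert("RGB")).astype(np.float32) / 255.0
    m = np.array(mask.convert("L")).astype(np.float32) / 255.0
    arr[m > 0.5] = -1.0
    return torch.from_numpy(arr).permute(2, 0, 1).unsqueeze(0)

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11p_sd15_inpaint", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

src = Image.open("input.png")                    # placeholder input
canvas, mask = pad_for_outpaint(src, 1024, 512)  # target size (multiple of 8)
control = make_inpaint_condition(canvas, mask)
result = pipe(
    prompt="wide scenic background",             # describe the background if you can
    image=canvas,
    mask_image=mask,
    control_image=control,
    num_inference_steps=30,
).images[0]
result.save("outpainted.png")
```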


You can feed a generated image back into ControlNet's image slot and repeat step 4 to keep extending the picture!
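In script form, that repeat-and-extend loop might look like the following, reusing pad_for_outpaint, make_inpaint_condition, and pipe from the sketch above (the growth per pass and the iteration count are arbitrary placeholders):

```python
# Iteratively widen the picture: each pass adds 256 px of new canvas per side.
image = Image.open("input.png")
for _ in range(3):
    canvas, mask = pad_for_outpaint(image, image.width + 512, image.height)
    control = make_inpaint_condition(canvas, mask)
    image = pipe(prompt="wide scenic background", image=canvas,
                 mask_image=mask, control_image=control).images[0]
image.save("extended.png")
```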

Different sampling methods produce noticeably different results, so it is worth trying a few samplers to find one that works best.
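In the diffusers sketch above, the WebUI's sampler choice corresponds to the pipeline's scheduler, and swapping it is a one-liner; the particular schedulers shown here are just examples.

```python
from diffusers import UniPCMultistepScheduler, EulerAncestralDiscreteScheduler

# Equivalent of choosing a different sampler in the WebUI:
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
# or, e.g., "Euler a":
# pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)
```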

### Stable Diffusion ControlNet Model Usage and Implementation

#### Overview of ControlNet Integration with Stable Diffusion

ControlNet is a plugin designed to enhance the capabilities of generative models like Stable Diffusion by providing additional guidance during image generation. This allows for more controlled outcomes, such as preserving specific structures or styles from input images while generating new content[^2].

#### Installation Requirements

To use ControlNet alongside Stable Diffusion, ensure that all necessary dependencies are installed. The environment setup typically involves installing Python packages for deep learning frameworks (e.g., PyTorch), along with libraries required for handling image data. For instance, one can set up an environment using pip commands similar to those found in Hugging Face's diffusers repository:

```bash
pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu117
pip install transformers accelerate safetensors datasets
```

Additionally, clone the relevant repositories containing both `stable-diffusion` and `controlnet` implementations:

```bash
git clone https://github.com/huggingface/diffusers.git
cd diffusers/examples/community/
git clone https://github.com/Mikubill/sd-webui-controlnet.git
```

#### Basic Workflow Using ControlNet

The workflow generally involves preparing inputs suitable for conditioning the diffusion process. For example, when working on edge-detection tasks, preprocess your source material into the format ControlNet expects, often grayscale images representing edges extracted via Canny filters or other methods. Here is how you might implement this step programmatically:

```python
from PIL import Image
import cv2

def prepare_canny_edges(image_path):
    img = cv2.imread(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 100, 200)
    # Convert back to the RGB format expected by some pipelines
    edged_img = cv2.cvtColor(edges, cv2.COLOR_GRAY2RGB)
    return Image.fromarray(edged_img.astype('uint8'), 'RGB')
```

Afterwards, feed these processed inputs into the pipeline configuration, whether via custom scripts derived from community contributions or the official examples available on platforms like GitHub.

#### Advanced Customization Options

Beyond basic integration, users may explore advanced customization options offered by developers who have extended the original design. These enhancements can involve modifying the architecture or incorporating novel techniques aimed at improving performance across various benchmarks. One notable advance comes from research on depth estimation, where Depth-Anything provides a robust single-view depth-prediction framework capable of producing high-quality results under diverse conditions without extensive per-dataset retraining[^3]. Such advances indirectly benefit conditional generation, since higher-quality auxiliary information leads to better final outputs.