[Stable Diffusion Source Code Walkthrough, Part 1: Using Gradio]

Latest recommended article published 2024-08-09 20:38:01

Casia_Dominic

Tags: stable diffusion

Original link: https://zhuanlan.zhihu.com/p/617742414

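This article describes how Stable-Diffusion-WebUI builds its web interface on the open-source project Gradio. It demonstrates basic Gradio usage through examples, including text input, image output, and component layout, then outlines the webui startup flow and how its components are implemented, with a brief look at the code structure.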

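AUTOMATIC1111's webui is a recently popular Stable Diffusion application. It bundles the commonly used Stable Diffusion features and supports techniques such as ControlNet and LoRA through extensions. The figure below shows the stable-diffusion-webui interface, which gives a sense of how much it can do.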

by 罗培羽

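The GitHub address of stable-diffusion-webui (webui) is given below. The project is updated very frequently; this source analysis uses the version cloned on 2023-02-27. It has been updated since, but this version is already fairly stable.

https://github.com/AUTOMATIC1111/stable-diffusion-webui

The source analysis will cover the following 8 aspects. Because of limited time and energy, it will not go into great detail; it only traces the overall structure of the code so that the right place can be found when a change is needed. It also serves as a set of study notes.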

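1. Basic Usage of Gradio

What Is Gradio

stable-diffusion-webui is built on the open-source project Gradio. Gradio is a Python web UI framework well suited to demonstrating deep learning tasks, and an interface can be built with just a few lines of code. The example below comes from its official website: 4 lines of code are enough to build an interface with several boxes.

The Gradio website is:

http://gradio.app

Open the webui source code, for example ui.py, and you will see statements such as import gradio as gr. Clearly, webui builds its interface with Gradio.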

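How to Use Gradio

The following examples illustrate how Gradio is used. Analyzing the interface-related source code of webui is only practical once Gradio's usage is understood.

First example

The following is a simple Gradio example. gr.Interface defines the interface components: here a text box (text) is the input and an image box (image) is the output. The function executed in between is generator, which produces a 100*100 image.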

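import gradio as gr
import numpy as np

def generator(text):
    # The text input is ignored; return a 100x100 RGB image filled with ones.
    image = np.ones((100, 100, 3), np.uint8)
    return image

# "text" and "image" are shortcut names for the Textbox and Image components.
interface = gr.Interface(fn=generator, inputs="text", outputs="image")
interface.launch(server_port=1234, server_name="0.0.0.0")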

After running it, open it in a browser and the following interface appears.

Click "Submit" to display the generated image.
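To see what the button click actually returns, the function can be called directly without starting Gradio. The snippet below is a minimal check that repeats the body of generator from the first example; the argument string is an arbitrary placeholder.

```python
import numpy as np

def generator(text):
    # Same body as the first example: the text argument is ignored.
    image = np.ones((100, 100, 3), np.uint8)
    return image

image = generator("any prompt")
# Inspect the array that Gradio would receive as the output image.
print(image.shape, image.dtype, image.min(), image.max())
```

Every pixel has the value 1 on a 0-255 scale, so the displayed image is almost black. That is expected for this placeholder function.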

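Second example

The second example comes from https://gradio.app/quickstart/ and shows a slightly more complex interface. As the code shows, compared with the previous example it has two inputs: a text box (text) and an image box (image).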

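import gradio as gr
import numpy as np

def generator(text, image):
    # Both inputs are ignored; the output is a fresh 100x100 RGB image.
    image = np.ones((100, 100, 3), np.uint8)
    image = image + 1  # every pixel value becomes 2
    return image

# A list of shortcut names creates one input component per entry.
interface = gr.Interface(fn=generator, inputs=["text", "image"], outputs="image")
interface.launch(server_port=1234, server_name="0.0.0.0")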

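The resulting interface is shown in the figure below.

Third example

The next example builds an interface with gr.Blocks and individual components (gr.Textbox, gr.Button). The components can be arranged freely, which makes it possible to construct whatever interface is needed.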

import gradio as gr

def greet(name):
    return "Hello " + name + "!"

with gr.Blocks() as demo:
    name = gr.Textbox(label="Name")
    output = gr.Textbox(label="Output Box")
    greet_btn = gr.Button("Greet")
    # Clicking the button calls greet(name) and writes the result to output.
    greet_btn.click(fn=greet, inputs=name, outputs=output)

demo.launch(server_port=1234, server_name="0.0.0.0")

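The result is shown in the figure below.

Fourth example

The last example comes from the Gradio website. It uses gr.Tab() to build tabs, gr.Row() for layout, and gr.Markdown() to display hint text.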

import numpy as np
import gradio as gr

def flip_text(x):
    return x[::-1]

def flip_image(x):
    return np.fliplr(x)

with gr.Blocks() as demo:
    gr.Markdown("Flip text or image files using this demo.")
    with gr.Tab("Flip Text"):
        text_input = gr.Textbox()
        text_output = gr.Textbox()
        text_button = gr.Button("Flip")
    with gr.Tab("Flip Image"):
        # Components inside gr.Row() are laid out side by side.
        with gr.Row():
            image_input = gr.Image()
            image_output = gr.Image()
        image_button = gr.Button("Flip")

    with gr.Accordion("Open for More!"):
        gr.Markdown("Look at me...")

    text_button.click(flip_text, inputs=text_input, outputs=text_output)
    image_button.click(flip_image, inputs=image_input, outputs=image_output)

demo.launch(server_port=1234, server_name="0.0.0.0")

The result is shown in the figure below.

With these examples, the basic usage and capabilities of Gradio should be clear, so we can now look at how webui calls Gradio.

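How the webui Interface Is Implemented

We trace the webui startup process step by step. As the figure below shows, after the program starts it enters webui.py, where command-line options determine whether to enter API mode (api_only) or UI mode; in UI mode, the webui() method is called.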
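The branching described above can be sketched as follows. This is a simplified illustration, not the actual webui.py: the flag name --nowebui, the main helper, and the stub bodies of api_only and webui are assumptions made for the example.

```python
import argparse

def api_only():
    # Stand-in for API mode: no Gradio interface is built.
    return "api"

def webui():
    # Stand-in for UI mode: the real method builds and launches the Gradio UI.
    return "webui"

def main(argv=None):
    parser = argparse.ArgumentParser()
    # Assumed flag name, used here only for illustration.
    parser.add_argument("--nowebui", action="store_true",
                        help="serve only the API, without the Gradio UI")
    cmd_opts = parser.parse_args(argv)
    # Choose the entry point based on the command-line option.
    return api_only() if cmd_opts.nowebui else webui()
```

Calling main([]) takes the UI branch, and main(["--nowebui"]) takes the API branch.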

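Inside the webui method are the two boxed statements shown below. shared.demo.launch is a Gradio method that starts the interface, just like the last line of each earlier example, while modules.ui.create_ui() is the method that builds the UI.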

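modules.ui.create_ui() is defined in modules/ui.py, as shown below, and it shows that webui builds its UI with gr.Blocks and related components. As the figure below shows, each tab corresponds to one Blocks object: the content under the txt2img tab is defined in the first boxed code block (txt2img_interface), and the content under the img2img tab is defined in the second boxed code block (img2img_interface).

The tabs corresponding to the red-boxed code blocks above are shown in the figure below.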

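Let us look at a concrete example. Opening the txt2img interface code block reveals the definitions of its components; the boxed code in the figure is the code for the Width and Height components. It is also apparent here that the code is fairly messy, with many levels of nesting.

The positions of the Width and Height components in the interface are shown in the figure below.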

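In addition, because txt2img and img2img both need prompt and negative_prompt inputs and both have a Generate button, webui consolidates this part into a single method (create_toprow) for code reuse. The boxed content in the figure is the components for prompt, negative_prompt, and the Generate button.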