How-Diffusion-Models-Work

Notes from How Diffusion Models Work by DeepLearning.ai

Contents

Intuition

Sampling

  • With Extra Noise

Training

Context Embedding

Faster Sampling


Notes

Taught By Sharon Zhou

Noted by Atul

  • Example used throughout the course: generating 16×16 sprites for video games.

Intuition

  • Goal: given a large set of sprite images, generate new sprite images

  • What does the network learn?

    • Fine details
    • General outline
    • Everything in between
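  • Noising process (Bob-as-ink-drop analogy): noise is added to the sprite step by step, like an ink drop diffusing in water, until only noise remains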

  • Denoising Process (what should the NN think?)

    • If it's Bob the sprite, keep it as it is
    • If it's likely to be Bob, suggest more details to fill in
    • If it's just an outline of a sprite, suggest general details for likely sprites (Bob, Fred, ...)
    • If it's nothing, suggest the outline of a sprite
  • Feed the NN pure noise, with each pixel sampled from a normal distribution, and it produces a completely new sprite!

Sampling

  • Assume you have a trained NN
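  • At each denoising step, the NN predicts the noise in the current image, and that noise is subtracted to get a slightly better image
  • NOTE: At each denoising step except the last, some fresh random noise is added back. Without it, the samples collapse into average-looking blobs ("mode collapse")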
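
The sampling loop above can be sketched in plain Python. This is a minimal sketch, not the course's notebook code: the step count T, the linear schedule endpoints, and the predict_noise placeholder (standing in for the trained NN) are all assumptions chosen for illustration, and images are flattened into lists of pixel values.

```python
import math
import random

T = 500                      # number of diffusion steps (assumed value)
BETA_1, BETA_T = 1e-4, 0.02  # endpoints of a linear noise schedule (assumed values)

# Per-step noise level beta_t, alpha_t = 1 - beta_t, and the running product alpha_bar_t.
betas = [BETA_1 + (BETA_T - BETA_1) * t / T for t in range(T + 1)]
alphas = [1.0 - b for b in betas]
alpha_bars = []
running = 1.0
for a in alphas:
    running *= a
    alpha_bars.append(running)


def predict_noise(x, t):
    """Hypothetical placeholder for the trained NN: returns a noise estimate for x at step t."""
    return [0.1 * v for v in x]


def ddpm_sample(num_pixels, rng):
    """Start from pure Gaussian noise and denoise it step by step."""
    x = [rng.gauss(0.0, 1.0) for _ in range(num_pixels)]
    for t in range(T, 0, -1):
        eps = predict_noise(x, t)
        # Remove the predicted noise, scaled according to the schedule.
        coef = (1.0 - alphas[t]) / math.sqrt(1.0 - alpha_bars[t])
        mean = [(xi - coef * ei) / math.sqrt(alphas[t]) for xi, ei in zip(x, eps)]
        # Add fresh noise back at every step except the last one.
        sigma = math.sqrt(betas[t]) if t > 1 else 0.0
        x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
    return x


sprite = ddpm_sample(16 * 16, random.Random(0))
print(len(sprite))  # 256
```

With a real trained network in place of predict_noise, the returned list would be reshaped into a 16×16 image.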

Neural Network

  • U-Net architecture
    • Input and output have the same size
    • First used for image segmentation

  • Takes a noisy image, compresses it into a small embedding by downsampling, then upsamples back to the original size to predict the noise

  • Can take additional information in the form of embeddings

    • Time: tells the network the timestep, and hence the noise level
    • Context: guides the generation process
  • Check out forward() in the sampling notebook
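
One common way to feed these embeddings into the network is to scale each feature channel by its context embedding and shift it by its time embedding. The sketch below illustrates that idea on plain lists. The function name condition_features and the scale-then-shift rule are assumptions made for illustration; this is not a transcription of forward() from the notebook.

```python
def condition_features(features, context_emb, time_emb):
    """Scale each channel by its context value and shift it by its time value.

    features:    list of channels, each a flat list of activations
    context_emb: one multiplier per channel
    time_emb:    one offset per channel
    """
    if not (len(features) == len(context_emb) == len(time_emb)):
        raise ValueError("need one context value and one time value per channel")
    return [
        [c * v + t for v in channel]
        for channel, c, t in zip(features, context_emb, time_emb)
    ]


features = [[1.0, 2.0], [3.0, 4.0]]
out = condition_features(features, context_emb=[1.0, 0.0], time_emb=[0.5, -1.0])
print(out)  # [[1.5, 2.5], [-1.0, -1.0]]
```

A context value of 0 silences a channel's features entirely, leaving only the time offset, which hints at how masking the context lets the same network also run without guidance.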

Training

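The NN learns to predict noise; in effect, it learns the distribution of what is "not noise"

  • Randomly sample a training image, a timestep t, and noise
    • The timestep controls the level of noise
    • Sampling the timestep at random for each image, rather than stepping through the timesteps in order, makes training more stable
  • Add the noise to the image at the level given by t
  • Input the noisy image into the NN, which predicts the noise
  • Compute the loss between the actual and predicted noise
  • Backprop and learn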
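
The steps above can be sketched as a single loss computation. This is a minimal sketch under stated assumptions: T and the linear schedule endpoints are illustrative values, training_loss and add_noise are hypothetical helper names, images are flat lists of pixels, and the predictor is passed in as a plain function. The backprop step is left out because it needs a deep learning framework; in practice the returned loss would be backpropagated through the NN.

```python
import math
import random

T = 500                      # number of diffusion steps (assumed value)
BETA_1, BETA_T = 1e-4, 0.02  # endpoints of a linear noise schedule (assumed values)


def alpha_bar(t):
    """Cumulative product of (1 - beta_s) for s = 0..t under the linear schedule."""
    prod = 1.0
    for s in range(t + 1):
        prod *= 1.0 - (BETA_1 + (BETA_T - BETA_1) * s / T)
    return prod


def add_noise(x0, t, noise):
    """Mix the clean image with noise; a larger t keeps less image and more noise."""
    ab = alpha_bar(t)
    return [math.sqrt(ab) * x + math.sqrt(1.0 - ab) * n for x, n in zip(x0, noise)]


def mse(a, b):
    """Mean squared error between two equally long lists."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


def training_loss(dataset, predict_noise, rng):
    x0 = rng.choice(dataset)                   # random training image
    t = rng.randint(1, T)                      # random timestep
    noise = [rng.gauss(0.0, 1.0) for _ in x0]  # random noise
    x_t = add_noise(x0, t, noise)              # noisy image fed to the NN
    predicted = predict_noise(x_t, t)          # NN's guess at the noise
    return mse(predicted, noise)               # loss: actual vs predicted noise


def zero_predictor(x_t, t):
    """Untrained stand-in that always predicts 'no noise'."""
    return [0.0] * len(x_t)


dataset = [[0.0] * 256, [1.0] * 256]  # two dummy 16x16 "sprites"
print(training_loss(dataset, zero_predictor, random.Random(0)) > 0.0)  # True
```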

Control

  • Embeddings are vectors of numbers that represent something else, for instance a string of text
  • Given as input to the NN along with the training image
  • Each embedding is associated with a training example and describes its properties
  • Uses: generate funky mixtures by combining embeddings
  • Context formats
    • Text
    • Categories, one-hot encoded (e.g. hero, non-hero, spells ...)
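
As a small illustration of one-hot context vectors and of combining them into a mixture, consider the sketch below. The category list uses only the three names mentioned above and is not the course's full list; one_hot and mix are hypothetical helper names.

```python
CATEGORIES = ["hero", "non-hero", "spell"]  # illustrative subset, not the full list


def one_hot(category):
    """Return a vector with 1.0 at the category's position and 0.0 elsewhere."""
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category}")
    return [1.0 if c == category else 0.0 for c in CATEGORIES]


def mix(ctx_a, ctx_b, weight=0.5):
    """Blend two context vectors; weight is the share given to ctx_a."""
    return [weight * a + (1.0 - weight) * b for a, b in zip(ctx_a, ctx_b)]


hero = one_hot("hero")    # [1.0, 0.0, 0.0]
spell = one_hot("spell")  # [0.0, 0.0, 1.0]
print(mix(hero, spell))   # [0.5, 0.0, 0.5]
```

Passing a blended vector like this as the context asks the network for something between the two categories.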

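Fast Sampling: DDIM

  • DDPM is slow!
    • It needs many timesteps, and each step depends on the previous one (Markov chain), so none can be skipped
  • DDIM skips timesteps and uses a deterministic update: no random noise is added back during sampling
  • Quality can be lower than DDPM run for its full number of steps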
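
A deterministic DDIM-style update can be sketched as follows: estimate the clean image from the predicted noise, then re-noise that estimate directly to the level of an earlier timestep, jumping over the steps in between. This is a minimal sketch assuming a linear schedule with illustrative values for T and its endpoints; predict_noise is a hypothetical placeholder for the trained NN, and images are flat lists of pixels.

```python
import math
import random

T = 500                      # number of diffusion steps used in training (assumed value)
BETA_1, BETA_T = 1e-4, 0.02  # endpoints of a linear noise schedule (assumed values)

# alpha_bars[t] is the cumulative product of (1 - beta_s) for s = 1..t;
# alpha_bars[0] = 1.0 corresponds to a clean image.
alpha_bars = [1.0]
for t in range(1, T + 1):
    beta_t = BETA_1 + (BETA_T - BETA_1) * t / T
    alpha_bars.append(alpha_bars[-1] * (1.0 - beta_t))


def predict_noise(x, t):
    """Hypothetical placeholder for the trained NN."""
    return [0.1 * v for v in x]


def ddim_step(x, t, t_prev, eps):
    """Jump from timestep t to t_prev without adding any random noise."""
    ab, ab_prev = alpha_bars[t], alpha_bars[t_prev]
    # Estimate of the clean image implied by the predicted noise.
    x0_pred = [(xi - math.sqrt(1.0 - ab) * ei) / math.sqrt(ab) for xi, ei in zip(x, eps)]
    # Re-noise that estimate to the (lower) noise level of t_prev.
    return [
        math.sqrt(ab_prev) * x0 + math.sqrt(1.0 - ab_prev) * ei
        for x0, ei in zip(x0_pred, eps)
    ]


def ddim_sample(num_pixels, num_steps, rng):
    """Sample using roughly num_steps network calls instead of T."""
    x = [rng.gauss(0.0, 1.0) for _ in range(num_pixels)]
    stride = max(T // num_steps, 1)
    for t in range(T, 0, -stride):
        t_prev = max(t - stride, 0)
        x = ddim_step(x, t, t_prev, predict_noise(x, t))
    return x


sprite = ddim_sample(16 * 16, num_steps=20, rng=random.Random(0))
print(len(sprite))  # 256
```

The only randomness is the initial noise, so the same starting noise always yields the same output, and 20 network calls replace the 500 a full DDPM loop would make under these assumptions.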

Summary

Other applications: music, inpainting, textual inversion
