Amazon Generative AI: Code Practice Based on Diffusion Model Principles on Amazon, Sampling Part (1)

Article link: https://blog.csdn.net/2401_84263282/article/details/139928194

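To get a neural network to do this, it has to learn to remove the added noise. Starting from the "No Idea" image (which at this point is pure noise), the output progresses to something that looks like it might contain a sprite, then to something resembling the sprite Bob, and finally to the sprite Bob itself.

One point to stress: the noise in the "No Idea" image matters because it is normally distributed. In other words, every pixel of that image is sampled from a normal distribution (also called a Gaussian distribution).

So when you want the neural network to generate a new sprite, say the sprite Fred, you can sample noise from that normal distribution and then use the neural network to remove the noise step by step, yielding a brand-new sprite. Beyond all the sprites you trained on, you can obtain many more.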
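
As a minimal sketch of what "every pixel is sampled from a normal distribution" means, the snippet below builds a noise image using only the Python standard library. The 16x16 RGB size is an assumption borrowed from the hyperparameters used later in this article; in PyTorch the same idea is a single call to torch.randn.

```python
import random
import statistics

# Assumed sizes for illustration: a 16x16 RGB image, as used later in this article.
HEIGHT, WIDTH, CHANNELS = 16, 16, 3

rng = random.Random(0)  # fixed seed so the run is repeatable

# Every pixel value is drawn independently from N(0, 1).
noise_image = [
    [[rng.gauss(0.0, 1.0) for _ in range(WIDTH)] for _ in range(HEIGHT)]
    for _ in range(CHANNELS)
]

# Flatten to check the empirical statistics of the sampled pixels.
pixels = [v for channel in noise_image for row in channel for v in row]
pixel_mean = statistics.fmean(pixels)
pixel_std = statistics.pstdev(pixels)

print(f"{len(pixels)} pixels, mean={pixel_mean:.3f}, std={pixel_std:.3f}")
```

The printed mean should be close to 0 and the standard deviation close to 1, which is what a standard normal distribution predicts for this many samples.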

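Congratulations, you now have a theoretical method for generating large numbers of brand-new, beautiful sprites! Next comes the code.

In the next section, we use code to show the approach of deliberately adding noise during the iteration phase so that the sample stays normally distributed, and we compare its output against the output of a model that does not add noise. It should be an interesting and memorable look at how diffusion models work.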

3. Noise Sampling in Code

We begin with sampling. We go through the details of sampling and how it works across many iterations.

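3.1 Creating an Amazon SageMaker Notebook Instance

For reasons of space, this article does not cover how to create an Amazon SageMaker Notebook instance.

For details, see the official documentation:
Step 1: Create an Amazon SageMaker Notebook Instance - Amazon SageMaker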

3.2 About the Code

The complete sample code for this experiment is available at:
https://github.com/hanyun2019/difussion-model-code-implementation/blob/dm-project-haowen-mac/L1_Sampling.ipynb

The sample notebook was tested on an Amazon SageMaker Notebook with the conda_pytorch_p310 kernel on an ml.g5.2xlarge instance, as shown in the figure below.

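3.3 The Sampling Process

Suppose you have a noise sample and feed it into a trained neural network. The network already knows what sprite images look like, and its main job now is to predict noise. Note that it predicts the noise, not the sprite image; we then subtract the predicted noise from the noise sample to get an output that looks more like a sprite.

Because this is only a prediction of the noise, it cannot remove all of the noise in one pass, so multiple steps are needed to obtain a high-quality sample. For example, after 500 such iterations we hope to get an output that looks very much like a sprite image.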
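
To make the "many small steps" idea concrete, here is a toy sketch in plain Python. It is not DDPM and uses no trained model: toy_predict_noise is a hypothetical stand-in that is allowed to peek at the clean signal and recovers only a fraction of the remaining noise per call. It only illustrates why repeated subtraction of an imperfect noise estimate converges toward the clean signal.

```python
import random

rng = random.Random(0)

# Hypothetical "clean" signal standing in for a sprite image (8 pixel values).
clean = [0.0, 0.25, 0.5, 0.75, 1.0, 0.75, 0.5, 0.25]

# Start from pure Gaussian noise, as in the sampling process described above.
x = [rng.gauss(0.0, 1.0) for _ in clean]


def toy_predict_noise(sample, strength=0.2):
    """Stand-in for the trained network (an assumption, not a real model):
    it recovers only a fraction of the true remaining noise per call."""
    return [strength * (s - c) for s, c in zip(sample, clean)]


def max_error(sample):
    """Largest absolute deviation from the clean signal."""
    return max(abs(s - c) for s, c in zip(sample, clean))


errors = [max_error(x)]
for step in range(50):
    predicted = toy_predict_noise(x)
    # Subtract the predicted noise from the current sample.
    x = [s - p for s, p in zip(x, predicted)]
    errors.append(max_error(x))

print(f"error at start: {errors[0]:.4f}")
print(f"error after 1 step: {errors[1]:.4f}")
print(f"error after 50 steps: {errors[-1]:.6f}")
```

A single step leaves most of the noise in place, while 50 steps shrink the error by several orders of magnitude. The real sampler differs in important ways (the network never sees the clean image, and the step sizes come from the noise schedule), but the iterative structure is the same.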

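Let's first look at some pseudocode to get a high-level view of the overall logic of the algorithm:

First, we start the journey by sampling a random noise sample.

If you have seen movies about time travel, the whole process is a lot like that. Picture a glass of ink: we are in effect stepping backwards in time. It starts as fully diffused, pitch-black ink, and we trace it all the way back to the moment the first drop of ink fell into a glass of clear water.

Then we sample some extra noise. Why extra noise needs to be added is an interesting topic that we explore in detail later in this article.

This is where you actually pass the original noise, that sample, back into the neural network and get some predicted noise. The predicted noise is what the trained network wants to subtract from the original noise so that the final output looks more like a sprite image.

Finally, we use a sampling algorithm called DDPM, which stands for Denoising Diffusion Probabilistic Models.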
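
For reference, the single sampling step described above can be written as one equation. This follows the sampling algorithm in the DDPM paper with the noise scale chosen as the square root of beta_t, which is the choice the denoise_add_noise helper in section 3.7 appears to make; the symbol names are the paper's, not the article's.

```latex
x_{t-1} = \frac{1}{\sqrt{\alpha_t}}
          \left( x_t - \frac{1 - \alpha_t}{\sqrt{1 - \bar{\alpha}_t}}\,
                 \epsilon_\theta(x_t, t) \right)
          + \sqrt{\beta_t}\, z,
\qquad
z \sim \mathcal{N}(0, I) \text{ for } t > 1, \quad z = 0 \text{ for } t = 1,
\qquad
\alpha_t = 1 - \beta_t, \quad \bar{\alpha}_t = \prod_{s \le t} \alpha_s
```

Here \epsilon_\theta(x_t, t) is the noise predicted by the network, and z is the extra noise that is sampled fresh at every step.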

3.4 Importing the Required Libraries

Now we get into the code. First, we import PyTorch and some PyTorch-related utility libraries, along with helper functions that help us build the neural network.

from typing import Dict, Tuple
from tqdm import tqdm
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torchvision import models, transforms
from torchvision.utils import save_image, make_grid
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation, PillowWriter
import numpy as np
from IPython.display import HTML
from diffusion_utilities import *

3.5 Neural Network Architecture

Now we set up the neural network that we will use for sampling.

class ContextUnet(nn.Module):
    def __init__(self, in_channels, n_feat=256, n_cfeat=10, height=28):  # cfeat - context features
        super(ContextUnet, self).__init__()

        # number of input channels, number of intermediate feature maps and number of classes
        self.in_channels = in_channels
        self.n_feat = n_feat
        self.n_cfeat = n_cfeat
        self.h = height  #assume h == w. must be divisible by 4, so 28,24,20,16...

        # Initialize the initial convolutional layer
        self.init_conv = ResidualConvBlock(in_channels, n_feat, is_res=True)
        # Initialize the down-sampling path of the U-Net with two levels
        self.down1 = UnetDown(n_feat, n_feat)        # down1: [batch, n_feat, h/2, h/2]
        self.down2 = UnetDown(n_feat, 2 * n_feat)    # down2: [batch, 2*n_feat, h/4, h/4]

        # original: self.to_vec = nn.Sequential(nn.AvgPool2d(7), nn.GELU())
        self.to_vec = nn.Sequential(nn.AvgPool2d((4)), nn.GELU())

        # Embed the timestep and context labels with a one-layer fully connected neural network
        self.timeembed1 = EmbedFC(1, 2*n_feat)
        self.timeembed2 = EmbedFC(1, 1*n_feat)
        self.contextembed1 = EmbedFC(n_cfeat, 2*n_feat)
        self.contextembed2 = EmbedFC(n_cfeat, 1*n_feat)

        # Initialize the up-sampling path of the U-Net with three levels
        self.up0 = nn.Sequential(
            nn.ConvTranspose2d(2 * n_feat, 2 * n_feat, self.h//4, self.h//4), # up-sample  
            nn.GroupNorm(8, 2 * n_feat), # normalize                       
            nn.ReLU(),
        )
        self.up1 = UnetUp(4 * n_feat, n_feat)
        self.up2 = UnetUp(2 * n_feat, n_feat)

        # Initialize the final convolutional layers to map to the same number of channels as the input image
        self.out = nn.Sequential(
            nn.Conv2d(2 * n_feat, n_feat, 3, 1, 1), # reduce number of feature maps   #in_channels, out_channels, kernel_size, stride=1, padding=0
            nn.GroupNorm(8, n_feat), # normalize
            nn.ReLU(),
            nn.Conv2d(n_feat, self.in_channels, 3, 1, 1), # map to same number of channels as input
        )

    def forward(self, x, t, c=None):
        """
        x : (batch, in_channels, h, w) : input image
        t : (batch, 1)                 : time step (one scalar per sample)
        c : (batch, n_cfeat)           : context label
        """
        # x is the input image, c is the context label, t is the timestep

        # pass the input image through the initial convolutional layer
        x = self.init_conv(x)
        # pass the result through the down-sampling path
        down1 = self.down1(x)       # [batch, n_feat, h/2, h/2]
        down2 = self.down2(down1)   # [batch, 2*n_feat, h/4, h/4]
        
        # convert the feature maps to a vector and apply an activation
        hiddenvec = self.to_vec(down2)
        
        # if no context is provided, fall back to an all-zero context vector
        if c is None:
            c = torch.zeros(x.shape[0], self.n_cfeat).to(x)
            
        # embed context and timestep
        cemb1 = self.contextembed1(c).view(-1, self.n_feat * 2, 1, 1)     # (batch, 2*n_feat, 1,1)
        temb1 = self.timeembed1(t).view(-1, self.n_feat * 2, 1, 1)
        cemb2 = self.contextembed2(c).view(-1, self.n_feat, 1, 1)
        temb2 = self.timeembed2(t).view(-1, self.n_feat, 1, 1)
        #print(f"uunet forward: cemb1 {cemb1.shape}. temb1 {temb1.shape}, cemb2 {cemb2.shape}. temb2 {temb2.shape}")

        up1 = self.up0(hiddenvec)
        up2 = self.up1(cemb1*up1 + temb1, down2)  # add and multiply embeddings
        up3 = self.up2(cemb2*up2 + temb2, down1)
        out = self.out(torch.cat((up3, x), 1))
        return out

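3.6 Setting the Model Training Hyperparameters

Next, we set some hyperparameters the model needs, including the number of timesteps, the image size, and so on.

The DDPM paper defines the notion of a noise schedule, which determines the level of noise applied to the image at a given timestep. This part therefore just constructs the DDPM parameters behind the scaling factors you may recall: the scaling values S1, S2, and S3 are computed from the noise schedule. It is called a "schedule" because it depends on the timestep.

DDPM paper:

https://arxiv.org/pdf/2006.11239.pdf

Hyperparameters:

beta1: DDPM hyperparameter, the noise level at the start of the schedule
beta2: DDPM hyperparameter, the noise level at the end of the schedule
height: the image height and width (the image is square)
noise schedule: determines the level of noise applied to the image at a given timestep
S1, S2, S3: the values of the scaling factors

As the code below shows, we set timesteps to 500; the image size parameter height to 16, meaning a 16 by 16 square image; the DDPM hyperparameters beta1 and beta2; and so on.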

# hyperparameters

# diffusion hyperparameters
timesteps = 500
beta1 = 1e-4
beta2 = 0.02

# network hyperparameters
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
n_feat = 64 # 64 hidden dimension feature
n_cfeat = 5 # context vector is of size 5
height = 16 # 16x16 image
save_dir = './weights/'

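Keep in mind that you step through 500 timesteps, because you are going through the 500 iterations of slow noise removal described here.

The following code block constructs the noise schedule defined in the DDPM paper: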

# construct DDPM noise schedule
b_t = (beta2 - beta1) * torch.linspace(0, 1, timesteps + 1, device=device) + beta1
a_t = 1 - b_t
ab_t = torch.cumsum(a_t.log(), dim=0).exp()    
ab_t[0] = 1
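
To see what the schedule looks like numerically, the sketch below recomputes it in plain Python (no PyTorch) with the same hyperparameter values. The article does not spell out S1, S2, and S3; the mapping used here is an assumption that matches the three coefficients in the DDPM sampling step (the scale on the sample, the scale on the predicted noise, and the scale on the extra noise). The function name scaling_factors is hypothetical.

```python
import math

# Same values as the hyperparameters in this article.
timesteps = 500
beta1 = 1e-4
beta2 = 0.02

# Linear schedule: beta_t grows from beta1 (t = 0) to beta2 (t = timesteps).
beta = [beta1 + (beta2 - beta1) * t / timesteps for t in range(timesteps + 1)]
alpha = [1.0 - b for b in beta]

# alpha_bar_t is the running product of alpha; summing logs mirrors the tensor code.
alpha_bar = []
log_sum = 0.0
for a in alpha:
    log_sum += math.log(a)
    alpha_bar.append(math.exp(log_sum))
alpha_bar[0] = 1.0  # same override as ab_t[0] = 1


def scaling_factors(t):
    """Return (s1, s2, s3) for timestep t >= 1, under the mapping assumed above."""
    s1 = 1.0 / math.sqrt(alpha[t])
    s2 = (1.0 - alpha[t]) / math.sqrt(1.0 - alpha_bar[t])
    s3 = math.sqrt(beta[t])
    return s1, s2, s3


for t in (1, 250, 500):
    s1, s2, s3 = scaling_factors(t)
    print(f"t={t:3d}  alpha_bar={alpha_bar[t]:.4f}  s1={s1:.4f}  s2={s2:.4f}  s3={s3:.4f}")
```

The cumulative product alpha_bar falls from 1 toward a value close to 0 at t = 500, meaning that by the last timestep almost none of the original image signal remains. That is why sampling can start from pure Gaussian noise.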

Next, instantiate the model:

# construct model
nn_model = ContextUnet(in_channels=3, n_feat=n_feat, n_cfeat=n_cfeat, height=height).to(device)

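3.7 Testing the Output with Extra Noise Added

The first test is the output with extra noise added. Pay particular attention to the variable z.

After each iteration, we add extra sampling noise by setting "z = torch.randn_like(x)" so that the input to the next step stays normally distributed: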

# helper function; removes the predicted noise (but adds some noise back in to avoid collapse)
def denoise_add_noise(x, t, pred_noise, z=None):
    if z is None:
        z = torch.randn_like(x)
    noise = b_t.sqrt()[t] * z
    mean = (x - pred_noise * ((1 - a_t[t]) / (1 - ab_t[t]).sqrt())) / a_t[t].sqrt()
    return mean + noise

Next, load the model:

# load in model weights and set to eval mode
nn_model.load_state_dict(torch.load(f"{save_dir}/model_trained.pth", map_location=device))
nn_model.eval()
print("Loaded in Model")

The following code implements the DDPM sampling algorithm introduced earlier:

# sample using standard algorithm
@torch.no_grad()
def sample_ddpm(n_sample, save_rate=20):
    # x_T ~ N(0, 1), sample initial noise
    samples = torch.randn(n_sample, 3, height, height).to(device)  

    # array to keep track of generated steps for plotting
    intermediate = [] 
    for i in range(timesteps, 0, -1):
        print(f'sampling timestep {i:3d}', end='\r')

        # reshape time tensor
        t = torch.tensor([i / timesteps])[:, None, None, None].to(device)

        # sample some random noise to inject back in. For i = 1, don't add back in noise
        z = torch.randn_like(samples) if i > 1 else 0

        eps = nn_model(samples, t)    # predict noise e_(x_t,t)
        samples = denoise_add_noise(samples, i, eps, z)
        if i % save_rate ==0 or i==timesteps or i<8:
            intermediate.append(samples.detach().cpu().numpy())

    intermediate = np.stack(intermediate)
    return samples, intermediate
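
The save condition in sample_ddpm is easy to misread, so here is a small standalone check of which timesteps end up in intermediate. It reuses the loop bounds and the condition from the function above with the default save_rate of 20 and needs no model or PyTorch.

```python
timesteps = 500
save_rate = 20

# Same loop bounds and condition as in sample_ddpm above.
saved_steps = [
    i for i in range(timesteps, 0, -1)
    if i % save_rate == 0 or i == timesteps or i < 8
]

print(len(saved_steps))                   # number of snapshots kept
print(saved_steps[:3], saved_steps[-8:])  # first and last few saved timesteps
```

With these settings 32 snapshots are kept: every 20th timestep from 500 down to 20, plus each of the final seven steps, where the image changes most visibly. The stacked intermediate array therefore has shape (32, n_sample, 3, height, height).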

Run the model to get the predicted noise:

eps = nn_model(samples, t)    # predict noise e_(x_t,t)

Finally, denoise:

samples = denoise_add_noise(samples, i, eps, z)
