PyTorch Study Notes (Week 7)

This post covers auto-encoders in PyTorch: unsupervised learning, the principle of the Auto-Encoder and its variants such as Denoising and Adversarial AutoEncoders, and the training procedure and loss function of the Variational Auto-Encoder. It closes with a brief look at GANs (Generative Adversarial Networks) and their role in modeling data distributions and generating images.

13. Auto-Encoders

13.1 Unsupervised Learning

The three main areas of machine learning:
Reinforcement learning: an agent interacts with an environment
Supervised learning: classification and regression; requires labels, which depend on human judgment
Unsupervised learning: no labels; uses: dimensionality reduction, preprocessing, visualization

13.2 Auto-Encoder

(figure: auto-encoder architecture)
Dimensionality reduction: represent high-dimensional data with low-dimensional features
Loss function: the pixel-wise reconstruction error between the input x and the reconstruction x_hat, e.g. the MSE ||x - x_hat||^2 computed by nn.MSELoss below
Note: plain pixel-level reconstruction may simply memorize some features of the image, so the generated results are not stable
Variants:
Denoising AutoEncoders: add noise to the original image, then reconstruct the clean image
Dropout AutoEncoders: temporarily set some weights of the network to 0
Adversarial AutoEncoders: add a discriminator that measures the difference between real and generated images
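
main.py below also imports a plain AE from ae.py, which these notes do not show. A minimal sketch that mirrors the VAE's layer sizes and return signature (the actual ae.py may differ; the trailing `None` stands in for the missing KLD term so main.py can treat both models uniformly):

```python
import torch
from torch import nn


class AE(nn.Module):
    """Plain auto-encoder sketch: [b, 1, 28, 28] -> [b, 20] -> [b, 1, 28, 28]."""

    def __init__(self):
        super(AE, self).__init__()
        # [b, 784] => [b, 20]
        self.encoder = nn.Sequential(
            nn.Linear(784, 256), nn.ReLU(),
            nn.Linear(256, 64), nn.ReLU(),
            nn.Linear(64, 20), nn.ReLU(),
        )
        # [b, 20] => [b, 784]
        self.decoder = nn.Sequential(
            nn.Linear(20, 64), nn.ReLU(),
            nn.Linear(64, 256), nn.ReLU(),
            nn.Linear(256, 784), nn.Sigmoid(),
        )

    def forward(self, x):
        batchsz = x.size(0)
        x = x.view(batchsz, 784)      # flatten [b, 1, 28, 28] -> [b, 784]
        h = self.encoder(x)           # low-dimensional code [b, 20]
        x_hat = self.decoder(h)       # reconstruction [b, 784]
        # None: a plain AE has no KLD term (matches the `x_hat, kld` unpacking in main.py)
        return x_hat.view(batchsz, 1, 28, 28), None
```

For the Denoising variant above, one would corrupt the input first, e.g. `x_noisy = x + 0.3 * torch.randn_like(x)`, feed `x_noisy` to the encoder, and still compute the loss against the clean `x`.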

13.3 Variational Auto-Encoder

vae.py

import torch
from torch import nn


class VAE(nn.Module):

    def __init__(self):
        super(VAE, self).__init__()

        # [b, 784] => [b, 20]
        # u: [b, 10]
        # sigma: [b, 10]
        self.encoder = nn.Sequential(
            nn.Linear(784, 256),
            nn.ReLU(),
            nn.Linear(256, 64),
            nn.ReLU(),
            nn.Linear(64, 20),
            nn.ReLU()  # note: this trailing ReLU also constrains mu and sigma to be non-negative
        )
        # [b, 20] => [b, 784]
        self.decoder = nn.Sequential(
            nn.Linear(10, 64),
            nn.ReLU(),
            nn.Linear(64, 256),
            nn.ReLU(),
            nn.Linear(256, 784),
            nn.Sigmoid()
        )

        self.criteon = nn.MSELoss()

    def forward(self, x):
        """

        :param x: [b, 1, 28, 28]
        :return:
        """
        batchsz = x.size(0)
        # flatten
        x = x.view(batchsz, 784)
        # encoder
        # [b, 20], including mean and sigma
        h_ = self.encoder(x)
        # [b, 20] => [b, 10] and [b, 10]
        mu, sigma = h_.chunk(2, dim=1)  # split [b, 20] into mu [b, 10] and sigma [b, 10]
        # reparameterization trick, epsilon ~ N(0, 1)
        h = mu + sigma * torch.randn_like(sigma)  # sampling stays differentiable w.r.t. mu, sigma
        # decoder
        x_hat = self.decoder(h)
        # reshape
        x_hat = x_hat.view(batchsz, 1, 28, 28)

        # KL( N(mu, sigma^2) || N(0, 1) ), averaged over batch and pixels
        kld = 0.5 * torch.sum(
            torch.pow(mu, 2) +
            torch.pow(sigma, 2) -
            torch.log(1e-8 + torch.pow(sigma, 2)) - 1
        ) / (batchsz*28*28)

        return x_hat, kld
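
The closed-form KLD used in forward() can be cross-checked against torch.distributions (a quick sanity check, not part of the original notes; shapes and the 1e-8 stabilizer match the code above):

```python
import torch
from torch.distributions import Normal, kl_divergence

torch.manual_seed(0)
mu = torch.randn(32, 10)
sigma = torch.rand(32, 10) + 0.1   # positive standard deviations

# closed form from VAE.forward (before dividing by batchsz*28*28)
kld_manual = 0.5 * torch.sum(
    mu.pow(2) + sigma.pow(2) - torch.log(1e-8 + sigma.pow(2)) - 1
)

# library version: KL( N(mu, sigma) || N(0, 1) ), summed over batch and dims
kld_lib = kl_divergence(Normal(mu, sigma), Normal(0.0, 1.0)).sum()

print(torch.allclose(kld_manual, kld_lib, atol=1e-3))  # the two agree
```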

main.py

import torch
from torch.utils.data import DataLoader
from torch import nn, optim
from torchvision import transforms, datasets
from ae import AE
from vae import VAE
import visdom


def main():
    mnist_train = datasets.MNIST('mnist', True, transform=transforms.Compose([
        transforms.ToTensor()
    ]), download=True)
    mnist_train = DataLoader(mnist_train, batch_size=32, shuffle=True)

    mnist_test = datasets.MNIST('mnist', False, transform=transforms.Compose([
        transforms.ToTensor()
    ]), download=True)
    mnist_test = DataLoader(mnist_test, batch_size=32, shuffle=True)

    x, _ = next(iter(mnist_train))  # unsupervised: labels are discarded
    print('x:', x.shape)

    device = torch.device('cuda')
    # model = AE().to(device)
    model = VAE().to(device)
    criteon = nn.MSELoss()
    optimizer = optim.Adam(model.parameters(), lr=1e-3)
    print(model)

    viz = visdom.Visdom()

    for epoch in range(1000):

        for batchidx, (x, _) in enumerate(mnist_train):
            # [b, 1, 28, 28]
            x = x.to(device)
            x_hat, kld = model(x)
            loss = criteon(x_hat, x)  # reconstruction loss between x_hat and x
            if kld is not None:
                elbo = - loss - 1.0 * kld
                loss = - elbo

            # backprop
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

        print(epoch, 'loss:', loss.item(), 'kld:', kld.item())

        x, _ = next(iter(mnist_test))
        x = x.to(device)
        with torch.no_grad():
            x_hat, kld = model(x)
        viz.images(x, nrow=8, win='x', opts=dict(title='x'))
        viz.images(x_hat, nrow=8, win='x_hat', opts=dict(title='x_hat'))


if __name__ == '__main__':
    main()

Output:
x: torch.Size([32, 1, 28, 28])
VAE(
  (encoder): Sequential(
    (0): Linear(in_features=784, out_features=256, bias=True)
    (1): ReLU()
    (2): Linear(in_features=256, out_features=64, bias=True)
    (3): ReLU()
    (4): Linear(in_features=64, out_features=20, bias=True)
    (5): ReLU()
  )
  (decoder): Sequential(
    (0): Linear(in_features=10, out_features=64, bias=True)
    (1): ReLU()
    (2): Linear(in_features=64, out_features=256, bias=True)
    (3): ReLU()
    (4): Linear(in_features=256, out_features=784, bias=True)
    (5): Sigmoid()
  )
  (criteon): MSELoss()
)
Setting up a new session…
0 loss: 0.057902127504348755 kld: 0.0032653608359396458
1 loss: 0.05438760295510292 kld: 0.006918944418430328
2 loss: 0.05003824457526207 kld: 0.007306820712983608
3 loss: 0.05152011662721634 kld: 0.007637156639248133
4 loss: 0.04850537329912186 kld: 0.007779828272759914
5 loss: 0.04862204194068909 kld: 0.008065729402005672
6 loss: 0.04306092858314514 kld: 0.008468987420201302
7 loss: 0.046419210731983185 kld: 0.008360464125871658
8 loss: 0.04904603213071823 kld: 0.008941041305661201
9 loss: 0.04391225054860115 kld: 0.007968575693666935
10 loss: 0.04732757806777954 kld: 0.007267618551850319

epoch 5 X: (figure: ground-truth test images)
epoch 5 X_hat: (figure: VAE reconstructions)
epoch 10 X: (figure: ground-truth test images)
epoch 10 X_hat: (figure: VAE reconstructions)

14. GAN

After dimensionality reduction, the data distribution can exhibit regular structure, e.g. the MNIST dataset
(figure: MNIST samples in a low-dimensional space)
Note: the generator learns to produce new images; the discriminator compares generated images against the training set and decides real vs. fake
Loss function (the standard minimax objective):
min_G max_D V(D, G) = E_{x~p_data(x)}[log D(x)] + E_{z~p_z(z)}[log(1 - D(G(z)))]
Note: D is the discriminator, G is the generator
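
The objective splits into one loss per network. A minimal sketch of the two update targets (an illustration, not code from these notes; the 10/64/784 layer sizes are placeholders chosen to match the MNIST setup):

```python
import torch
from torch import nn

torch.manual_seed(0)

# Generator: noise z [b, 10] -> fake image [b, 784]; Discriminator: image -> P(real)
G = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 784), nn.Sigmoid())
D = nn.Sequential(nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 1), nn.Sigmoid())
bce = nn.BCELoss()

x_real = torch.rand(32, 784)   # stand-in for a batch of real images
z = torch.randn(32, 10)
x_fake = G(z)

# Discriminator step: maximize log D(x) + log(1 - D(G(z)))
# (detach x_fake so this loss does not update G)
d_loss = bce(D(x_real), torch.ones(32, 1)) + bce(D(x_fake.detach()), torch.zeros(32, 1))

# Generator step: the commonly used non-saturating variant maximizes log D(G(z))
g_loss = bce(D(x_fake), torch.ones(32, 1))
```

In a training loop these two losses would be backpropagated alternately, each with its own optimizer.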
