【Keras-DCGAN】MNIST / CIFAR-10

This post contains my learning notes for One Day One GAN [DAY 2]. Here the GAN is built with CNNs.

DCGAN ("Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks")
replaces the MLP in the original GAN with a CNN and applies it to face generation.



1 Introduction to GAN

For an introduction to GAN, see 【Keras-MLP-GAN】MNIST. For more small GAN experiments, see Section 3.5 of 【Programming】.

2 DCGAN for MNIST


2.1 Import the required libraries

from keras.datasets import mnist
from keras.layers import Input, Dense, Reshape, Flatten, Dropout
from keras.layers import BatchNormalization, Activation, ZeroPadding2D
from keras.layers.advanced_activations import LeakyReLU
from keras.layers.convolutional import UpSampling2D, Conv2D
from keras.models import Sequential, Model
from keras.optimizers import Adam

import matplotlib
matplotlib.use('Agg')  # select the non-interactive backend before importing pyplot
import matplotlib.pyplot as plt
import sys

import numpy as np

2.2 Build the generator

Input: a 100-dimensional noise vector. Output: a (28,28,1) image.

100 → 128*7*7, reshaped to (7,7,128) → upsample (14,14,128) → conv (14,14,128) → upsample (28,28,128) → conv (28,28,64) → conv (28,28,1)

# build_generator(self)
model = Sequential()

model.add(Dense(128 * 7 * 7, activation="relu", input_dim=100)) # 7,7,128
model.add(Reshape((7, 7, 128)))

model.add(UpSampling2D()) # 14,14,128

model.add(Conv2D(128, kernel_size=3, padding="same"))
model.add(BatchNormalization(momentum=0.8))
model.add(Activation("relu"))

model.add(UpSampling2D()) # 28,28,128

model.add(Conv2D(64, kernel_size=3, padding="same")) 
model.add(BatchNormalization(momentum=0.8))
model.add(Activation("relu"))

model.add(Conv2D(1, kernel_size=3, padding="same"))
model.add(Activation("tanh"))

model.summary()

noise = Input(shape=(100,))
img = model(noise)        
        
generator = Model(noise, img)

output

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_1 (Dense)              (None, 6272)              633472    
_________________________________________________________________
reshape_1 (Reshape)          (None, 7, 7, 128)         0         
_________________________________________________________________
up_sampling2d_1 (UpSampling2 (None, 14, 14, 128)       0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 14, 14, 128)       147584    
_________________________________________________________________
batch_normalization_1 (Batch (None, 14, 14, 128)       512       
_________________________________________________________________
activation_1 (Activation)    (None, 14, 14, 128)       0         
_________________________________________________________________
up_sampling2d_2 (UpSampling2 (None, 28, 28, 128)       0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 28, 28, 64)        73792     
_________________________________________________________________
batch_normalization_2 (Batch (None, 28, 28, 64)        256       
_________________________________________________________________
activation_2 (Activation)    (None, 28, 28, 64)        0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 28, 28, 1)         577       
_________________________________________________________________
activation_3 (Activation)    (None, 28, 28, 1)         0         
=================================================================
Total params: 856,193
Trainable params: 855,809
Non-trainable params: 384
_________________________________________________________________
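
The parameter counts in the summary above can be checked by hand. The following is a minimal sketch in plain Python; the helper names (`dense_params`, `conv2d_params`, `batchnorm_params`) are illustrative and are not part of Keras. It assumes the usual accounting: one bias per output unit or channel, and four values per channel for batch normalization, of which the moving mean and variance are not trainable.

```python
def dense_params(n_in, n_out):
    # weight matrix plus one bias per output unit
    return n_in * n_out + n_out

def conv2d_params(k, c_in, c_out):
    # one k x k kernel per (input channel, output channel) pair, plus biases
    return k * k * c_in * c_out + c_out

def batchnorm_params(channels):
    # gamma and beta are trainable; moving mean and variance are not
    return {"trainable": 2 * channels, "non_trainable": 2 * channels}

layers = [
    dense_params(100, 128 * 7 * 7),  # dense_1
    conv2d_params(3, 128, 128),      # conv2d_1
    conv2d_params(3, 128, 64),       # conv2d_2
    conv2d_params(3, 64, 1),         # conv2d_3
]
bn = [batchnorm_params(128), batchnorm_params(64)]

non_trainable = sum(b["non_trainable"] for b in bn)
total = sum(layers) + sum(b["trainable"] + b["non_trainable"] for b in bn)
trainable = total - non_trainable

print(layers)
print(total, trainable, non_trainable)
```

The per-layer values and the three totals should match the `Param #` column and the footer of the summary.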

2.3 Build the discriminator

Input: a (28,28,1) image. Output: a probability.
(28,28,1) → conv (14,14,32) → conv (7,7,64) → padding (8,8,64) → conv (4,4,128) → conv (4,4,256) → flatten 4*4*256 → 1
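
The shape chain above can be traced with a few lines of plain Python. This is only a sketch: `conv_same_out` is a made-up helper, and it assumes that a convolution with `padding="same"` produces a spatial size of ceil(input / stride).

```python
import math

def conv_same_out(size, stride):
    """Spatial output size of a convolution with padding="same"."""
    return math.ceil(size / stride)

size = 28
trace = [size]
# (operation, argument) pairs mirroring the discriminator layers:
# two stride-2 convs, one row/column of zero padding, a stride-2 conv, a stride-1 conv
steps = [("conv", 2), ("conv", 2), ("pad", 1), ("conv", 2), ("conv", 1)]
for op, arg in steps:
    size = conv_same_out(size, arg) if op == "conv" else size + arg
    trace.append(size)

flat = size * size * 256  # Flatten after the 256-channel conv
print(trace, flat)
```

Under the same assumption, a stride-2 convolution applied directly to a 7x7 map would also give 4x4, so the zero padding here affects which pixels each kernel covers rather than the final size.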

# build_discriminator
model = Sequential()

model.add(Conv2D(32, kernel_size=3, strides=2, input_shape=(28,28,1), padding="same")) # 14,14,32
model.add(LeakyReLU(alpha=0.2))
model.add(Dropout(0.25))

model.add(Conv2D(64, kernel_size=3, strides=2, padding="same"))  # 7,7,64
model.add(ZeroPadding2D(padding=((0,1),(0,1)))) # 8,8,64
model.add(BatchNormalization(momentum=0.8))
model.add(LeakyReLU(alpha=0.2))
model.add(Dropout(0.25))

model.add(Conv2D(128, kernel_size=3, strides=2, padding="same")) # 4,4,128
model.add(BatchNormalization(momentum=0.8))
model.add(LeakyReLU(alpha=0.2))
model.add(Dropout(0.25))

model.add(Conv2D(256, kernel_size=3, strides=1, padding="same")) # 4,4,256
model.add(BatchNormalization(momentum=0.8))
model.add(LeakyReLU(alpha=0.2))
model.add(Dropout(0.25))

model.add(Flatten()) # 4*4*256
model.add(Dense(1, activation='sigmoid'))

model.summary()

img = Input(shape=(28,28,1)) # input: (28,28,1) image
validity = model(img) # output: real/fake probability

discriminator = Model(img, validity)

output

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_4 (Conv2D)            (None, 14, 14, 32)        320       
_________________________________________________________________
leaky_re_lu_1 (LeakyReLU)    (None, 14, 14, 32)        0         
_________________________________________________________________
dropout_1 (Dropout)          (None, 14, 14, 32)        0         
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 7, 7, 64)          18496     
_________________________________________________________________
zero_padding2d_1 (ZeroPaddin (None, 8, 8, 64)          0         
_________________________________________________________________
batch_normalization_3 (Batch (None, 8, 8, 64)          256       
_________________________________________________________________
leaky_re_lu_2 (LeakyReLU)    (None, 8, 8, 64)          0         
_________________________________________________________________
dropout_2 (Dropout)          (None, 8, 8, 64)          0         
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 4, 4, 128)         73856     
_________________________________________________________________
batch_normalization_4 (Batch (None, 4, 4, 128)         512       
_________________________________________________________________
leaky_re_lu_3 (LeakyReLU)    (None, 4, 4, 128)         0         
_________________________________________________________________
dropout_3 (Dropout)          (None, 4, 4, 128)         0         
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 4, 4, 256)         295168    
_________________________________________________________________
batch_normalization_5 (Batch (None, 4, 4, 256)         1024      
_________________________________________________________________
leaky_re_lu_4 (LeakyReLU)    (None, 4, 4, 256)         0         
_________________________________________________________________
dropout_4 (Dropout)          (None, 4, 4, 256)         0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 4096)              0         
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 4097      
=================================================================
Total params: 393,729
Trainable params: 392,833
Non-trainable params: 896
_________________________________________________________________

2.4 Compile the models and configure training

optimizer = Adam(0.0002, 0.5)

# discriminator
discriminator.compile(loss='binary_crossentropy',
                      optimizer=optimizer,
                      metrics=['accuracy'])


# The combined model  (stacked generator and discriminator)
z = Input(shape=(100,))
img = generator(z)
validity = discriminator(img)
# For the combined model we will only train the generator
discriminator.trainable = False

# Trains the generator to fool the discriminator
combined = Model(z, validity)
combined.summary()
combined.compile(loss='binary_crossentropy', 
                 optimizer=optimizer)

output

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_3 (InputLayer)         (None, 100)               0         
_________________________________________________________________
model_1 (Model)              (None, 28, 28, 1)         856193    
_________________________________________________________________
model_2 (Model)              (None, 1)                 393729    
=================================================================
Total params: 1,249,922
Trainable params: 855,809
Non-trainable params: 394,113
_________________________________________________________________
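
The footer of the combined summary follows from the two earlier summaries once the discriminator is frozen. A small arithmetic sketch, with the numbers copied from the summaries in 2.2 and 2.3 (the variable names are illustrative):

```python
# Values copied from the generator and discriminator summaries
gen_total, gen_non_trainable = 856_193, 384
disc_total = 393_729

combined_total = gen_total + disc_total
# Only the generator's trainable weights are updated through `combined`
combined_trainable = gen_total - gen_non_trainable
# Frozen discriminator weights count as non-trainable, together with
# the generator's batch-normalization moving statistics
combined_non_trainable = gen_non_trainable + disc_total

print(combined_total, combined_trainable, combined_non_trainable)
```

This is why the trainable count of the combined model equals the generator's trainable count: training `combined` only moves the generator.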

2.5 Save the generated images

import os

def sample_images(epoch):
    r, c = 5, 5
    noise = np.random.normal(0, 1, (r * c, 100))
    gen_imgs = generator.predict(noise)

    # Rescale images from [-1, 1] (tanh output) to [0, 1]
    gen_imgs = 0.5 * gen_imgs + 0.5

    fig, axs = plt.subplots(r, c)
    cnt = 0
    for i in range(r):
        for j in range(c):
            axs[i,j].imshow(gen_imgs[cnt, :,:,0], cmap='gray')
            axs[i,j].axis('off')
            cnt += 1
    os.makedirs("images", exist_ok=True)  # make sure the output directory exists
    fig.savefig("images/%d.png" % epoch)
    plt.close()
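
The rescaling in `sample_images` is the inverse of the preprocessing applied to the training images in 2.6 (`X_train / 127.5 - 1.`). A minimal sketch of the round trip on single pixel values; the function names `to_model_range` and `to_display_range` are illustrative and do not appear in the original code:

```python
def to_model_range(pixel):
    # [0, 255] -> [-1, 1], matching the generator's tanh output range
    return pixel / 127.5 - 1.0

def to_display_range(value):
    # [-1, 1] -> [0, 1], the range imshow expects for float images
    return 0.5 * value + 0.5

for p in (0, 127.5, 255):
    x = to_model_range(p)
    print(p, x, to_display_range(x))
```

Keeping real and generated images in the same [-1, 1] range matters because the discriminator sees both.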

2.6 Training

batch_size = 32
sample_interval = 50
# Load the dataset
(X_train, _), (_, _) = mnist.load_data() # (60000,28,28)
# Rescale -1 to 1
X_train = X_train / 127.5 - 1.
X_train = np.expand_dims(X_train, axis=3)  # (60000,28,28,1)
# Adversarial ground truths
valid = np.ones((batch_size, 1))
fake = np.zeros((batch_size, 1))

for epoch in range(4001):
    # ---------------------
    #  Train Discriminator
    # ---------------------

    # Select a random batch of images
    idx = np.random.randint(0, X_train.shape[0], batch_size)  # random indices in [0, 60000)
    imgs = X_train[idx]
    noise = np.random.normal(0, 1, (batch_size, 100))  # standard Gaussian noise

    # Generate a batch of new images
    gen_imgs = generator.predict(noise)

    # Train the discriminator
    d_loss_real = discriminator.train_on_batch(imgs, valid)  # real data, label 1
    d_loss_fake = discriminator.train_on_batch(gen_imgs, fake)  # generated data, label 0
    d_loss = 0.5 * np.add(d_loss_real, d_loss_fake)

    # ---------------------
    #  Train Generator
    # ---------------------
    noise = np.random.normal(0, 1, (batch_size, 100))

    # Train the generator (to have the discriminator label samples as valid)
    g_loss = combined.train_on_batch(noise, valid)

    # Plot the progress
    if epoch % 50==0:
        print ("%d [D loss: %f, acc.: %.2f%%] [G loss: %f]" % (epoch, d_loss[0], 100*d_loss[1], g_loss))

    # If at save interval => save generated image samples
    if epoch % sample_interval == 0:
        sample_images(epoch)
        

Results are printed and sample images are saved every 50 iterations (the variable `epoch` in the code should be read as an iteration count).
output

0 [D loss: 1.110515, acc.: 34.38%] [G loss: 0.947779]
50 [D loss: 0.857457, acc.: 56.25%] [G loss: 1.294673]
100 [D loss: 0.849929, acc.: 48.44%] [G loss: 1.071479]
………………
3900 [D loss: 0.715688, acc.: 56.25%] [G loss: 1.014293]
3950 [D loss: 0.772328, acc.: 45.31%] [G loss: 1.014349]
4000 [D loss: 0.596769, acc.: 71.88%] [G loss: 1.198742]
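
To read these numbers, it helps to see how binary cross-entropy treats the same discriminator output from the two sides. The sketch below is a single-sample, plain-Python version of the loss; the clipping constant `eps` and the example output `0.3` are assumptions chosen for illustration, not values from the run above.

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    # clip to avoid log(0)
    y_pred = min(max(y_pred, eps), 1 - eps)
    return -(y_true * math.log(y_pred) + (1 - y_true) * math.log(1 - y_pred))

d_on_fake = 0.3  # hypothetical discriminator output for a generated image

d_loss_fake = binary_crossentropy(0, d_on_fake)  # discriminator step: target 0 (fake)
g_loss = binary_crossentropy(1, d_on_fake)       # generator step: target 1 (valid)
undecided = binary_crossentropy(1, 0.5)          # loss when D outputs 0.5

print(d_loss_fake, g_loss, undecided)
```

The same prediction is a small loss for the discriminator and a large one for the generator, which is the adversarial part. When the discriminator outputs 0.5, both sides see log 2 (about 0.693), so losses hovering near that value suggest neither network is clearly winning.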

2.7 Results

[Figures: generated samples at iterations 0, 50, 100, 3900, 3950, and 4000]

3 DCGAN for CIFAR-10

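Just for fun, let's adapt the same code to CIFAR-10 by changing the input. MNIST images are (28,28,1), while CIFAR-10 images are (32,32,3).

3.1 Import the required libraries

This time we import cifar10.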

from keras.datasets import cifar10
from keras.layers import Input, Dense, Reshape, Flatten, Dropout
from keras.layers import BatchNormalization, Activation, ZeroPadding2D
from keras.layers.advanced_activations import LeakyReLU
from keras.layers.convolutional import UpSampling2D, Conv2D
from keras.models import Sequential, Model
from keras.optimizers import Adam

import matplotlib
matplotlib.use('Agg')  # select the non-interactive backend before importing pyplot
import matplotlib.pyplot as plt
import sys

import numpy as np

3.2 Build the generator

Only the feature map sizes need attention.

# build_generator(self)
model = Sequential()

model.add(Dense(128 * 8 * 8, activation="relu", input_dim=100)) # 8,8,128
model.add(Reshape((8, 8, 128)))

model.add(UpSampling2D()) # 16,16,128

model.add(Conv2D(128, kernel_size=3, padding="same"))
model.add(BatchNormalization(momentum=0.8))
model.add(Activation("relu"))

model.add(UpSampling2D()) # 32,32,128

model.add(Conv2D(64, kernel_size=3, padding="same")) 
model.add(BatchNormalization(momentum=0.8))
model.add(Activation("relu"))

model.add(Conv2D(3, kernel_size=3, padding="same"))
model.add(Activation("tanh"))

model.summary()

noise = Input(shape=(100,))
img = model(noise)        
        
generator = Model(noise, img)

output

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_1 (Dense)              (None, 8192)              827392    
_________________________________________________________________
reshape_1 (Reshape)          (None, 8, 8, 128)         0         
_________________________________________________________________
up_sampling2d_1 (UpSampling2 (None, 16, 16, 128)       0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 16, 16, 128)       147584    
_________________________________________________________________
batch_normalization_1 (Batch (None, 16, 16, 128)       512       
_________________________________________________________________
activation_1 (Activation)    (None, 16, 16, 128)       0         
_________________________________________________________________
up_sampling2d_2 (UpSampling2 (None, 32, 32, 128)       0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 32, 32, 64)        73792     
_________________________________________________________________
batch_normalization_2 (Batch (None, 32, 32, 64)        256       
_________________________________________________________________
activation_2 (Activation)    (None, 32, 32, 64)        0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 32, 32, 3)         1731      
_________________________________________________________________
activation_3 (Activation)    (None, 32, 32, 3)         0         
=================================================================
Total params: 1,051,267
Trainable params: 1,050,883
Non-trainable params: 384
_________________________________________________________________
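
The only structural change from the MNIST generator is the starting feature map: 8x8 instead of 7x7. A short sketch of where that number comes from, assuming each `UpSampling2D()` doubles the spatial size (its default); `start_size` is a hypothetical helper:

```python
def start_size(target, num_upsamplings, factor=2):
    """Spatial size the Dense/Reshape stage must produce to reach `target`."""
    scale = factor ** num_upsamplings
    if target % scale != 0:
        raise ValueError("target size is not divisible by the total upsampling factor")
    return target // scale

mnist_start = start_size(28, 2)   # two UpSampling2D layers
cifar_start = start_size(32, 2)
dense_units = 128 * cifar_start * cifar_start  # units of the first Dense layer

print(mnist_start, cifar_start, dense_units)
```

The resulting `dense_units` matches the `(None, 8192)` output of `dense_1` in the summary.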

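3.3 Build the discriminator

Only the input shape changes. Since 32 halves cleanly (32 → 16 → 8 → 4), the ZeroPadding2D layer used for MNIST is not needed.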

# build_discriminator
model = Sequential()

model.add(Conv2D(32, kernel_size=3, strides=2, input_shape=(32,32,3), padding="same")) # 16,16,32
model.add(LeakyReLU(alpha=0.2))
model.add(Dropout(0.25))

model.add(Conv2D(64, kernel_size=3, strides=2, padding="same"))  # 8,8,64
model.add(BatchNormalization(momentum=0.8))
model.add(LeakyReLU(alpha=0.2))
model.add(Dropout(0.25))

model.add(Conv2D(128, kernel_size=3, strides=2, padding="same")) # 4,4,128
model.add(BatchNormalization(momentum=0.8))
model.add(LeakyReLU(alpha=0.2))
model.add(Dropout(0.25))

model.add(Conv2D(256, kernel_size=3, strides=1, padding="same")) # 4,4,256
model.add(BatchNormalization(momentum=0.8))
model.add(LeakyReLU(alpha=0.2))
model.add(Dropout(0.25))

model.add(Flatten()) # 4*4*256
model.add(Dense(1, activation='sigmoid'))

model.summary()

img = Input(shape=(32,32,3)) # input: (32,32,3) image
validity = model(img) # output: real/fake probability

discriminator = Model(img, validity)

output

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_4 (Conv2D)            (None, 16, 16, 32)        896       
_________________________________________________________________
leaky_re_lu_1 (LeakyReLU)    (None, 16, 16, 32)        0         
_________________________________________________________________
dropout_1 (Dropout)          (None, 16, 16, 32)        0         
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 8, 8, 64)          18496     
_________________________________________________________________
batch_normalization_3 (Batch (None, 8, 8, 64)          256       
_________________________________________________________________
leaky_re_lu_2 (LeakyReLU)    (None, 8, 8, 64)          0         
_________________________________________________________________
dropout_2 (Dropout)          (None, 8, 8, 64)          0         
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 4, 4, 128)         73856     
_________________________________________________________________
batch_normalization_4 (Batch (None, 4, 4, 128)         512       
_________________________________________________________________
leaky_re_lu_3 (LeakyReLU)    (None, 4, 4, 128)         0         
_________________________________________________________________
dropout_3 (Dropout)          (None, 4, 4, 128)         0         
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 4, 4, 256)         295168    
_________________________________________________________________
batch_normalization_5 (Batch (None, 4, 4, 256)         1024      
_________________________________________________________________
leaky_re_lu_4 (LeakyReLU)    (None, 4, 4, 256)         0         
_________________________________________________________________
dropout_4 (Dropout)          (None, 4, 4, 256)         0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 4096)              0         
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 4097      
=================================================================
Total params: 394,305
Trainable params: 393,409
Non-trainable params: 896
_________________________________________________________________
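
Compared with the MNIST discriminator, the total grows by only a few hundred parameters, all of them in the first convolution, which now sees 3 input channels instead of 1. A quick check under the usual kernel-plus-bias accounting (`conv2d_params` is an illustrative helper, and the two totals are copied from the summaries):

```python
def conv2d_params(k, c_in, c_out):
    # one k x k kernel per (input channel, output channel) pair, plus biases
    return k * k * c_in * c_out + c_out

mnist_first = conv2d_params(3, 1, 32)  # conv2d_4 in the MNIST discriminator
cifar_first = conv2d_params(3, 3, 32)  # conv2d_4 in the CIFAR-10 discriminator
diff = cifar_first - mnist_first

mnist_total, cifar_total = 393_729, 394_305  # from the two summaries
print(mnist_first, cifar_first, diff, cifar_total - mnist_total)
```

Every later layer has the same shape in both models because the feature map reaches 4x4x256 either way.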

3.4 Compile the models and configure training

optimizer = Adam(0.0002, 0.5)

# discriminator
discriminator.compile(loss='binary_crossentropy',
                      optimizer=optimizer,
                      metrics=['accuracy'])


# The combined model  (stacked generator and discriminator)
z = Input(shape=(100,))
img = generator(z)
validity = discriminator(img)
# For the combined model we will only train the generator
discriminator.trainable = False

# Trains the generator to fool the discriminator
combined = Model(z, validity)
combined.summary()
combined.compile(loss='binary_crossentropy', 
                 optimizer=optimizer)

output

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_3 (InputLayer)         (None, 100)               0         
_________________________________________________________________
model_1 (Model)              (None, 32, 32, 3)         1051267   
_________________________________________________________________
model_2 (Model)              (None, 1)                 394305    
=================================================================
Total params: 1,445,572
Trainable params: 1,050,883
Non-trainable params: 394,689
_________________________________________________________________

3.5 Save the generated images

The plotting code needs a small change here: MNIST samples are grayscale, while CIFAR-10 samples are in color.

import os

def sample_images(epoch):
    r, c = 5, 5
    noise = np.random.normal(0, 1, (r * c, 100))
    gen_imgs = generator.predict(noise)

    # Rescale images from [-1, 1] (tanh output) to [0, 1]
    gen_imgs = (0.5 * gen_imgs + 0.5)

    fig, axs = plt.subplots(r, c)
    cnt = 0
    for i in range(r):
        for j in range(c):
            axs[i,j].imshow(gen_imgs[cnt, :,:,:])  # changed from MNIST: show all 3 color channels, no cmap
            axs[i,j].axis('off')
            cnt += 1
    os.makedirs("images", exist_ok=True)  # make sure the output directory exists
    fig.savefig("images/%d.png" % epoch)
    plt.close()

3.6 Training

Train for 50001 iterations and print the results every 500 iterations.

batch_size = 32
sample_interval = 500
# Load the dataset
(X_train, _), (_, _) = cifar10.load_data()  # (50000, 32, 32, 3)
# Rescale -1 to 1
X_train = X_train / 127.5 - 1.
# Adversarial ground truths
valid = np.ones((batch_size, 1))
fake = np.zeros((batch_size, 1))

for epoch in range(50001):
    # ---------------------
    #  Train Discriminator
    # ---------------------

    # Select a random batch of images
    idx = np.random.randint(0, X_train.shape[0], batch_size)  # random indices in [0, 50000)
    imgs = X_train[idx]
    noise = np.random.normal(0, 1, (batch_size, 100))  # standard Gaussian noise

    # Generate a batch of new images
    gen_imgs = generator.predict(noise)

    # Train the discriminator
    d_loss_real = discriminator.train_on_batch(imgs, valid)  # real data, label 1
    d_loss_fake = discriminator.train_on_batch(gen_imgs, fake)  # generated data, label 0
    d_loss = 0.5 * np.add(d_loss_real, d_loss_fake)

    # ---------------------
    #  Train Generator
    # ---------------------
    noise = np.random.normal(0, 1, (batch_size, 100))

    # Train the generator (to have the discriminator label samples as valid)
    g_loss = combined.train_on_batch(noise, valid)

    # Plot the progress
    if epoch % 500==0:
        print ("%d [D loss: %f, acc.: %.2f%%] [G loss: %f]" % (epoch, d_loss[0], 100*d_loss[1], g_loss))

    # If at save interval => save generated image samples
    if epoch % sample_interval == 0:
        sample_images(epoch)
        

output

0 [D loss: 1.069107, acc.: 43.75%] [G loss: 0.424518]
500 [D loss: 0.858720, acc.: 45.31%] [G loss: 1.105853]
1000 [D loss: 0.807555, acc.: 46.88%] [G loss: 0.961553]
……
49000 [D loss: 0.835431, acc.: 35.94%] [G loss: 0.684724]
49500 [D loss: 0.745767, acc.: 48.44%] [G loss: 0.881849]
50000 [D loss: 0.651077, acc.: 57.81%] [G loss: 0.958367]
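
Because the loop variable `epoch` actually counts iterations, it can be useful to convert it into approximate passes over the dataset. A rough sketch (`passes_over_data` is an illustrative name). Batches are drawn at random with `np.random.randint`, so these are equivalent sample counts rather than true epochs in which each image is seen exactly once.

```python
def passes_over_data(iterations, batch_size, dataset_size):
    # number of real images drawn, expressed in units of the dataset size
    return iterations * batch_size / dataset_size

mnist_passes = passes_over_data(4001, 32, 60000)
cifar_passes = passes_over_data(50001, 32, 50000)

print(round(mnist_passes, 2), round(cifar_passes, 2))
```

So the CIFAR-10 run draws roughly fifteen times as many real images, relative to dataset size, as the MNIST run.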

3.7 Results

[Figures: generated samples at iterations 0, 500, 1000, 49000, 49500, and 50000]
