pix2pix GAN
There are times when we want to transform an image into another style. Let's say we have a fine collection of sketches, and our daily work is to color these black-and-white images.
Coloring can be fun when there are only a few sketches, but when it comes to hundreds of sketches a day, hmmm… maybe we need some help. This is where a GAN comes to the rescue. A Generative Adversarial Network, or GAN, is a machine learning framework that aims to generate new data with the same distribution as the training dataset. In this article, we will build a pix2pix GAN that takes an image as input and outputs another image.
To break things down, we will go through these steps:
- Prepare our data
- Build the network
- Train the network
- Test and see the results
Prepare our data
In image transformation, we need an original image and its expected transformed result. It is recommended to have thousands of these before-and-after pairs. (Yes, a GAN needs a lot of images 😅) In this post, we will use data from this Kaggle dataset.
Each image pair can be saved as a single merged image, like those in our dataset. The pairs can also be kept in two separate folders. Just make sure the order matches later when we process them 😉
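For the two-folder layout, one way to keep the order consistent is to sort the filenames before pairing them, since `os.listdir` returns entries in arbitrary order. The following is a minimal sketch, not part of the original workflow: the helper name `paired_filenames` is hypothetical, and it assumes each sketch and its colored version share the same filename in their respective folders.

```python
import os


def paired_filenames(src_dir, tar_dir):
    """Return (source_path, target_path) pairs matched by sorted filename.

    Assumption: both folders contain files with identical names,
    e.g. sketches/001.png and colored/001.png.
    """
    # os.listdir gives no ordering guarantee, so sort both listings
    src_names = sorted(os.listdir(src_dir))
    tar_names = sorted(os.listdir(tar_dir))
    # fail early if the folders do not line up one-to-one
    if src_names != tar_names:
        raise ValueError("the two folders do not contain the same filenames")
    return [
        (os.path.join(src_dir, name), os.path.join(tar_dir, name))
        for name in src_names
    ]
```

Each returned pair could then be loaded with `load_img` in the same way as the merged images, without the splitting step.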
Since each image pair in our dataset is merged into a single image, we first need to split it into a sketch image and a colored picture:
from os import listdir
from numpy import asarray
from numpy import vstack
from keras.preprocessing.image import img_to_array
from keras.preprocessing.image import load_img
from numpy import savez_compressed

def load_images(path, size=(256,512)):
    src_list = list()
    tar_list = list()
    for filename in listdir(path):
        # load and resize the image; size is (height, width)
        pixels = load_img(path + filename, target_size=size) # images are in PIL format
        # convert to numpy array
        pixels = img_to_array(pixels)
        # split into colored and sketch. 256 comes from 512/2. The first part is colored while the rest is sketch
        color_img, bw_img = pixels[:, :256], pixels[:, 256:]
        src_list.append(bw_img)
        tar_list.append(color_img)
    return [asarray(src_list), asarray(tar_list)]
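To see what the slicing step does without touching the dataset, here is a small sketch on a synthetic array. The array `merged` is a made-up stand-in for one loaded image of height 256 and width 512; the slice `[:, :256]` keeps every row and takes the left half of the columns, while `[:, 256:]` takes the right half.

```python
import numpy as np

# Synthetic stand-in for one merged image: (height, width, channels)
merged = np.zeros((256, 512, 3), dtype="float32")
merged[:, :256] = 1.0  # left half plays the role of the colored picture
merged[:, 256:] = 0.0  # right half plays the role of the sketch

# Same slicing as in load_images: split along the width axis
color_img, bw_img = merged[:, :256], merged[:, 256:]

print(color_img.shape, bw_img.shape)  # (256, 256, 3) (256, 256, 3)
```

Both halves come out as 256×256 images with three channels, which is the shape each entry of `src_list` and `tar_list` has.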
With our splitting function in place, we can process the training dataset with the following code:
path = "data/train/"
# load dataset
[src_images, tar_images] = load_images(path)
print('Loaded: ', src_images.shape, tar_images.shape)
# save as compressed numpy array
filename = 'gan_img_train.npz'
savez_compressed(filename, src_images, ta