Image and Signal Denoising Autoencoders with TensorFlow 2
1. Introduction
A denoising autoencoder is an unsupervised feedforward neural network used to remove noise from data. An autoencoder generally consists of two parts: an encoder and a decoder. The encoder maps the input vector to a latent representation, typically through a nonlinear affine transformation. The decoder maps the latent variables produced by the encoder back to a feature vector in the input space; this is again an affine transformation, optionally followed by a nonlinearity. The goal of these transformations is to obtain a re-representation of the original input, such that the decoder's output reconstructs the input features with high probability. A denoising autoencoder, then, is a feature extractor with built-in denoising: it is trained to turn a noisy input into a clean output [1].
2. Image Denoising Autoencoder: MNIST Test
2.1 Model training with TensorFlow 2
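The encoder/decoder structure described above can be sketched in a few lines of NumPy. This is only a toy illustration of the two affine transforms (the weights here are random, not trained; the dimensions are made up for the example):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy dimensions: 8-dimensional input, 3-dimensional latent code.
d_in, d_hidden = 8, 3
W_enc = rng.normal(size=(d_hidden, d_in))  # encoder weights
b_enc = np.zeros(d_hidden)                 # encoder bias
W_dec = rng.normal(size=(d_in, d_hidden))  # decoder weights
b_dec = np.zeros(d_in)                     # decoder bias

def encode(x):
    # Nonlinear affine transform: h = s(Wx + b)
    return sigmoid(W_enc @ x + b_enc)

def decode(h):
    # Affine map back to the input space, followed by a nonlinearity
    return sigmoid(W_dec @ h + b_dec)

x_clean = rng.uniform(size=d_in)
x_noisy = x_clean + 0.1 * rng.normal(size=d_in)  # corrupted input
x_recon = decode(encode(x_noisy))
# Training would minimize a reconstruction loss against the CLEAN input,
# e.g. np.mean((x_recon - x_clean) ** 2), averaged over many samples.
```

The key point for the denoising variant is in the last comment: the loss compares the reconstruction with the clean target, not with the noisy input that was fed in.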
- Model training
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from tensorflow.keras.models import Model
from tensorflow.keras.callbacks import TensorBoard
from tensorflow.keras.datasets import mnist
import numpy as np
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = np.reshape(x_train, (len(x_train), 28, 28, 1)) # adapt this if using `channels_first` image data format
x_test = np.reshape(x_test, (len(x_test), 28, 28, 1)) # adapt this if using `channels_first` image data format
noise_factor = 0.5
x_train_noisy = x_train + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=x_train.shape)
x_test_noisy = x_test + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=x_test.shape)
x_train_noisy = np.clip(x_train_noisy, 0., 1.)
x_test_noisy = np.clip(x_test_noisy, 0., 1.)
def train_model():
    input_img = Input(shape=(28, 28, 1))  # adapt this if using `channels_first` image data format
    x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
    x = MaxPooling2D((2, 2), padding='same')(x)
    x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
    x = MaxPooling2D((2, 2), padding='same')(x)
    x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
    encoded = MaxPooling2D((2, 2), padding='same', name='encoder')(x)
    # at this point the representation is (4, 4, 8) i.e. 128-dimensional
    x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
    x = UpSampling2D((2, 2))(x)
    x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
    x = UpSampling2D((2, 2))(x)
    x = Conv2D(16, (3, 3), activation='relu')(x)  # no padding: 16x16 -> 14x14, so the final upsampling yields 28x28
    x = UpSampling2D((2, 2))(x)
    decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
    autoencoder = Model(input_img, decoded)
    autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
    autoencoder.fit(x_train_noisy, x_train,
                    epochs=20,
                    batch_size=128,
                    shuffle=True,
                    validation_data=(x_test_noisy, x_test),
                    callbacks=[TensorBoard(log_dir='/tmp/tb', histogram_freq=0, write_graph=False)])
    autoencoder.save('autoencoder.h5')
train_model()
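The "(4, 4, 8) i.e. 128-dimensional" comment in the code can be checked with a quick shape walk-through. Each `MaxPooling2D((2, 2), padding='same')` halves each spatial dimension, rounding up (a sketch of the arithmetic only, no model needed):

```python
import math

# Three 'same'-padded 2x2 poolings on a 28x28 input:
# 28 -> 14 -> 7 -> ceil(7/2) = 4
size = 28
for _ in range(3):
    size = math.ceil(size / 2)
print(size)             # 4
print(size * size * 8)  # 128: 4x4 spatial grid with 8 channels
```

The odd size 7 is why `padding='same'` matters in the third pooling layer: 'valid' pooling would give 3 instead of 4 and the decoder's upsampling path would no longer reach 28x28.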
Training log (this was run on a thin-and-light laptop, so training was not run to completion):
Epoch 1/20
469/469 [==============================] - 114s 241ms/step - loss: 0.6914 - val_loss: 0.6907
Epoch 2/20
469/469 [==============================] - 66s 140ms/step - loss: 0.6896 - val_loss: 0.6883
Epoch 3/20
The remaining tests were run later on a server; compared with CPU, training on GPU is considerably faster.
2.2 Model testing and visualization with TensorFlow 2
If OpenCV is not installed yet, install it with:
pip install opencv-python
The test code is as follows:
import numpy as np
from tensorflow.keras.models import Model
from tensorflow.keras.datasets import mnist
import cv2
from tensorflow.keras.models import load_model
import time
print('Loading mnist dataset')
t0 = time.time()
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = np.reshape(x_train, (len(x_train), 28, 28, 1)) # adapt this if using `channels_first` image data format
x_test = np.reshape(x_test, (len(x_test), 28, 28, 1)) # adapt this if using `channels_first` image data format
noise_factor = 0.5
x_train_noisy = x_train + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=x_train.shape)
x_test_noisy = x_test + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=x_test.shape)
x_train_noisy = np.clip(x_train_noisy, 0., 1.)
x_test_noisy = np.clip(x_test_noisy, 0., 1.)
t1 = time.time()
print('mnist dataset loaded in: ', t1-t0)
print('Loading model :')
t0 = time.time()
# Load previously trained autoencoder
autoencoder = load_model(r'...\TF2_autoencoder\autoencoder_epoch_10_loss_0.4408784508705139_val_loss_0.4405524432659149.h5')
t1 = time.time()
print('Model loaded in: ', t1-t0)
def plot_denoised_images():
    denoised_images = autoencoder.predict(x_test_noisy)  # already shaped (n, 28, 28, 1)
    test_img = x_test_noisy[0]
    resized_test_img = cv2.resize(test_img, (280, 280))
    cv2.imshow('input', resized_test_img)
    cv2.waitKey(0)
    output = denoised_images[0]
    resized_output = cv2.resize(output, (280, 280))
    cv2.imshow('output', resized_output)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
plot_denoised_images()
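The side-by-side windows above are only a visual check; a quantitative score makes the epoch-by-epoch comparison below less subjective. As a sketch (plain NumPy; in the script above, `clean` and `denoised` would be `x_test` and `denoised_images` — the toy arrays here merely stand in for them):

```python
import numpy as np

def psnr(clean, denoised, data_range=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, 1]."""
    mse = np.mean((clean - denoised) ** 2)
    if mse == 0:
        return float('inf')
    return 10.0 * np.log10(data_range ** 2 / mse)

# Toy stand-ins for x_test and the autoencoder output.
rng = np.random.default_rng(0)
clean = rng.uniform(size=(5, 28, 28, 1)).astype('float32')
denoised = np.clip(clean + 0.05 * rng.normal(size=clean.shape), 0., 1.)

print('MSE :', np.mean((clean - denoised) ** 2))
print('PSNR:', psnr(clean, denoised), 'dB')
```

Higher PSNR means a closer reconstruction; tracking it per epoch gives a number to put next to the visual impressions recorded below.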
Results at increasing training lengths:
- 10 epochs: still not good
- 20 epochs: seems slightly better
- 30 epochs
- 50 epochs: apparently still not enough
- 100 epochs
Overall the results on images are mediocre and convergence is slow. A likely factor: in TF2 the string 'adadelta' builds an Adadelta optimizer with a default learning rate of 0.001, whereas classic standalone Keras defaulted to 1.0, so passing the optimizer object with an explicit learning rate (or switching to 'adam') should converge much faster. Since the original article is quite old, no further tests were run.
References
[1] https://www.jiqizhixin.com/graph/technologies/b9c4f5ac-15b2-42aa-a261-75158a8a8be7
[2] Vincent P, Larochelle H, Lajoie I, et al. Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion[J]. Journal of Machine Learning Research, 2010, 11(Dec): 3371-3408.
[3] https://blog.csdn.net/Dr_maker/article/details/121418914
[4] https://zhuanlan.zhihu.com/p/133207206
[5] https://github.com/bojone/vae