去噪自编码器 (Denoising Autoencoders, DAE)

最新推荐文章于 2025-02-24 10:15:44 发布

酒酿小圆子～

最新推荐文章于 2025-02-24 10:15:44 发布

阅读量2.9k

点赞数 9

分类专栏：机器学习 & 深度学习文章标签： php 开发语言

本文链接：https://blog.csdn.net/u012856866/article/details/143859002

版权

机器学习 & 深度学习专栏收录该内容

93 篇文章

订阅专栏

一、DAE原理

1.1 基本原理

去噪自编码器（Denoising Autoencoder, DAE）是一种自编码器（Autoencoder）的变体，旨在从被污染的输入中学习如何恢复原始输入。这种网络的主要目标是学习输入数据的更高层次特征，而不是依赖于细节。

1.2 主要特征

输入带噪声：DAE 在训练过程中接收的输入数据通常包含噪声，例如对图像的随机扰动或降噪处理。这种噪声可以是随机的（如高斯噪声）或特定模式的（如图像模糊）。
输出与原始输入对比：DAE 的输出与无噪声的原始输入进行对比，以计算损失。这使得网络在训练时专注于恢复整体结构和模式，而不是细节。
特征学习：通过这种方式，DAE 鼓励网络学习更具代表性的特征，增强了模型对噪声的鲁棒性。

1.3 应用领域

图像去噪：可以用于去除图像噪声，提高图像质量。
异常检测：在数据噪声较大的情况下，可以帮助识别异常点。
数据增强：通过生成去噪后的样本，帮助提升模型的泛化能力。

二、DAE在图像去噪中的应用

2.1 图像去噪应用

图像去噪场景示例：
当我们在夜晚拍照时，或者其他黑暗环境时，我们的照片总是被大量的噪点所充斥，严重影响了图像质量，而 DAE 的目的就是用来去除这些图像中的噪声。

为了更好的讲解 DAE，使用简单的 MNIST 数据集进行演示。如下图所示，显示了三组 MNIST 数字：

每组的顶行是原始图像 (Original Images)；
中间的行显示 DAE 的输入 (Noised Images)，这些输入是被噪声破坏的原始图像，当噪声过多时，我们将很难读懂被破坏的数字；
最后一行显示DAE的输出 (Denoised Images)。

在这里插入图片描述

接下来就让我们实际构建一个 DAE，以消除图像中的噪声。
在这里插入图片描述

根据 DAE 的介绍，可以将输入定义为：
其中 $x_{orig}$ 表示被噪声 noise 破坏的原始MNIST图像，编码器的目的是学习潜矢量 $z$ 。
DAE的损失函数表示为：
其中，m是输出的维度，例如在MNIST数据集中， $m = w i d t h * h e i g h t * c hann e l s = 28 * 28 * 1 = 784$ 。 $x_{orig_{i}}$ 和 $x_{i}$ 分别是 $x_{orig}$ 和 $x$ 。

2.2 代码实现

为了实现DAE，首先需要构造训练数据集，输入数据是添加噪声的 MNIST 数字，训练输出数据是原始的干净 MNIST 数字。添加的噪声需要满足高斯分布，均值 $u = 0.5$ ，标准差 $s i g ma = 0.5$ 。由于添加随机噪声可能会产生小于0或大于1的无效像素值，因此需要将像素值裁剪为[0.0，1.0]范围内。

import numpy as np
import tensorflow as tf
from tensorflow import keras
from matplotlib import pyplot as plt
from PIL import Image

# 数据加载
(x_train,_),(x_test,_) = keras.datasets.mnist.load_data()

# 数据预处理
image_size = x_train.shape[1]
x_train = np.reshape(x_train,[-1,image_size,image_size,1])
x_test = np.reshape(x_test,[-1,image_size,image_size,1])
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.

# 产生高斯分布的噪声
noise = np.random.normal(loc=0.5,scale=0.5,size=x_train.shape)
x_train_noisy = x_train + noise
noise = np.random.normal(loc=0.5,scale=0.5,size=x_test.shape)
x_test_noisy = x_test + noise

# 将像素值裁剪为[0.0，1.0]范围内
x_train_noisy = np.clip(x_train_noisy,0.0,1.0)
x_test_noisy = np.clip(x_test_noisy,0.0,1.0)

模型构建与模型训练：

# 超参数
input_shape = (image_size,image_size,1)
batch_size = 32
kernel_size = 3
latent_dim = 16
layer_filters = [32,64]

"""
模型
"""
#编码器
inputs = keras.layers.Input(shape=input_shape,name='encoder_input')
x = inputs
for filters in layer_filters:
    x = keras.layers.Conv2D(filters=filters,
            kernel_size=kernel_size,
            strides=2,
            activation='relu',
            padding='same')(x)
shape = keras.backend.int_shape(x)

x = keras.layers.Flatten()(x)
latent = keras.layers.Dense(latent_dim,name='latent_vector')(x)
encoder = keras.Model(inputs,latent,name='encoder')
encoder.summary()

# 解码器
latent_inputs = keras.layers.Input(shape=(latent_dim,),name='decoder_input')
x = keras.layers.Dense(shape[1]*shape[2]*shape[3])(latent_inputs)
x = keras.layers.Reshape((shape[1],shape[2],shape[3]))(x)
for filters in layer_filters[::-1]:
    x = keras.layers.Conv2DTranspose(filters=filters,
            kernel_size=kernel_size,
            strides=2,
            padding='same',
            activation='relu')(x)
outputs = keras.layers.Conv2DTranspose(filters=1,
        kernel_size=kernel_size,
        padding='same',
        activation='sigmoid',
        name='decoder_output')(x)
decoder = keras.Model(latent_inputs,outputs,name='decoder')
decoder.summary

autoencoder = keras.Model(inputs,decoder(encoder(inputs)),name='autoencoder')
autoencoder.summary()

# 模型编译与训练
autoencoder.compile(loss='mse',optimizer='adam')
autoencoder.fit(x_train_noisy,
        x_train,validation_data=(x_test_noisy,x_test),
        epochs=10,
        batch_size=batch_size)

# 模型测试
x_decoded = autoencoder.predict(x_test_noisy)

rows,cols = 3,9
num = rows * cols
imgs = np.concatenate([x_test[:num],x_test_noisy[:num],x_decoded[:num]])
imgs = imgs.reshape((rows * 3, cols, image_size, image_size))
imgs = np.vstack(np.split(imgs,rows,axis=1))
imgs = imgs.reshape((rows * 3,-1,image_size,image_size))
imgs = np.vstack([np.hstack(i) for i in imgs])
imgs = (imgs * 255).astype(np.uint8)
plt.figure()
plt.axis('off')
plt.imshow(imgs,interpolation='none',cmap='gray')
plt.show()