[Study Notes] Overfitting in Convolutional Neural Networks

Latest recommended article published on 2024-03-14 21:39:27

Canon__



Article link: https://blog.csdn.net/Canon__/article/details/85023653


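This post looks at the overfitting problem encountered in a convolutional neural network and tries to reduce it by augmenting the dataset and applying dropout regularization. It explains how dropout works and why it is applied to the fully connected layers, and it shares the concrete data augmentation steps, including rotation and perspective transforms. The experiments show that dropout raises validation accuracy but does not fully remove the overfitting, so combining it with early stopping or a larger dataset is recommended.

Summary generated automatically by CSDN.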

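In the previous part we found that severe overfitting left our validation accuracy very low.

This time we will try to reduce the effect of overfitting by augmenting the dataset and by using dropout regularization.

Let's first take a look at the dropout function: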

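import tensorflow as tf
import numpy as np

# TensorFlow 1.x API: the second argument of tf.nn.dropout is keep_prob
x = np.array([1, 2, 3, 4, 5]).astype('float32')
sess = tf.Session()
# Each element is kept with probability 0.6, so the output varies from run to run
print(sess.run(tf.nn.dropout(x, keep_prob=0.6)))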


[0.        3.3333333 0.        6.6666665 8.333333 ]

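Looking at the output, it is easy to see that every element that survived has been divided by keep_prob (the keep probability): for example, 2 / 0.6 ≈ 3.33. The dropped elements are set to 0.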
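The scaling rule can be reproduced without TensorFlow. The following is a minimal NumPy sketch of "inverted" dropout, assuming the same keep_prob convention as above; `inverted_dropout` and its `training` flag are illustrative names, not part of the original code or of any library.

```python
import numpy as np


def inverted_dropout(x, keep_prob, rng, training=True):
    """Zero each element with probability 1 - keep_prob and rescale the rest."""
    if not training:
        # At inference time nothing is dropped and nothing is rescaled.
        return x
    # True marks an element that is kept.
    mask = rng.random(x.shape) < keep_prob
    # Survivors are divided by keep_prob so the expected value stays the same.
    return np.where(mask, x / keep_prob, 0.0).astype(x.dtype)


rng = np.random.default_rng(0)
x = np.array([1, 2, 3, 4, 5], dtype="float32")
y = inverted_dropout(x, keep_prob=0.6, rng=rng)
print(y)
```

Dividing by keep_prob during training is what lets the layer be used unchanged at inference time: the expected activation is the same in both modes.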

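Dropout is usually applied to the fully connected layers, mainly because those layers contain a large number of neurons. By randomly disabling a share of them at each training step, dropout effectively shrinks the network that is being trained at that step, which limits its capacity to memorize the training set.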
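To give a feel for how much of a CNN sits in its fully connected part, here is a small sketch that compares parameter counts, a measure related to (but not the same as) the number of neurons. The layer sizes are hypothetical and are not taken from the model in this series.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, out_channels):
    # One kernel of shape (kernel_h, kernel_w, in_channels) per output channel,
    # plus one bias per output channel.
    return kernel_h * kernel_w * in_channels * out_channels + out_channels


def dense_params(in_units, out_units):
    # One weight per (input, output) pair, plus one bias per output unit.
    return in_units * out_units + out_units


# Hypothetical sizes: a 3x3 convolution from 32 to 64 channels, followed by a
# dense layer that takes a flattened 7x7x64 feature map to 512 units.
conv = conv2d_params(3, 3, 32, 64)
dense = dense_params(7 * 7 * 64, 512)
print(conv, dense, dense // conv)
```

Under these assumed sizes the dense layer holds far more weights than the convolution, which is one reason dropout is typically placed there.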

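From another angle, the convolutional part is like our eyes and the fully connected layers are like our brain. A working brain does not activate every neuron at once, and dropout loosely imitates that behavior.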

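Last time, the overfitting was mainly caused by our dataset being too small, so we start by augmenting the dataset:

Following the previous CNN article, the original images have already been resized to 150*150 and the corresponding folders have been created, so I will not paste that earlier code again.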
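The augmentation script in this post passes four corner points to its perspective helper. As a rough illustration of what those points describe, the sketch below computes the corners of a centered box inset by a given fraction on every side; warping that box onto a 150*150 output amounts to a center crop plus zoom. `center_crop_corners` is a hypothetical helper, not part of the original code, and it assumes points are written in OpenCV's (x, y) order.

```python
def center_crop_corners(width, height, margin):
    """Corners (x, y) of a centered box inset by `margin` (a fraction) per side."""
    left, right = width * margin, width * (1 - margin)
    top, bottom = height * margin, height * (1 - margin)
    # Order: top-left, top-right, bottom-left, bottom-right.
    return [[left, top], [right, top], [left, bottom], [right, bottom]]


# A margin of 1/6 keeps the central two thirds of a 150x150 image,
# a margin of 1/4 keeps the central half.
corners_sixth = center_crop_corners(150, 150, 1 / 6)
corners_quarter = center_crop_corners(150, 150, 1 / 4)
print(corners_sixth)
print(corners_quarter)
```

The corner order matches the target points [[0, 0], [150, 0], [0, 150], [150, 150]], so each source corner lands on the corresponding corner of the output image.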

# coding=utf-8
import os
import cv2
import numpy as np

base_dir = './dataset/cats_and_dogs_filtered'
train_dir = os.path.join(base_dir, 'train')
train_cats_dir = os.path.join(train_dir, 'cats/')
train_dogs_dir = os.path.join(train_dir, 'dogs/')
train_cat_fnames = os.listdir(train_cats_dir)
train_dog_fnames = os.listdir(train_dogs_dir)
target_cats = os.path.join(train_dir, 'resize_cats/')
target_dogs = os.path.join(train_dir, 'resize_dogs/')


def tPerspectiveTransform(img, inputs_shape, transform_shape, outputs_shape):
    # Map the four source points (inputs_shape) onto the four target points
    # (transform_shape) and render the result at outputs_shape (width, height).
    pts1 = np.float32(inputs_shape)
    pts2 = np.float32(transform_shape)
    M = cv2.getPerspectiveTransform(pts1, pts2)
    dst = cv2.warpPerspective(img, M, outputs_shape)
    return dst


for i in train_cat_fnames:
    img = cv2.imread(train_cats_dir + i, 1)
    if img is None:  # skip files that OpenCV cannot decode
        continue
    rows, cols = img.shape[0:2]
    # OpenCV points are (x, y) = (column, row), so cols comes first.
    # dst_1 zooms in on the central 2/3 of the image, dst_2 on the central 1/2.
    dst_1 = tPerspectiveTransform(img, [[cols/6, rows/6], [cols*5/6, rows/6],
                                        [cols/6, rows*5/6], [cols*5/6, rows*5/6]],
                                  [[0, 0], [150, 0], [0, 150], [150, 150]], (150, 150))
    dst_2 = tPerspectiveTransform(img, [[cols/4, rows/4], [cols*3/4, rows/4],
                                        [cols/4, rows*3/4], [cols*3/4, rows*3/4]],
                                  [[0, 0], [150, 0], [0, 150], [150, 150]], (150, 150))
    cv2.imwrite(target_cats + 'pts_1_1' + i, dst_1)
    cv2.imwrite(target_cats + 'pts_1_2' + i, dst_2)
    # Rotate by 90 and 270 degrees around the image center, then resize to 150x150.
    # For non-square images the fixed (cols, rows) canvas crops the rotated content.
    M_1 = cv2.getRotationMatrix2D((cols/2, rows/2), 90, 1)
    M_2 = cv2.getRotationMatrix2D((cols/2, rows/2), 270, 1)
    rotation_1 = cv2.warpAffine(img, M_1, (cols, rows))
    resize_ro1 = cv2.resize(rotation_1, dsize=(150, 150))
    rotation_2 = cv2.warpAffine(img, M_2, (cols, rows))
    resize_ro2 = cv2.resize(rotation_2, dsize=(150, 150))
    cv2.imwrite(target_cats + 'rotation_1' + i, resize_ro1)
    cv2.imwrite(target_cats + 'rotation_2' + i, resize_ro2)


for i in