Convolutional Neural Network (CNN): A PyTorch Implementation

JingleLee123

Article link: https://blog.csdn.net/qq_38195197/article/details/104146887

This post reflects only my own understanding after a few days of study. If anything is wrong, corrections are welcome.

(1) A brief look at the theory

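In my view, a convolutional neural network processes an image as follows:
① Read the image into the program to obtain the value of every color channel at every pixel. This post uses the MNIST dataset, whose images are 28 $\times$ 28 with a single color channel. The image data is then converted into tensors of a uniform size.
② Convolve the image with several convolution kernels. This requires choosing the padding (set so that the image size is unchanged after the convolution), the number of filters, the stride (the step by which the sliding window moves), the number of input channels, and the number of output channels.
③ Apply a nonlinear activation function (usually ReLU) to obtain an initial set of features.
④ Use a pooling layer to keep the dominant features. Pooling usually takes the maximum or the average.
⑤ Use a fully connected layer to combine the features from all parts of the image.
⑥ Compute the loss. To prevent overfitting, a regularization penalty is added to the loss function. The parameters are then adjusted through backpropagation. Backpropagation computes partial derivatives, that is, how strongly each parameter influences the final result: if a parameter's partial derivative is positive and the result is too large, that parameter should be made smaller.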
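
To make the last step concrete, here is a minimal sketch of one gradient-descent update on a toy one-parameter model with an L2 penalty. The model `w * x`, the numbers, and the names `sgd_step`, `lr`, and `weight_decay` are illustrative assumptions, not part of the code in section (3), which uses the Adam optimizer and no explicit penalty.

```python
def sgd_step(w, x, y, lr=0.1, weight_decay=0.01):
    """One gradient-descent update for loss = (w*x - y)**2 + weight_decay * w**2."""
    pred = w * x
    # d(loss)/dw: data term via the chain rule, plus the L2 penalty term
    grad = 2 * (pred - y) * x + 2 * weight_decay * w
    # Move against the gradient: a positive gradient makes w smaller
    return w - lr * grad, grad


w = 2.0  # prediction w*x = 2.0 is larger than the target y = 1.0
new_w, grad = sgd_step(w, x=1.0, y=1.0)
print(grad > 0, new_w < w)  # prints: True True
```

The prediction is too large and the gradient is positive, so the update lowers `w`, which is the behaviour described in step ⑥.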

(2) The MNIST dataset

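Downloading the dataset directly from the code may fail because of a slow connection or similar problems. In that case, download the files manually first and then follow the method described here: https://blog.csdn.net/AugustMe/article/details/90638342
Alternatively, use the data I have already downloaded and place it in the same folder as the code.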
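
If you download the files by hand, a quick sanity check is to read the header of the image file. The sketch below assumes the standard MNIST IDX layout (a big-endian magic number 2051 followed by the image count, rows, and columns) and that the file has already been decompressed. The function name is my own, and the path in the commented usage is hypothetical, since the raw-file location depends on the torchvision version.

```python
import struct


def read_idx_image_header(data: bytes):
    """Parse the 16-byte header of an MNIST IDX image file."""
    magic, count, rows, cols = struct.unpack('>IIII', data[:16])
    if magic != 2051:
        raise ValueError('not an IDX image file (magic=%d)' % magic)
    return count, rows, cols


# Synthetic header standing in for the real training-image file.
sample_header = struct.pack('>IIII', 2051, 60000, 28, 28)
print(read_idx_image_header(sample_header))  # prints: (60000, 28, 28)

# With a real file (hypothetical path):
# with open('./mnist/MNIST/raw/train-images-idx3-ubyte', 'rb') as f:
#     print(read_idx_image_header(f.read(16)))
```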

(3) Code implementation

import os    # used to check whether the MNIST directory exists
import torch
import torch.nn as nn
import torchvision.datasets as dsets
import torchvision.transforms as transforms
from torch.autograd import Variable  # deprecated since PyTorch 0.4: Variable(x) simply returns a tensor
from PIL import Image
import matplotlib.pyplot as plt

# Hyperparameters
num_epochs = 5
batch_size = 100
learning_rate = 0.001

DOWNLOAD_MNIST = False
if not(os.path.exists('./mnist/')) or not os.listdir('./mnist/'):
    # no mnist dir, or the mnist dir is empty
    DOWNLOAD_MNIST = True
    
#MNIST Dataset
train_dataset =  dsets.MNIST(root='./mnist/', train=True, transform=transforms.ToTensor(), download=DOWNLOAD_MNIST)
test_dataset = dsets.MNIST(root='./mnist/', train=False, transform=transforms.ToTensor())

#Data Loader(Input Pipeline)
train_loader = torch.utils.data.DataLoader(dataset=train_dataset, batch_size=batch_size, shuffle=True)
test_loader = torch.utils.data.DataLoader(dataset=test_dataset, batch_size=1, shuffle=False)

#CNN Model(2 conv layer)
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.layer1 = nn.Sequential(
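            nn.Conv2d(1, 16, kernel_size=5, padding=2),   # with stride=1, k = 2p + 1 keeps the output size equal to the input size
            nn.BatchNorm2d(16),             # normalizes the 16 output channels (helps against vanishing or exploding gradients); the argument is the conv layer's number of output channels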
            nn.ReLU(),
            nn.MaxPool2d(2))
        self.layer2 = nn.Sequential(
            nn.Conv2d(16, 32, kernel_size=5, padding=2),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.MaxPool2d(2))
        self.fc = nn.Linear(7*7*32, 10)
        
    def forward(self, x):
        out = self.layer1(x)
        out = self.layer2(out)
        out = out.view(out.size(0), -1)
        out = self.fc(out)
        return out
cnn = CNN()
cnn.cuda()

#Loss and Optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(cnn.parameters(), lr=learning_rate)

# Train the Model
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        images = Variable(images).cuda()
        labels = Variable(labels).cuda()
        
        #Forward + Backward + Optimize
        optimizer.zero_grad()    # clear the gradients from the previous step
        outputs = cnn(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        
        if (i+1) % 100 == 0:
            print('Epoch [%d/%d], Iter[%d/%d] Loss: %.4f' %(epoch+1, num_epochs, i+1, len(train_dataset)//batch_size, loss.item()))
        
# Test the Model on the MNIST test set
cnn.eval()  # Change model to 'eval' mode (BN uses moving mean/var).
correct = 0
total = 0
with torch.no_grad():  # gradients are not needed for evaluation
    for images, labels in test_loader:
        images = Variable(images).cuda()
        outputs = cnn(images)
        _, predicted = torch.max(outputs.data, 1)  # max along dim 1; returns (values, indices), and the index is the predicted class
        total += labels.size(0)         # labels.size(0) is the batch size (1 for test_loader)
        correct += (predicted.cpu() == labels).sum().item()  # count the correct predictions in this batch and add them to correct

print('Test accuracy of the model on the 10000 test images: %d %%' %(100 * correct/total))

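# Test the model on a handwritten image of my own
img_path="66.png"
# Load with PIL
img = Image.open(img_path).convert('L') # read the image and convert it to grayscale
transform1 = transforms.Compose([
    transforms.Resize(28),   # replaces transforms.Scale, which was deprecated and later removed from torchvision
    transforms.CenterCrop((28, 28)),
    transforms.ToTensor(), # range [0, 255] -> [0.0,1.0]
    ]
)
img2 = transform1(img) # tensor scaled to [0.0,1.0]
img2 = img2.unsqueeze(0) # add a batch dimension; the shape becomes [1,C,H,W]
image = Variable(img2).cuda()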

# print(image.shape)
# mode = transforms.ToPILImage()(img2)
# plt.imshow(mode)
# plt.show()

# Classify the image
outputs = cnn(image)
print(outputs.data)
_, predicted = torch.max(outputs.data, 1)
print("Predicted %d" % predicted.item())
      
# Save the Trained Model
torch.save(cnn.state_dict(), 'cnn.pkl')
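
The `7*7*32` input size of the fully connected layer follows from the layer settings in the model above. Here is a small sketch of the arithmetic using the usual output-size formula for convolution and pooling. The helper names are my own, and the pooling helper assumes `MaxPool2d(2)` uses its default stride, which equals the kernel size.

```python
def conv_out(size, kernel, padding=0, stride=1):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel):
    """Output size of pooling whose stride equals its kernel size."""
    return (size - kernel) // kernel + 1


size = 28  # MNIST images are 28 x 28
for _ in range(2):  # layer1 and layer2 use the same conv/pool settings
    size = conv_out(size, kernel=5, padding=2)  # size is unchanged by the conv
    size = pool_out(size, kernel=2)             # size is halved by the pooling
print(size, size * size * 32)  # prints: 7 1568
```

Each block keeps the size through the convolution and halves it through the pooling, so 28 becomes 14 and then 7. With 32 output channels this gives the 7*7*32 = 1568 features passed to `nn.Linear`.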

Testing my own image

[Image: the handwritten test image]

Output

Epoch [1/5], Iter[100/600] Loss: 0.1901
Epoch [1/5], Iter[200/600] Loss: 0.1366
Epoch [1/5], Iter[300/600] Loss: 0.0650
Epoch [1/5], Iter[400/600] Loss: 0.0861
Epoch [1/5], Iter[500/600] Loss: 0.1619
Epoch [1/5], Iter[600/600] Loss: 0.1493
Epoch [2/5], Iter[100/600] Loss: 0.0475
Epoch [2/5], Iter[200/600] Loss: 0.1708
Epoch [2/5], Iter[300/600] Loss: 0.0233
Epoch [2/5], Iter[400/600] Loss: 0.0138
Epoch [2/5], Iter[500/600] Loss: 0.1100
Epoch [2/5], Iter[600/600] Loss: 0.0075
Epoch [3/5], Iter[100/600] Loss: 0.0124
Epoch [3/5], Iter[200/600] Loss: 0.0068
Epoch [3/5], Iter[300/600] Loss: 0.0298
Epoch [3/5], Iter[400/600] Loss: 0.0111
Epoch [3/5], Iter[500/600] Loss: 0.0751
Epoch [3/5], Iter[600/600] Loss: 0.0155
Epoch [4/5], Iter[100/600] Loss: 0.0971
Epoch [4/5], Iter[200/600] Loss: 0.0173
Epoch [4/5], Iter[300/600] Loss: 0.0099
Epoch [4/5], Iter[400/600] Loss: 0.0049
Epoch [4/5], Iter[500/600] Loss: 0.0199
Epoch [4/5], Iter[600/600] Loss: 0.0042
Epoch [5/5], Iter[100/600] Loss: 0.0482
Epoch [5/5], Iter[200/600] Loss: 0.0423
Epoch [5/5], Iter[300/600] Loss: 0.0208
Epoch [5/5], Iter[400/600] Loss: 0.0360
Epoch [5/5], Iter[500/600] Loss: 0.0153
Epoch [5/5], Iter[600/600] Loss: 0.0326
Test accuracy of the model on the 10000 test images: 98 %
tensor([[-0.1304, -0.4410, -0.8985, -0.6545, -1.9931,  0.2359,  1.9494, -2.1930,
         -1.5755, -1.3205]], device='cuda:0')
Predicted 6
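
The tensor printed above holds raw scores (logits), one per digit class. As a sketch, they can be turned into probabilities with a softmax. The values are copied from the output above, and the function name is my own.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


logits = [-0.1304, -0.4410, -0.8985, -0.6545, -1.9931,
          0.2359, 1.9494, -2.1930, -1.5755, -1.3205]
probs = softmax(logits)
# Index of the largest probability, i.e. the predicted digit
predicted = max(range(len(probs)), key=probs.__getitem__)
print(predicted, round(probs[predicted], 2))
```

Index 6 has the largest probability, which matches the `Predicted 6` line.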
