Machine Learning Crash Course (2) - CNN (the basics)

Sauvignon.

Last modified 2023-11-16 21:32:53

First published 2023-11-16 21:22:51

Article link: https://blog.csdn.net/weixin_47134520/article/details/134436277

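# CNN vision tasks: detection, classification and retrieval, super-resolution reconstruction, medical detection (sign recognition), autonomous driving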

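# Roles of the components of a convolutional network:

784 ~ 28 x 28 x 1: a flattened 784-pixel vector is kept as a 28 x 28 image with 1 channel

* Input layer - convolutional layer - pooling layer - fully connected layer - softmax nonlinear activation

* Convolution: feature extraction - produces a feature map (the responses over color channels 1, 2, 3 are summed) - training finds the best weights W_ij

* Pooling: key-feature extraction - produces a downsampled map (take the maximum element of each region) - keeps the strongest responses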
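
The two operations above can be sketched in plain Python. This is only an illustration under stated assumptions: the helper names `conv2d_single_filter` and `max_pool2d` are made up for this example, the input is a toy 2-channel 5 x 5 image, and the "convolution" is the sliding dot product (cross-correlation) that deep-learning libraries usually compute, with no padding and stride 1.

```python
def conv2d_single_filter(image, kernel):
    """Apply one filter to a multi-channel image (no padding, stride 1).

    image:  list of channels, each a 2D list (H x W)
    kernel: list of channels, each a 2D list (k x k)
    The per-channel responses are summed into one 2D feature map.
    """
    height, width = len(image[0]), len(image[0][0])
    k = len(kernel[0])
    out = []
    for i in range(height - k + 1):
        row = []
        for j in range(width - k + 1):
            total = 0.0
            for c in range(len(image)):  # sum over input channels
                for di in range(k):
                    for dj in range(k):
                        total += image[c][i + di][j + dj] * kernel[c][di][dj]
            row.append(total)
        out.append(row)
    return out


def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling: keep the largest value in each size x size block."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            block = [feature_map[i + di][j + dj]
                     for di in range(size) for dj in range(size)]
            row.append(max(block))
        pooled.append(row)
    return pooled


# Toy input: 2 channels of 5 x 5, and a 2-channel 2 x 2 kernel of ones
image = [[[float(r * 5 + c) for c in range(5)] for r in range(5)] for _ in range(2)]
kernel = [[[1.0, 1.0], [1.0, 1.0]] for _ in range(2)]

feature_map = conv2d_single_filter(image, kernel)  # 4 x 4 feature map
pooled = max_pool2d(feature_map, size=2)           # 2 x 2 downsampled map
print(len(feature_map), len(feature_map[0]))       # 4 4
print(pooled)                                      # [[72.0, 88.0], [152.0, 168.0]]
```

In a real network the kernel values are the trained weights W_ij; the loops here are written for readability, not speed.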

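# Feature maps:

28 x 28 x 1 ~ 28 x 28 x N: applying N filters, each with the same kernel size, stacks N feature maps (the spatial size stays 28 x 28 only with suitable padding)

* Repeated convolutions ~ extract information step by step at different granularities

* height, width, (channels, features) ~ the channel count equals the number of filters applied in the previous layer

* We write the shape of each layer as (in_channels x height x width)

* Stride (convolution): a small stride gives more sliding positions, a richer feature map, and finer granularity

* Stride for images: 1; stride for text: one word

* Kernel size = height x width: 3x3, 5x5

* Edge padding: pad with zeros ~ makes full use of the information at the borders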

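* Output size: // C_h, C_w: kernel height and width, P: zero-padding width, Step: stride

H_next = floor((H_current - C_h + 2P) / Step) + 1

W_next = floor((W_current - C_w + 2P) / Step) + 1

P ~ either 0 (no padding) or P = (C_edge - 1) / 2, which preserves the size when Step = 1 and the kernel edge C_edge is odd

Pooling: a pooling window of size k divides the height and width by k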

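* Parameter sharing in convolution: 10 filters x 5 height x 5 width x 3 channels = 750 weights (plus 10 biases, one per filter), regardless of the image size

* ReLU mapping: feature enhancement (negative responses are set to zero)

* Receptive field: 7 ~ one 7x7 layer needs 7*7*C**2 = 49*C**2 weights > three stacked 3x3 layers with 3*(3*3)*C**2 = 27*C**2 weights (C input and C output channels)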
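
The output-size formula and the parameter count can be checked with a few lines of Python. This is a minimal sketch: the function names are invented for the example, and the traced sizes assume the 5 x 5 kernels and 2 x 2 pooling that the CNN later in this post uses on 28 x 28 inputs.

```python
def conv_output_size(size, kernel, padding=0, stride=1):
    """Spatial size after a convolution: floor((size - kernel + 2*padding) / stride) + 1."""
    return (size - kernel + 2 * padding) // stride + 1


def pool_output_size(size, window):
    """Spatial size after non-overlapping pooling with the given window."""
    return size // window


def conv_param_count(filters, kernel_h, kernel_w, in_channels, bias=True):
    """Trainable parameters of one convolutional layer (weights plus optional biases)."""
    weights = filters * kernel_h * kernel_w * in_channels
    return weights + (filters if bias else 0)


# 28 x 28 input, two rounds of 5 x 5 convolution (no padding) and 2 x 2 pooling
size = 28
trace = [size]
for _ in range(2):
    size = conv_output_size(size, kernel=5)
    trace.append(size)
    size = pool_output_size(size, window=2)
    trace.append(size)
print(trace)                                          # [28, 24, 12, 8, 4]

# Padding of (5 - 1) / 2 = 2 keeps the size at stride 1
print(conv_output_size(28, kernel=5, padding=2))      # 28

# 10 filters of 5 x 5 over 3 channels
print(conv_param_count(10, 5, 5, 3, bias=False))      # 750
print(conv_param_count(10, 5, 5, 3))                  # 760
```

The count does not depend on the image size, which is the point of parameter sharing: the same 750 weights are reused at every sliding position.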

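# Receptive field:

7 x 7 ~ 3 x (3 x 3): three stacked 3 x 3 convolutions cover the same 7 x 7 receptive field as one 7 x 7 convolution, but share more parameters, so fewer weights need to be trained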
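
The comparison can be made concrete with a short calculation. The sketch below assumes stride-1 convolutions where every layer maps C channels to C channels; the function names and the value C = 64 are chosen only for illustration.

```python
def stacked_receptive_field(kernel_sizes):
    """Receptive field of stacked stride-1 convolutions: each layer adds (k - 1)."""
    field = 1
    for k in kernel_sizes:
        field += k - 1
    return field


def stacked_weight_count(kernel_sizes, channels):
    """Weights (biases ignored) when every layer maps `channels` to `channels`."""
    return sum(k * k * channels * channels for k in kernel_sizes)


C = 64  # hypothetical channel count

print(stacked_receptive_field([7]))         # 7
print(stacked_receptive_field([3, 3, 3]))   # 7
print(stacked_weight_count([7], C))         # 200704  (49 * C**2)
print(stacked_weight_count([3, 3, 3], C))   # 110592  (27 * C**2)
```

The stacked version also applies a nonlinearity after each of its three layers, which a single 7 x 7 layer cannot do.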

# Classic network architectures:

* AlexNet:

* VGG Net:

* ResNet: (addresses the severe vanishing/exploding gradients of very deep networks)

# A simple CNN design:

* Import the modules and initialize the parameters

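import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

import matplotlib.pyplot as plt
import numpy as np

# Hyperparameters
input_size = 28      # images are 28 x 28
num_classes = 10     # digits 0-9
num_epoches = 8      # number of training epochs
batch_size = 64

# Image preprocessing: convert to a tensor, then normalize with the MNIST mean and std
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
])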

* Download the data and prepare the datasets:

train_dataset = datasets.MNIST(root='../dataset/mnist/',
                               train=True,
                               download=True,
                               transform=transform
                               )

test_dataset = datasets.MNIST(root='../dataset/mnist/',
                              train=False,
                              download=True,
                              transform=transform
                               )

train_loader = DataLoader(dataset=train_dataset,
                          shuffle=True,
                          batch_size=batch_size,  # becomes x.size(0) in forward()
                          )
test_loader = DataLoader(dataset=test_dataset,
                         shuffle=False,
                         batch_size=batch_size,  # becomes x.size(0) in forward()
                         )

* Define a simple CNN:

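class CNN(torch.nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = torch.nn.Conv2d(in_channels=1, out_channels=10, kernel_size=5)
        self.conv2 = torch.nn.Conv2d(in_channels=10, out_channels=20, kernel_size=5)
        # To keep the spatial size unchanged, padding would need to be (kernel_size - 1) / 2.
        # Without padding: (28 - 5) + 1 = 24 -> 24 / 2 = 12 -> (12 - 5) + 1 = 8 -> 8 / 2 = 4
        # Without padding the borders of the feature map are under-used; not recommended for deep networks.
        self.pooling = torch.nn.MaxPool2d(2)
        # in_features = out_channels x 4 x 4 = 20 x 4 x 4 = 320
        self.fc = torch.nn.Linear(320, 10)

    def forward(self, x):
        batch_size = x.size(0)
        # input: (batch_size x 1 x 28 x 28)
        x = F.relu(self.pooling(self.conv1(x)))
        # conv1 + pooling: in 1, out 10 -> (batch_size x 10 x 12 x 12)
        x = F.relu(self.pooling(self.conv2(x)))
        # conv2 + pooling: in 10, out 20 -> (batch_size x 20 x 4 x 4)
        x = x.view(batch_size, -1)
        # Flatten to (batch_size x 320) so the fully connected layer can classify.
        # batch_size is x.size(0); it is set by the DataLoader defined above
        # (the last batch of an epoch may be smaller).
        # x itself comes from iterating the loader with enumerate in the training loop below.
        x = self.fc(x)
        return x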
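
For comparison, here is a hedged sketch of a padded variant, following the comment above that padding should be (kernel_size - 1) / 2. The class name `PaddedCNN` is hypothetical and not part of the original post; it only shows how the shapes and the fully connected layer change when both convolutions preserve the spatial size.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class PaddedCNN(nn.Module):
    """Hypothetical variant of the CNN above with size-preserving padding."""

    def __init__(self, num_classes: int = 10):
        super().__init__()
        # padding = (5 - 1) / 2 = 2 keeps 28 x 28 through each convolution
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5, padding=2)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5, padding=2)
        self.pooling = nn.MaxPool2d(2)
        # 28 -> 14 -> 7 after two poolings, so 20 x 7 x 7 = 980 input features
        self.fc = nn.Linear(20 * 7 * 7, num_classes)

    def forward(self, x):
        x = F.relu(self.pooling(self.conv1(x)))  # (N, 10, 14, 14)
        x = F.relu(self.pooling(self.conv2(x)))  # (N, 20, 7, 7)
        x = x.view(x.size(0), -1)                # (N, 980)
        return self.fc(x)


# Shape check with a dummy batch of 4 blank images
dummy = torch.zeros(4, 1, 28, 28)
logits = PaddedCNN()(dummy)
print(logits.shape)  # torch.Size([4, 10])
```

Only the input width of the fully connected layer changes (320 becomes 980); the training and test code can stay the same.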

* Training, and how x is drawn from the loader:

model = CNN()

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
model.to(device)

criterion = torch.nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.02, momentum=0.5)

def train(epoch):
    running_loss = 0.0
    for batch_index, (inputs, labels) in enumerate(train_loader, 0):
        inputs, labels = inputs.to(device), labels.to(device)
        # inputs is one mini-batch: (batch_size x 1 x 28 x 28)
        y_hat = model(inputs)
        loss = criterion(y_hat, labels)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
        # Report the mean loss over every 300 batches, then reset the accumulator
        if batch_index % 300 == 299:
            print('[%d, %5d] loss: %.3f' % (epoch + 1,
                   batch_index + 1, running_loss / 300))
            running_loss = 0.0


def test():
    correct = 0
    total = 0
    with torch.no_grad():
        for (images, labels) in test_loader:
            images, labels = images.to(device), labels.to(device)
            outputs = model(images)
            # torch.max returns (values, indices); the indices are the predicted classes
            _, pred = torch.max(outputs, dim=1)
            total += labels.size(0)
            correct += (pred == labels).sum().item()
    print('accuracy on test set: %d %%' % (100 * correct / total))
    return correct / total
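
The prediction step in test() takes the index of the largest score in each row and compares it with the label. A minimal sketch of the same logic in plain Python, with made-up scores and labels (the helper names `argmax` and `accuracy` are illustrative, not PyTorch functions):

```python
def argmax(values):
    """Index of the largest value (the first one if there is a tie)."""
    best = 0
    for i, v in enumerate(values):
        if v > values[best]:
            best = i
    return best


def accuracy(logits, labels):
    """Fraction of rows whose highest-scoring class matches the label."""
    predictions = [argmax(row) for row in logits]
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return correct / len(labels)


# Three samples, three classes (toy values)
logits = [
    [0.1, 2.0, -1.0],   # predicted class 1
    [1.5, 0.2, 0.3],    # predicted class 0
    [0.0, 0.1, 0.9],    # predicted class 2
]
labels = [1, 0, 1]

print(accuracy(logits, labels))  # 2 of 3 correct
```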

* Run training and evaluation:

if __name__ == '__main__':
    epoch_list = []
    acc_list = []

    for epoch in range(num_epoches):
        train(epoch)
        acc = test()
        epoch_list.append(epoch)
        acc_list.append(acc)

    plt.plot(epoch_list, acc_list)
    plt.grid(True)
    plt.xlabel('epoch')
    plt.ylabel('accuracy')
    plt.show()
