一、Feature Extraction layer: input and output dimensions
Convolution layer: transforms the 1*28*28 input; the result is still a tensor of the same rank, though channel count and spatial size may change
Subsampling: reduces the amount of data (the spatial size); the channel count is unchanged
二、Classification (classifier) — fully connected layers: change the tensor's rank and dimensions to produce class scores
三、How many rings of pixels the output H*W loses after a convolution layer: equal to the distance from the kernel center to its edge (kernel_size // 2 per side, with no padding)
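The rule above can be checked against the standard output-size formula; this small helper (not part of the original notes) computes it:

```python
def conv2d_out_size(h, w, kernel_size, padding=0, stride=1):
    """Standard conv output size: floor((N + 2*P - K) / S) + 1 per spatial dim."""
    h_out = (h + 2 * padding - kernel_size) // stride + 1
    w_out = (w + 2 * padding - kernel_size) // stride + 1
    return h_out, w_out

# A 5x5 kernel with no padding trims kernel_size // 2 = 2 rings per side:
print(conv2d_out_size(28, 28, kernel_size=5))              # (24, 24)
# padding = kernel_size // 2 keeps the spatial size unchanged:
print(conv2d_out_size(28, 28, kernel_size=3, padding=1))   # (28, 28)
```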
四、RGB / multi-channel convolution
1) Single Input Channel: single-channel convolution output
2) 3 Input Channels
3) Kernel facts:
The kernel's channel count == the input channel count → (during convolution, each input channel is multiplied with its corresponding kernel channel and the results are summed)
The number of kernels == the output channel count; the outputs of the individual kernels are stacked along the channel dimension
4) Convolution layer weights: a 4th-order tensor (out_channels * in_channels * kernel_h * kernel_w)
5) Related arguments:
Padding = 1: add an outer ring of zeros to the input, keeping the output size unchanged (or enlarging the image)
6) stride = 2: the input index advances by 2 each step, shrinking the output size
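A quick sketch (assuming PyTorch is available; the random input is made up) showing both effects on a 1*28*28 input:

```python
import torch

x = torch.randn(1, 1, 28, 28)

# padding=1 with a 3x3 kernel adds one ring of zeros, so H and W are preserved
conv_pad = torch.nn.Conv2d(1, 1, kernel_size=3, padding=1)
print(conv_pad(x).shape)     # torch.Size([1, 1, 28, 28])

# stride=2 advances the kernel two pixels at a time: (28 - 3) // 2 + 1 = 13
conv_stride = torch.nn.Conv2d(1, 1, kernel_size=3, stride=2)
print(conv_stride(x).shape)  # torch.Size([1, 1, 13, 13])
```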
五、Subsampling
Max Pooling Layer: with the chosen stride, the input is partitioned into blocks and the max of each block forms the output; the max is taken within each channel separately, so the channel count is unchanged
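For instance (a toy input, not from the notes), a 2*2 max pool with the default stride equal to the kernel size:

```python
import torch

x = torch.Tensor([[3, 4, 6, 5],
                  [2, 4, 6, 8],
                  [1, 6, 7, 8],
                  [9, 7, 4, 6]]).view(1, 1, 4, 4)

# kernel_size=2 implies stride=2: each non-overlapping 2x2 block becomes its max
maxpool = torch.nn.MaxPool2d(kernel_size=2)
print(maxpool(x))  # tensor([[[[4., 8.], [9., 8.]]]])
```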
1) Input, convolution-layer and subsampling parameters:
input = torch.randn(batch_size,
                    in_channels,
                    width,
                    height)
# set the input's batch size, channel count, width and height, e.g. 1*1*28*28
conv_layer = torch.nn.Conv2d(in_channels,
                             out_channels,
                             kernel_size = kernel_size)
# the convolution layer takes the input channel count, output channel count and kernel size
conv_layer.weight.data = kernel.data  # initialize the convolution layer's weights
maxpooling_layer = torch.nn.MaxPool2d(kernel_size = 2)  # subsampling reduces H and W without changing the channel count
六、Code: CNN
七、Using the GPU (requires installing the CUDA-enabled build of PyTorch)
1、Data preparation and model design
# CNN <==> design model <==> GPU computation
import torch
from torchvision import transforms       # image preprocessing
from torchvision import datasets
from torch.utils.data import DataLoader  # mini-batch loading
import torch.nn.functional as F          # for the functional relu()
import torch.optim as optim              # optimizers
import matplotlib.pyplot as plt
import os
os.environ['KMP_DUPLICATE_LIB_OK'] = 'True'

# prepare dataset
batch_size = 64
transform = transforms.Compose([transforms.ToTensor(),  # converts the 28*28 image with [0, 255] pixel values into a 1*28*28 tensor in [0, 1]
                                transforms.Normalize((0.1307,), (0.3081,))])  # MNIST per-channel mean and standard deviation
train_dataset = datasets.MNIST(root = 'D:\\anaconda3\\Lib\\site-packages\\torchvision\\datasets',
                               train = True, download = True, transform = transform)
train_loader = DataLoader(train_dataset, shuffle = True, batch_size = batch_size)
test_dataset = datasets.MNIST(root = 'D:\\anaconda3\\Lib\\site-packages\\torchvision\\datasets',
                              train = False, download = True, transform = transform)
test_loader = DataLoader(test_dataset, shuffle = False, batch_size = batch_size)
loss_list = []
acc_list = []
# design model
class Model(torch.nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.conv1 = torch.nn.Conv2d(1, 10, kernel_size = 5)  # first layer: input channels, output channels, kernel size
        self.conv2 = torch.nn.Conv2d(10, 20, kernel_size = 5)
        self.pooling = torch.nn.MaxPool2d(2)  # 2*2 max pooling
        self.fc = torch.nn.Linear(320, 10)    # fully connected layer maps the features to 10 class scores

    def forward(self, x):
        batch_size = x.size(0)  # n in (n, 1, 28, 28): the number of samples
        x = F.relu(self.pooling(self.conv1(x)))
        x = F.relu(self.pooling(self.conv2(x)))
        x = x.view(batch_size, -1)  # (batch_size, 320)
        return self.fc(x)  # no activation here: CrossEntropyLoss expects the raw linear outputs (logits)

model = Model()

# loss and optimizer
criterion = torch.nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr = 0.01, momentum = 0.5)  # momentum helps escape local minima
2、Running on the GPU
# GPU train
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
# use the first CUDA device (cuda:0) if CUDA is available, otherwise fall back to the CPU
model.to(device)  # move the model's weights, buffers and gradients to the device (GPU)
3、Train & Test: the inputs and targets of both training and testing must be moved to the device before computation
# train and test
def train(epoch):
    loss_sum = 0.0
    for index, datas in enumerate(train_loader, 0):
        inputs, target = datas
        inputs, target = inputs.to(device), target.to(device)  # move the batch to the device
        optimizer.zero_grad()  # clear accumulated gradients
        # forward, backward, update
        outputs = model(inputs)
        loss = criterion(outputs, target)
        loss.backward()
        optimizer.step()
        loss_sum += loss.item()
        loss_list.append(loss.item())
        if index % 300 == 299:
            print('Epoch = %d' % epoch)
            print('\t', 'Index = ', index, '\t', 'Loss = %.3f' % (loss_sum / 300))
            loss_sum = 0.0
            test()

def test():
    correct = 0
    total = 0
    with torch.no_grad():  # no gradient computation needed for evaluation
        for data in test_loader:
            images, labels = data
            images, labels = images.to(device), labels.to(device)  # move the batch to the device
            outputs = model(images)
            _, predicted = torch.max(outputs.data, dim = 1)  # index of the largest logit = predicted class
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
    print('Accuracy on test set: %d %%' % (100 * correct / total))

for epoch in range(10):
    train(epoch)
plt.plot(range(len(loss_list)), loss_list, color = 'b')  # one loss value per batch
plt.xlabel('batch')
plt.ylabel('cost')
plt.show()
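The prediction step in test() hinges on torch.max along dim=1; a minimal illustration with made-up logits:

```python
import torch

# torch.max(tensor, dim=1) returns (values, indices); the index of the
# largest logit in each row is the predicted class
logits = torch.Tensor([[0.1, 2.5, 0.3],
                       [1.7, 0.2, 0.9]])
values, predicted = torch.max(logits, dim=1)
print(predicted)  # tensor([1, 0])
```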
八、Input, output and convolution-layer weight dimensions
import torch

in_channels, out_channels = 5, 10
width, height = 100, 100
kernel_size = 3
batch_size = 1
input = torch.randn(batch_size,
                    in_channels,
                    width,
                    height)
conv_layer = torch.nn.Conv2d(in_channels,
                             out_channels,
                             kernel_size = kernel_size)
output = conv_layer(input)
print(input.shape)
print(output.shape)
print(conv_layer.weight.shape)
The weights are initialized randomly (via randn); the printed shapes are:
torch.Size([1, 5, 100, 100])
torch.Size([1, 10, 98, 98])
torch.Size([10, 5, 3, 3])
kernel: choose the kernel's size, then reshape it with view() (the rank changes, the total element count does not)
# output channels = 1, input channels = 1
kernel = torch.Tensor([1, 2, 3, 4, 5, 6, 7, 8, 9]).view(1, 1, 3, 3)
conv_layer.weight.data = kernel.data
# assign to the convolution layer's weights
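Putting the assignment to use (a self-contained sketch; the all-ones 5*5 input is made up), with padding so the output keeps the input's size:

```python
import torch

# a 1-in/1-out convolution whose 3x3 weights are set by hand
conv_layer = torch.nn.Conv2d(1, 1, kernel_size=3, padding=1, bias=False)
kernel = torch.Tensor([1, 2, 3, 4, 5, 6, 7, 8, 9]).view(1, 1, 3, 3)
conv_layer.weight.data = kernel.data  # overwrite the randomly initialized weights

input = torch.ones(1, 1, 5, 5)
output = conv_layer(input)
print(output.shape)               # torch.Size([1, 1, 5, 5])
print(output[0, 0, 2, 2].item())  # 45.0: the kernel sum 1+2+...+9 at an interior pixel
```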
九、Convolution layers, feature maps, output layers and weights
Convolution layer:
A convolution layer extracts features from its input. Kernel size, stride and padding are hyperparameters of the network that determine the size of the layer's output feature map. A convolution layer is typically paired with an activation function such as ReLU to help express complex features.
Kernel: every element of a kernel corresponds to a weight coefficient, and each kernel carries a bias; these weights are the convolution layer's parameters.
Feature map and output layer:
A feature map is a layer's output (excluding the final 1-D/2-D outputs). Layers such as Conv and MaxPooling each produce an output layer / feature map; producing one requires an input image.
Correspondence:
y = f(x)
x: input
f(): convolution layer
y: output layer (feature map)
Network diagrams:
Some network diagrams are drawn in terms of the output layers; others, the prevailing style today, are drawn in terms of Conv, MaxPooling and similar blocks annotated with kernel sizes and other information.
十、Summary
batch here is the number of images fed to the network at once; the input tensor x has shape (n, 1, 28, 28) with n the number of samples, so x.size(0) retrieves n.
The images are grayscale, so the input channel count is 1.
Convolution layer: changes the channel count and the spatial size (which shrinks by the kernel-center-to-edge distance per side), but not the tensor's rank.
The input is transformed from layer to layer, but it remains a tensor of the same rank throughout.
Max pooling layer: leaves the channel count unchanged; each channel of the input is split into blocks and the max of each block is taken, reducing the spatial size.
x.view(batch_size, -1): the linear layer needs one row per sample, so the first dimension is batch_size and -1 lets PyTorch infer the second so that the total element count is preserved.
torch.nn.Linear(320, 10): the fully connected layer maps batch_size*320 to batch_size*10, on which CrossEntropyLoss is then computed.
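To see where 320 comes from, one can trace a single 28*28 sample through layers with the same hyperparameters as the Model class above (a standalone sketch):

```python
import torch
import torch.nn.functional as F

# layers matching the Model class: two 5x5 convolutions, 2x2 max pooling
conv1 = torch.nn.Conv2d(1, 10, kernel_size=5)
conv2 = torch.nn.Conv2d(10, 20, kernel_size=5)
pooling = torch.nn.MaxPool2d(2)

x = torch.randn(1, 1, 28, 28)
x = F.relu(pooling(conv1(x)))  # (1, 10, 12, 12): 28 -> 24 (conv) -> 12 (pool)
x = F.relu(pooling(conv2(x)))  # (1, 20, 4, 4):  12 -> 8 (conv)  -> 4 (pool)
x = x.view(x.size(0), -1)
print(x.shape)  # torch.Size([1, 320]) = 20 * 4 * 4
```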