AlexNet: Principles and Code Implementation
Basic introduction:
Original image: 256x256x3
Image preprocessing: data augmentation
1. Random crop: 256x256x3 ==> 224x224x3
2. Random rotation and translation of the 224x224 crops
3. Enlarge the crop: 224x224 ==> 227x227
4. Actual input to AlexNet: 227x227x3 (can be obtained via padding); a minimal augmentation sketch follows this list
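Below is a minimal sketch of this augmentation pipeline. It assumes torchvision (the post itself names no library), and the exact transform choices are illustrative rather than the paper's exact procedure:

import torchvision.transforms as T

# Illustrative augmentation pipeline (assumes torchvision).
# RandomCrop + RandomHorizontalFlip stand in for the crop / position-change
# steps described above; the network is then fed 227x227 crops.
train_transform = T.Compose([
    T.Resize(256),              # scale so the shorter side is 256
    T.RandomCrop(227),          # random 227x227 crop
    T.RandomHorizontalFlip(),   # random position variation
    T.ToTensor(),               # HWC uint8 image -> CHW float tensor in [0, 1]
])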
The input image has 3 channels; the network then runs on two GPUs in parallel (hence the "x2" in the layer depths below).
Basic pipeline:
convolution ----> ReLU ----> pooling ----> norm layer ----> fully connected
In total: 5 convolutional layers and 3 fully connected layers.
Network input: 227x227x3
conv1:
1. filter: 11x11x3
2. padding: 0
3. stride: 4
4. depth: 48x2 = 96
output: 55x55x96
max pooling:
1. filter: 3x3
2. stride: 2
output: 27x27x96
conv2:
1. filter: 5x5x48
2. padding: 2
3. stride: 1
4. depth: 128x2 = 256
output: 27x27x256
max pooling:
1. filter: 3x3
2. stride: 2
output: 13x13x256
conv3:
1. filter: 3x3x256
2. padding: 1
3. stride: 1
4. depth: 192x2 = 384
output: 13x13x384
conv4:
1. filter: 3x3x192
2. padding: 1
3. stride: 1
4. depth: 192x2 = 384
output: 13x13x384
conv5:
1. filter: 3x3x192
2. padding: 1
3. stride: 1
4. depth: 128x2 = 256
output: 13x13x256
max pooling:
1. filter: 3x3
2. stride: 2
output: 6x6x256
fc1:
6x6x128x2 = 9216 ==> 2048x2 = 4096
fc2:
2048x2 = 4096 ==> 2048x2 = 4096
fc3:
4096 ==> 1000 (class scores)
(The output sizes above can be verified with the dimension-check sketch below.)
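To see where fc1's 9216 input comes from, here is a small plain-Python sketch (the helper name out_size is ours) tracing the spatial size through the network with the formula out = (in + 2*padding - kernel) // stride + 1:

# Verify the layer-by-layer output sizes listed above.
def out_size(in_size, kernel, stride, padding=0):
    return (in_size + 2 * padding - kernel) // stride + 1

size = 227
size = out_size(size, 11, 4)    # conv1 -> 55
size = out_size(size, 3, 2)     # pool1 -> 27
size = out_size(size, 5, 1, 2)  # conv2 -> 27
size = out_size(size, 3, 2)     # pool2 -> 13
size = out_size(size, 3, 1, 1)  # conv3 -> 13
size = out_size(size, 3, 1, 1)  # conv4 -> 13
size = out_size(size, 3, 1, 1)  # conv5 -> 13
size = out_size(size, 3, 2)     # pool3 -> 6
print(size, size * size * 256)  # 6 9216, matching fc1's input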
ReLU: non-linear activation, f(x) = max(0, x)
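For example, a two-line check of the formula above:

import torch
# ReLU zeroes out negatives and passes positives through unchanged.
print(torch.relu(torch.tensor([-1.5, 0.0, 2.0])))  # tensor([0., 0., 2.])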
Norm layer: removes the mean and normalizes activation magnitudes. The original paper uses Local Response Normalization (LRN); the implementation below uses BatchNorm2d instead.
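If you want the paper's original normalization rather than BatchNorm, PyTorch ships nn.LocalResponseNorm; a one-line sketch with the paper's hyperparameters (n=5, alpha=1e-4, beta=0.75, k=2):

import torch.nn as nn

# Drop-in alternative to the nn.BatchNorm2d layers below, using the
# paper's published LRN hyperparameters.
lrn = nn.LocalResponseNorm(size=5, alpha=1e-4, beta=0.75, k=2.0)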
import torch
import torch.nn as nn
class AlexNet(nn.Module):
    def __init__(self, num_classes=1000):
        super(AlexNet, self).__init__()
        self.features = nn.Sequential(
            # conv1: 227x227x3 -> 55x55x96
            nn.Conv2d(3, 96, kernel_size=11, stride=4),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),  # -> 27x27x96
            nn.BatchNorm2d(96),  # stands in for the paper's LRN
            # conv2: 27x27x96 -> 27x27x256
            nn.Conv2d(96, 256, kernel_size=5, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),  # -> 13x13x256
            nn.BatchNorm2d(256),
            # conv3: 13x13x256 -> 13x13x384
            nn.Conv2d(256, 384, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            # conv4: 13x13x384 -> 13x13x384
            nn.Conv2d(384, 384, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            # conv5: 13x13x384 -> 13x13x256
            nn.Conv2d(384, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2))  # -> 6x6x256
        self.classifier = nn.Sequential(
            # fc1: 9216 -> 4096
            nn.Linear(256 * 6 * 6, 4096),
            nn.ReLU(inplace=True),
            # fc2: 4096 -> 4096
            nn.Linear(4096, 4096),
            nn.ReLU(inplace=True),
            # fc3: 4096 -> num_classes
            nn.Linear(4096, num_classes))

    def forward(self, x):
        x = self.features(x)
        # flatten the 6x6x256 feature maps into a 9216-dim vector per sample
        x = x.view(x.size(0), 256 * 6 * 6)
        x = self.classifier(x)
        return x

net = AlexNet()
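A quick sanity check of the forward pass (the batch size of 1 is arbitrary):

# A dummy 227x227 RGB batch should produce 1000 class scores per image.
x = torch.randn(1, 3, 227, 227)
out = net(x)
print(out.shape)  # torch.Size([1, 1000])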