Processing Kaggle Remote-Sensing Images with GoogLeNet: Fully Convolutional Neural Network Computation

一个不会读文献的参考文献

Tags: cnn, deep learning, neural networks

Article link: https://blog.csdn.net/weixin_43730207/article/details/126264008

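GoogLeNet

GoogLeNet differs from ordinary convolutional networks in two ways: it makes heavy use of 1×1 convolution layers, and it adds average pooling. Earlier architectures such as AlexNet and VGG improved results by making the network deeper (more layers), but more layers bring side effects such as overfitting, vanishing gradients, and exploding gradients. Inception tackles the problem from another angle: it uses compute more efficiently, extracting more features for the same amount of computation, and so improves training results. (Excerpted from Baidu Baike.)
[Figure: structure of an Inception module]
Each module feeds the same input through several parallel branches, including a 1×1 convolution branch and a pooling branch (average pooling in the simplified module below, max pooling in the full model), and then concatenates the branch outputs along the channel dimension.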

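The Inception module

The basic structure of an Inception module is shown in the figure above; the whole network is built by chaining many such modules in series. Inception makes two main contributions: first, it uses 1×1 convolutions to raise or lower the channel dimension; second, it convolves at several kernel sizes at once and then aggregates the results.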

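What does the 1×1 convolution layer do?

A 1×1 convolution leaves the spatial size W × H untouched and only mixes channels. For one filter, each of the N input channels is multiplied by its own weight, giving N × W × H, and the results are then summed across channels, which collapses N × W × H into 1 × W × H. A layer with M such filters therefore outputs M × W × H, so choosing M smaller than N reduces the dimension (the number of channels).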
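
To see why this reduction matters, here is a rough multiplication count for a 5×5 convolution with and without a 1×1 reduction in front of it. It is only a back-of-the-envelope sketch: the channel counts (192 in, 16 after reduction, 32 out) are taken from the `inception3a` configuration used later in this post, the 28×28 feature-map size is an assumption, and biases are ignored. The helper name `conv_mults` is made up for illustration.

```python
def conv_mults(h, w, in_ch, out_ch, k):
    """Multiplications for a k x k convolution that keeps the h x w size."""
    return h * w * out_ch * k * k * in_ch

H = W = 28                       # assumed feature-map size
C_IN, C_RED, C_OUT = 192, 16, 32 # input, reduced, and output channels

# 5x5 convolution applied directly to all 192 input channels
direct = conv_mults(H, W, C_IN, C_OUT, 5)

# 1x1 reduction to 16 channels first, then the 5x5 convolution
reduced = conv_mults(H, W, C_IN, C_RED, 1) + conv_mults(H, W, C_RED, C_OUT, 5)

print(direct)   # 120422400
print(reduced)  # 12443648
print(round(direct / reduced, 1))
```

Under these assumptions the reduced path needs roughly a tenth of the multiplications, which is the practical reason for the 1×1 layers.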

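Why use Inception?

Because we do not know in advance which kernel size is best, we simply run 1×1, 3×3, and 5×5 convolutions side by side. Training then shows which one works best by giving it larger weights.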

import torch
import torch.nn as nn
import torch.nn.functional as F

class InceptionA(nn.Module):
    def __init__(self, in_channels):
        super(InceptionA, self).__init__()

        # 1x1 branch -> 16 channels
        self.branch1x1 = nn.Conv2d(in_channels, 16, kernel_size=1)

        # 5x5 branch: 1x1 reduction, then 5x5 -> 24 channels
        self.branch5x5_1 = nn.Conv2d(in_channels, 16, kernel_size=1)
        self.branch5x5_2 = nn.Conv2d(16, 24, kernel_size=5, padding=2)

        # 3x3 branch: 1x1 reduction, then two 3x3 layers -> 24 channels
        self.branch3x3_1 = nn.Conv2d(in_channels, 16, kernel_size=1)
        self.branch3x3_2 = nn.Conv2d(16, 24, kernel_size=3, padding=1)
        self.branch3x3_3 = nn.Conv2d(24, 24, kernel_size=3, padding=1)

        # pooling branch: average pooling (in forward), then 1x1 -> 24 channels
        self.branch_pool = nn.Conv2d(in_channels, 24, kernel_size=1)

    def forward(self, x):
        branch1x1 = self.branch1x1(x)

        branch5x5 = self.branch5x5_1(x)
        branch5x5 = self.branch5x5_2(branch5x5)

        branch3x3 = self.branch3x3_1(x)
        branch3x3 = self.branch3x3_2(branch3x3)
        branch3x3 = self.branch3x3_3(branch3x3)

        branch_pool = F.avg_pool2d(x, kernel_size=3, stride=1, padding=1)
        branch_pool = self.branch_pool(branch_pool)
        outputs = [branch1x1, branch5x5, branch3x3, branch_pool]
        # concatenate along the channel dimension: 16 + 24 + 24 + 24 = 88
        return torch.cat(outputs, dim=1)
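
Concatenating along the channel dimension only works if every branch keeps the same height and width. The sketch below checks this with the standard output-size formula, floor((n + 2p - k) / s) + 1, for the kernel and padding pairs used in the module above. The helper name `out_size` and the 28-pixel input are illustrative assumptions.

```python
def out_size(n, k, p=0, s=1):
    """Output length of a convolution or pooling window along one axis."""
    return (n + 2 * p - k) // s + 1

n = 28  # assumed input height/width
branches = {
    "1x1 conv": out_size(n, k=1, p=0),
    "3x3 conv, padding=1": out_size(n, k=3, p=1),
    "5x5 conv, padding=2": out_size(n, k=5, p=2),
    "3x3 avg pool, padding=1": out_size(n, k=3, p=1),
}
for name, size in branches.items():
    print(name, size)  # every branch stays at 28

# channels of the four branches after concatenation
out_channels = 16 + 24 + 24 + 24
print(out_channels)  # 88
```

This is why the 3×3 layers use `padding=1` and the 5×5 layer uses `padding=2`: with stride 1, a padding of (k - 1) / 2 leaves the spatial size unchanged.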

With Inception explained, we can move on to the code.

Code

My environment is Anaconda, Python 3.7, PyTorch 1.4.

import copy  # needed by train_model (copy.deepcopy)
import os
import time

import cv2
import matplotlib.pyplot as plt
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader, Dataset

torch.manual_seed(17)

def readfile(path, label):
    # label is a boolean: whether to also return the y values
    image_dir = sorted(os.listdir(path))  # os.listdir returns the file names in the folder
    x = np.zeros((len(image_dir), 224, 224, 3), dtype=np.uint8)  # shape: number of files x 224 x 224 x 3
    y = np.zeros((len(image_dir)), dtype=np.uint8)  # shape: number of files
    for i, file in enumerate(image_dir):  # iterate over the file names
        img = cv2.imread(os.path.join(path, file))  # read the original image (cv2 returns BGR channel order)
        x[i, :, :] = cv2.resize(img, (224, 224))  # resize and store as the i-th element of x
        if label:
            y[i] = int(file.split("_")[0])  # class index is the part of the file name before "_"
    if label:
        return x, y
    else:
        return x

workspace_dir = "D:/yinlichen/dataset/archive/data/cloudy"
print("Reading data")
train_x, train_y = readfile(os.path.join(workspace_dir, "training"), True)
print("Size of training data = {}".format(len(train_x)))
val_x, val_y = readfile(os.path.join(workspace_dir, "validation"), True)
print("Size of validation data = {}".format(len(val_x)))
test_x = readfile(os.path.join(workspace_dir, "testing"), False)
print("Size of Testing data = {}".format(len(test_x)))
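
`readfile` expects each file name to start with the class index followed by an underscore. The file names below are made up purely to illustrate that convention; they are not taken from the dataset.

```python
# Hypothetical file names following a "<label>_<index>.jpg" pattern
files = ["0_12.jpg", "3_7.jpg", "10_105.jpg"]

# Same parsing as in readfile: take the text before the first "_" as the label
labels = [int(name.split("_")[0]) for name in files]
print(labels)  # [0, 3, 10]
```

A name that does not begin with an integer raises `ValueError` at `int(...)`, and because `y` is stored as `np.uint8`, labels must fit in the range 0 to 255.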

Next, preprocess the data.

train_transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.RandomHorizontalFlip(),  # randomly flip the image horizontally
    transforms.RandomRotation(15),  # randomly rotate the image by up to 15 degrees
    transforms.ToTensor(),  # convert the image to a Tensor and scale the values to [0, 1]
])

# no data augmentation at test time
test_transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.ToTensor(),
])
class ImgDataset(Dataset):
    # store everything that is passed in as attributes
    def __init__(self, x, y=None, transform=None):  # y and transform default to None when not passed
        self.x = x
        # labels must be a LongTensor
        self.y = y
        if y is not None:
            self.y = torch.LongTensor(y)
        self.transform = transform
    # size of the dataset
    def __len__(self):
        return len(self.x)
    # what dataset[index] returns; x is transformed before being returned
    def __getitem__(self, index):
        X = self.x[index]
        if self.transform is not None:
            X = self.transform(X)
#             X = np.array(X, dtype=np.float32)  # PILImage -> numpy, output is (h, w, c)
#             X = np.transpose(X, (2, 0, 1))  # reorder dimensions in numpy with transpose, similar to a matrix transpose
#             X = torch.from_numpy(X)  # numpy -> tensor; the tensor and ndarray share memory and cannot be resized
        if self.y is not None:
            Y = self.y[index]
            return X, Y
        else:
            return X

batch_size = 128
train_set = ImgDataset(train_x, train_y, train_transform)
val_set = ImgDataset(val_x, val_y, test_transform)
train_loader = DataLoader(train_set, batch_size=batch_size, shuffle=True)
val_loader = DataLoader(val_set, batch_size=batch_size, shuffle=False)

Set the parameters; all of this can really just be reused as a template.

from torchvision.utils import make_grid

for images, _ in train_loader:
    plt.figure(figsize=(16,8))
    plt.axis('off')
    plt.imshow(make_grid(images, nrow=8).permute((1, 2, 0)))
    break

When I ran this, the kernel died. If that happens, restart the kernel, and before running anything else put the following on the first line:

# DataLoader: batch_size=2, shuffle=True
    # batch_size is the number of samples per batch
    # shuffle controls whether the data is shuffled
%pylab inline 
import matplotlib.pyplot as plt 
import numpy as np
plt.plot(np.sin(np.linspace(0,2*np.pi, 100)))

I don't know why, but after that it runs.

class Inception(nn.Module):
    
    def __init__(self, in_channels=3, use_auxiliary=True, num_classes=1000):
        super(Inception, self).__init__()
        
        self.conv1 = ConvBlock(in_channels, 64, kernel_size=7, stride=2, padding=3)
        self.conv2 = ConvBlock(64, 192, kernel_size=3, stride=1, padding=1)
        
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.avgpool = nn.AvgPool2d(kernel_size=7, stride=1)
        
        self.dropout = nn.Dropout(0.4)
        self.linear = nn.Linear(1024, num_classes)
        
        self.use_auxiliary = use_auxiliary
        if use_auxiliary:
            self.auxiliary4a = Auxiliary(512, num_classes)
            self.auxiliary4d = Auxiliary(528, num_classes)
        
        self.inception3a = InceptionBlock(192, 64, 96, 128, 16, 32, 32)
        self.inception3b = InceptionBlock(256, 128, 128, 192, 32, 96, 64)
        self.inception4a = InceptionBlock(480, 192, 96, 208, 16, 48, 64)
        self.inception4b = InceptionBlock(512, 160, 112, 224, 24, 64, 64)
        self.inception4c = InceptionBlock(512, 128, 128, 256, 24, 64, 64)
        self.inception4d = InceptionBlock(512, 112, 144, 288, 32, 64, 64)
        self.inception4e = InceptionBlock(528, 256, 160, 320, 32, 128, 128)
        self.inception5a = InceptionBlock(832, 256, 160, 320, 32, 128, 128)
        self.inception5b = InceptionBlock(832, 384, 192, 384, 48, 128, 128)

    def forward(self, x):
        y = None
        z = None
        
        x = self.conv1(x)
        x = self.maxpool(x)
        x = self.conv2(x)
        x = self.maxpool(x)
        
        x = self.inception3a(x)
        x = self.inception3b(x)
        x = self.maxpool(x)
        
        x = self.inception4a(x)
        if self.training and self.use_auxiliary:
            y = self.auxiliary4a(x)
        
        x = self.inception4b(x)
        x = self.inception4c(x)
        x = self.inception4d(x)
        if self.training and self.use_auxiliary:
            z = self.auxiliary4d(x)
        
        x = self.inception4e(x)
        x = self.maxpool(x)
        
        x = self.inception5a(x)
        x = self.inception5b(x)
        x = self.avgpool(x)
        x = x.reshape(x.shape[0], -1)
        x = self.dropout(x)
        
        x = self.linear(x)
        
        return x, y, z
class ConvBlock(nn.Module):
    
    def __init__(self, in_channels, out_channels, kernel_size, **kwargs):
        super(ConvBlock, self).__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size, **kwargs)
        self.bn = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU()
        
    def forward(self, x):
        return self.relu(self.bn(self.conv(x)))
class InceptionBlock(nn.Module):
    
    def __init__(self, im_channels, num_1x1, num_3x3_red, num_3x3, num_5x5_red, num_5x5, num_pool_proj):
        super(InceptionBlock, self).__init__()
        
        self.one_by_one = ConvBlock(im_channels, num_1x1, kernel_size=1)
        
        self.tree_by_three_red = ConvBlock(im_channels, num_3x3_red, kernel_size=1)  
        self.tree_by_three = ConvBlock(num_3x3_red, num_3x3, kernel_size=3, padding=1)
        
        self.five_by_five_red = ConvBlock(im_channels, num_5x5_red, kernel_size=1)
        self.five_by_five = ConvBlock(num_5x5_red, num_5x5, kernel_size=5, padding=2)
        
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=1, padding=1)
        self.pool_proj = ConvBlock(im_channels, num_pool_proj, kernel_size=1)
         
    def forward(self, x):
        x1 = self.one_by_one(x)
        
        x2 = self.tree_by_three_red(x)
        x2 = self.tree_by_three(x2)
        
        x3 = self.five_by_five_red(x)
        x3 = self.five_by_five(x3)
        
        x4 = self.maxpool(x)
        x4 = self.pool_proj(x4)
        
        x = torch.cat([x1, x2, x3, x4], 1)
        return x
class Auxiliary(nn.Module):
    
    def __init__(self, in_channels, num_classes):
        super(Auxiliary, self).__init__()
        self.avgpool = nn.AvgPool2d(kernel_size=5, stride=3)
        self.conv1x1 = ConvBlock(in_channels, 128, kernel_size=1)
        
        self.fc1 = nn.Linear(2048, 1024)
        self.fc2 = nn.Linear(1024, num_classes)
        
        self.dropout = nn.Dropout(0.7)
        self.relu = nn.ReLU()

    def forward(self, x):
        x = self.avgpool(x)
        x = self.conv1x1(x)
        x = x.reshape(x.shape[0], -1)
        x = self.relu(self.fc1(x))
        x = self.dropout(x)
        x = self.fc2(x)
        return x
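
The hard-coded sizes in these classes (`nn.Linear(1024, ...)`, `nn.Linear(2048, 1024)`, and each block's input channels) only fit together for 224×224 inputs. The following sketch re-derives them with plain arithmetic rather than running the model. `out_size` and `block_out` are helper names introduced here for illustration, and the tuples are copied from `Inception.__init__` above.

```python
def out_size(n, k, s=1, p=0):
    """Output length along one axis (floor mode, the PyTorch default)."""
    return (n + 2 * p - k) // s + 1

def block_out(cfg):
    """Output channels of an InceptionBlock: 1x1 + 3x3 + 5x5 + pool projection."""
    _in, c1, _r3, c3, _r5, c5, cp = cfg
    return c1 + c3 + c5 + cp

# (in_channels, 1x1, 3x3_reduce, 3x3, 5x5_reduce, 5x5, pool_proj)
blocks = {
    "3a": (192, 64, 96, 128, 16, 32, 32),
    "3b": (256, 128, 128, 192, 32, 96, 64),
    "4a": (480, 192, 96, 208, 16, 48, 64),
    "4b": (512, 160, 112, 224, 24, 64, 64),
    "4c": (512, 128, 128, 256, 24, 64, 64),
    "4d": (512, 112, 144, 288, 32, 64, 64),
    "4e": (528, 256, 160, 320, 32, 128, 128),
    "5a": (832, 256, 160, 320, 32, 128, 128),
    "5b": (832, 384, 192, 384, 48, 128, 128),
}

# Each block's output channels must equal the next block's input channels
names = list(blocks)
chain_ok = all(block_out(blocks[a]) == blocks[b][0] for a, b in zip(names, names[1:]))

n = 224
n = out_size(n, 7, s=2, p=3)    # conv1 -> 112
n = out_size(n, 3, s=2, p=1)    # maxpool -> 56
n = out_size(n, 3, s=1, p=1)    # conv2 -> 56
n = out_size(n, 3, s=2, p=1)    # maxpool -> 28 (inception3a, 3b)
n = out_size(n, 3, s=2, p=1)    # maxpool -> 14 (inception4a to 4e)
aux_side = out_size(n, 5, s=3)  # Auxiliary AvgPool2d(5, stride=3) -> 4
n = out_size(n, 3, s=2, p=1)    # maxpool -> 7 (inception5a, 5b)
n = out_size(n, 7, s=1)         # AvgPool2d(7) -> 1

linear_in = block_out(blocks["5b"]) * n * n  # input features of self.linear
aux_fc_in = 128 * aux_side * aux_side        # input features of Auxiliary.fc1
print(chain_ok, linear_in, aux_fc_in)  # True 1024 2048
```

With a different input resolution the final `AvgPool2d(7)` and the two `nn.Linear` input sizes would no longer line up, so images should be resized to 224×224 as `readfile` does.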

Build the model, then train it:

model = Inception()
#define the device to use
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
device
model.to(device)
next(model.parameters()).is_cuda
epochs = 50 
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.0001, weight_decay=1e-4)
lr_scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer, patience=5, verbose=True)
def train_model(model, dataloaders, criterion, optimizer, num_epochs=50, use_auxiliary=True):
    
    since = time.time()
    val_acc_history = []
    best_model_wts = copy.deepcopy(model.state_dict())
    best_acc = 0.0

    for epoch in range(num_epochs):
        print('Epoch {}/{}'.format(epoch, num_epochs - 1))
        print('-' * 10)

        for phase in ['train', 'val']: # Each epoch has a training and validation phase
            if phase == 'train':
                model.train()  # Set model to training mode
            else:
                model.eval()   # Set model to evaluate mode

            running_loss = 0.0
            running_corrects = 0

            for inputs, labels in dataloaders[phase]: # Iterate over data
                
                inputs = inputs.to(device)

                labels = labels.to(device)

                optimizer.zero_grad() # Zero the parameter gradients

                with torch.set_grad_enabled(phase == 'train'): # Forward. Track history if only in train
                    
                    if phase == 'train': # Backward + optimize only if in training phase
                        if use_auxiliary:
                            outputs, aux1, aux2 = model(inputs)
                            loss = criterion(outputs, labels) + 0.3 * criterion(aux1, labels) + 0.3 * criterion(aux2, labels)
                        else:
                            outputs, _, _ = model(inputs)
                            loss = criterion(outputs, labels)
                            
                        _, preds = torch.max(outputs, 1)
                        loss.backward()
                        optimizer.step()
                    
                    if phase == 'val':
                        outputs, _, _ = model(inputs)
                        loss = criterion(outputs, labels)
                        _, preds = torch.max(outputs, 1)

                # Statistics
                running_loss += loss.item() * inputs.size(0)
                running_corrects += torch.sum(preds == labels.data)

            epoch_loss = running_loss / len(dataloaders[phase].dataset)
            
            if phase == 'val': # Adjust learning rate based on val loss
                lr_scheduler.step(epoch_loss)
                
            epoch_acc = running_corrects.double() / len(dataloaders[phase].dataset)

            print('{} Loss: {:.4f} Acc: {:.4f}'.format(phase, epoch_loss, epoch_acc))

            # deep copy the model
            if phase == 'val' and epoch_acc > best_acc:
                best_acc = epoch_acc
                best_model_wts = copy.deepcopy(model.state_dict())
            if phase == 'val':
                val_acc_history.append(epoch_acc)

        print()

    time_elapsed = time.time() - since
    print('Training complete in {:.0f}m {:.0f}s'.format(time_elapsed // 60, time_elapsed % 60))
    print('Best val Acc: {:4f}'.format(best_acc))

    # load best model weights
    model.load_state_dict(best_model_wts)
    return model, val_acc_history

model, _ = train_model(model, {"train": train_loader, "val": val_loader}, criterion, optimizer, epochs)
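
In the statistics step, `loss.item()` is the mean loss over one batch (the default reduction of `nn.CrossEntropyLoss`), so it is multiplied by the batch size before being summed, and the sum is then divided by the dataset size. A small sketch with made-up numbers shows why this differs from simply averaging the per-batch values when the last batch is smaller:

```python
# Hypothetical per-batch mean losses and batch sizes (the last batch is smaller)
batch_losses = [0.9, 0.7, 0.5]
batch_sizes = [128, 128, 44]

running_loss = 0.0
for loss, size in zip(batch_losses, batch_sizes):
    running_loss += loss * size  # same bookkeeping as loss.item() * inputs.size(0)

epoch_loss = running_loss / sum(batch_sizes)        # per-sample average over the epoch
naive_mean = sum(batch_losses) / len(batch_losses)  # gives the small batch too much weight
print(round(epoch_loss, 4), round(naive_mean, 4))   # 0.756 0.7
```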

Training takes roughly an hour. That's it.
