ResNet Residual Network Notes

1 Environment Setup
1.1 NVIDIA Driver
My OS is Ubuntu 18.04. Following NVIDIA's official instructions for installing the driver on Ubuntu, I downloaded the installer from the official site, but running ./NVIDIA-Linux-x86_64-430.50.run --no-opengl-file produced the error below (the option is actually spelled --no-opengl-files), so I simply ran ./NVIDIA-Linux-x86_64-430.50.run and followed the on-screen instructions.

./nvidia-installer: unrecognized option: "--no-opengl-file"
ERROR: Invalid commandline, please run `./nvidia-installer --help` for usage information
sudo apt-get remove --purge nvidia*
E: Could not get lock /var/lib/dpkg/lock - open (11: Resource temporarily unavailable)
E: Unable to lock the administration directory (/var/lib/dpkg/), is another process using it?
# Fix
sudo rm /var/cache/apt/archives/lock
sudo rm -f /var/lib/dpkg/lock
sudo dpkg --configure -a

1.2 Installing the torch Environment

sudo adduser test
chmod u+w /etc/sudoers
echo "test ALL=(ALL) ALL" >> /etc/sudoers 
chmod u-w /etc/sudoers
# E: dpkg was interrupted, you must manually run 'sudo dpkg --configure -a' to correct the problem. 
sudo dpkg --configure -a
# Install pip3
sudo apt install python3-pip
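After installation, a quick way to confirm that PyTorch is usable and sees the GPU (a minimal check, assuming torch has already been installed with pip3):

# Quick sanity check that PyTorch is installed and the NVIDIA driver/CUDA is visible.
import torch

print(torch.__version__)
print(torch.cuda.is_available())        # True if the driver and CUDA runtime are usable
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))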

1.3 Remote Development Environment
Since my own computer has no GPU, I installed vnc4server and Jupyter on the Ubuntu machine so that I can program on it directly.
1.3.1 vnc

sudo apt-get install vnc4server
sudo apt install ubuntu-desktop gnome-panel gnome-settings-daemon metacity nautilus gnome-terminal -y
vnc4server
vi ~/.vnc/xstartup
#!/bin/sh
xsetroot -solid grey
vncconfig -iconic &
gnome-session &
gnome-panel &
gnome-settings-daemon &
metacity &
nautilus &
gnome-terminal &
# Start the vnc server
vnc4server
# Stop the vnc server
vnc4server -kill :1

1.3.2 Jupyter Notebook

pip3 install jupyter --user
cd ~/workspace
nohup jupyter notebook --ip=0.0.0.0 --port=8000 > jupyter.out 2>&1 &

Alternatively, follow jupyter-使用及设置密码 (Jupyter usage and password setup) and modify the configuration file:

c.NotebookApp.password ='sha1:702f1e914d32:d0ba8e455e642ffe70554c6e12dc3a6bd3e3ce2c'
c.NotebookApp.ip='0.0.0.0'
c.NotebookApp.port=8000
c.NotebookApp.notebook_dir='/home/test/workspace'
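To generate such a password hash yourself, a minimal sketch using the classic notebook package's passwd helper (run it in a Python shell; depending on the notebook version the result may start with sha1: or argon2:):

# Generate the password hash used in c.NotebookApp.password (classic Jupyter Notebook).
from notebook.auth import passwd

hashed = passwd()    # prompts for the password twice
print(hashed)        # paste the printed hash into jupyter_notebook_config.py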

For themes, fonts, and code auto-completion, see Jupyter Notebook主题字体设置及自动代码补全:

pip3 install --no-dependencies jupyterthemes==0.18.2 --user
pip3 install lesscpy --user
pip3 install --user matplotlib -i http://mirrors.aliyun.com/pypi/simple/ --trusted-host mirrors.aliyun.com
jt --lineh 140 -f consolamono -tf ptmono -t grade3 -ofs 14 -nfs 14 -tfs 14 -fs 14 -T -N
nohup jupyter notebook > jupyter.out 2>&1 &

Copy the URL shown in the red box of the figure below into a browser and you are done. See also 27 个Jupyter Notebook的小提示与技巧 (27 tips and tricks for Jupyter Notebook).
[Figure: Jupyter Notebook startup output with the access URL highlighted]
2 PSENet Backbone Network
See PSENet源码 and 神经网络的25个必熟概念. The backbone network used by PSENet is ResNet.
2.1 ResNet
From 经典分类CNN模型系列其四:Resnet I learned that ResNet is used here for feature extraction. The article describes how classification CNNs grew from AlexNet's 7 layers to 19 and then 22 layers without a clear gain in accuracy; instead, deeper networks converged more slowly. The figures in 深度学习—残差resnet网络原理详解 confirm this. My takeaways: first, inventing network architectures is best left to researchers rather than engineers; second, more layers is not automatically better, and simply stacking layers only makes performance plateau.
ResNet was proposed precisely to solve this degradation problem that appears once networks become very deep.
2.1.1 Residual Learning
From 什么是残差——一文让你读懂GBDT(梯度提升树) 和 Resnet (残差网络)的原理 I learned that a residual is the difference between an observed value and an estimated value (note that a loss function has a broader meaning). See also 深度神经网络优化策略之——残差学习 and 深度学习经典网络(4)ResNet深度残差网络结构详解.
Assume the input to layer $l$ is $X_l$:
$Y_l = W_l * X_l + b_l$
$X_{l+1} = \mathrm{relu}(Y_l)$
$Y_{l+1} = W_{l+1} * X_{l+1} + b_{l+1}$
$\mathrm{relu}(x)=\max(0,x) \Longrightarrow X_{l+2}=\mathrm{relu}(Y_{l+1})=\mathrm{relu}(W_{l+1}*X_{l+1}+b_{l+1}) \Longrightarrow W_{l+1}*(W_l*X_l+b_l)+b_{l+1}$
Define $F(W_l,X_l,b_l)=W_l*X_l+b_l$. By layer $L$ the residual formulation becomes
$X_L = X_l + \sum_{i=l}^{L-1} F(X_i,W_i)$. Given this formula, how does the residual network "skip" layers? To answer that we need to understand identity mappings.

  • Identity mapping
    The article 主干网络系列(2) -ResNet V2:深度残差网络中的恒等映射 still did not tell me clearly what an identity mapping (transformation) actually is; the term is mathematical and not very intuitive. It simply means the output equals the input, and in a residual block making the two sides identical is only possible thanks to the properties of the relu function.
  • Residual function
    The term $\sum_{i=l}^{L-1} F(X_i,W_i)$ above denotes the residual between any two units $L$ and $l$. See ResNet:Identity Mappings in Deep Residual Networks论文笔记.
  • Vanishing and exploding gradients
    On vanishing and exploding gradients: a weakness of the sigmoid function is that the gradient can shrink toward 0 (it vanishes) or grow without bound (it explodes). When the gradient $\delta$ is close to 0, the weight $w$ effectively stops updating: $w' = w - \eta\,\delta$.
    The workaround is to switch to relu, whose derivative (for positive inputs) is simply 1.
    Besides, once the weights stop updating and learning stalls, adding more layers is pointless.
    Hence the idea of computing $h(x)=f(x)+x$: if we want $h(x)\equiv x$, then $f(x)=0$, and the problem turns into learning the weights and bias of the residual function $f(x)=w*x+b$.
  • Why the residual function solves the vanishing gradient problem
    If vanishing gradients mean the weights and biases can no longer be learned, why does adding the residual connection fix this? Differentiate $X_L=X_l+\sum_{i=l}^{L-1} F(X_i,W_i)$ backward with the chain rule:
    $\frac{\partial E_{total}}{\partial x_l}=\frac{\partial E_{total}}{\partial x_L}\cdot\frac{\partial x_L}{\partial x_l}=\frac{\partial E_{total}}{\partial x_L}\cdot\frac{\partial\big(x_l+\sum_{i=l}^{L-1}F(x_i,w_i)\big)}{\partial x_l}=\frac{\partial E_{total}}{\partial x_L}\cdot\Big(1+\frac{\partial\sum_{i=l}^{L-1}F(x_i,w_i)}{\partial x_l}\Big)$
    Recall backpropagation in an ordinary convolutional network (taken from my pytorch 神经网络基本笔记): by the chain rule, the gradient is a product of per-parameter derivatives, and such a product tends to either blow up or shrink away, which is exactly the exploding/vanishing gradient problem.
    $\frac{\partial E_{total}}{\partial w_5}=\frac{\partial E_{total}}{\partial out_{o1}}\cdot\frac{\partial out_{o1}}{\partial net_{o1}}\cdot\frac{\partial net_{o1}}{\partial w_5}$
    Now look at the formula below: using $(u\pm v)'=u'\pm v'$ from calculus, the product structure turns into a sum, so the vanishing/exploding gradient problem is resolved (a small PyTorch sketch after this list illustrates the effect).
    $\frac{\partial E_{total}}{\partial x_l}=\frac{\partial E_{total}}{\partial x_L}\cdot\Big(1+\frac{\partial\sum_{i=l}^{L-1}F(x_i,w_i)}{\partial x_l}\Big)=\frac{\partial E_{total}}{\partial x_L}\cdot\Big(1+\sum_{i=l}^{L-1}\frac{\partial F(x_i,w_i)}{\partial x_l}\Big)$
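As mentioned above, here is a small PyTorch sketch (illustrative, using a toy linear layer rather than a real residual block) showing that the skip connection contributes the extra "1" term to the gradient at the block input:

# Toy illustration: the skip connection adds a direct gradient path (the "1" term).
import torch
from torch import nn

x = torch.randn(4, 8, requires_grad=True)
f = nn.Linear(8, 8)

# Plain block: y = f(x); residual block: y = f(x) + x
y_plain = f(x).sum()
g_plain = torch.autograd.grad(y_plain, x)[0]

y_res = (f(x) + x).sum()
g_res = torch.autograd.grad(y_res, x)[0]

# The residual gradient equals the plain gradient plus 1 in every coordinate,
# matching dE/dx_l = dE/dx_L * (1 + dF/dx_l) with dE/dx_L = 1 for a sum loss.
print(torch.allclose(g_res, g_plain + 1.0))   # True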

2.1.2 z-score Standardization
z-score standardization is suitable when the maximum and minimum of a feature are unknown, or when there are outliers that fall outside the normal value range. It assumes the raw data is approximately Gaussian; otherwise the normalized result is poor.
z-score standardization is only one of several standardization methods; the common goal is to make the data more comparable, at the cost of some interpretability.
nn.BatchNorm2d uses exactly this z-score standardization.

  • Gaussian distribution
    The Gaussian distribution is simply the normal distribution under another name. See 透彻理解高斯分布.
  • Centering
    See pytorch之添加BN. Centering corrects the center position of the data; the example in 数据中心化 shows the data becoming more concentrated.
    $x' = x - \overline{x}$, where $\overline{x}=\frac{1}{N}\sum_{i=1}^{N}x_i$ is the mean.
    After centering, the mean of the data is 0, which is easy to verify:
    $\overline{x'}=\frac{1}{N}\sum_{i=1}^{N}(x_i-\overline{x})=\frac{1}{N}\sum_{i=1}^{N}x_i-\frac{1}{N}\cdot N\cdot\overline{x}=\overline{x}-\overline{x}=0$
  • Standardization
    First, a refresher on the standard deviation (also called the root mean square deviation): from the formula below, $\sigma$ measures the average distance of the data points from the mean.
    Two datasets can share the same mean $\mu$ yet have different $\sigma$; the gap between measured values and true values is the most decisive criterion for judging a measurement method.
    $x'=\frac{x-\mu}{\sigma}$, where $\mu=\overline{x}$ and $\sigma=\sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i-\overline{x})^2}$
    After z-score standardization the standard deviation is 1; proving this involves some probability and statistics (both facts are also checked numerically in the sketch after this list).
    First, the mathematical expectation. In practical problems we usually care about particular characteristics of a random variable; constants that capture such characteristics are called numerical characteristics, and they include the expectation, the variance, the correlation coefficient, and the moments. The expectation describes the average magnitude of $X$, is written $E(X)$, and is also called the mean.
    Next, the variance, which describes how far the values deviate from the mean.
    Variance: $D(X)=\mathrm{Var}(X)=E\{[X-E(X)]^2\}=E(X^2)-[E(X)]^2$
    Standard deviation: $\sqrt{D(X)}=\sqrt{\mathrm{Var}(X)}$. With $x'=\frac{x-\mu}{\sigma}$ and $E(x')=0$, we get $D(x')=E\big((x')^2\big)-\big(E(x')\big)^2=E\big((x')^2\big)=\frac{1}{\sigma^2}E\big[(x-\mu)^2\big]=\frac{\sigma^2}{\sigma^2}=1$, using the variance definition $D(x)=E\{[x-E(x)]^2\}$.
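A small numpy check of these two facts (illustrative data; centering gives mean 0, z-score gives mean 0 and standard deviation 1):

# Numerical check of centering and z-score standardization on arbitrary data.
import numpy as np

x = np.random.rand(1000) * 50 + 10            # arbitrary positive data
centered = x - x.mean()
z = (x - x.mean()) / x.std()

print(round(centered.mean(), 6))              # ~0.0
print(round(z.mean(), 6), round(z.std(), 6))  # ~0.0, 1.0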

2.1.3 BN Layer
See 神经网络之BN层Batch normalization理解. As the figure below shows, the BN layer performs exactly z-score standardization, with an extra $\epsilon$ term added for numerical stability so that the denominator never becomes (or gets arbitrarily close to) 0. See the pytorch 中文文档 for details; the last step of the formula applies an additional linear (affine) transformation to the standardized values.
Doing z-score standardization in the BN layer implicitly assumes the data roughly follows a normal distribution; after the transform the data has mean 0 and standard deviation 1, which for Gaussian data is exactly the standard normal distribution.
[Figure: the Batch Normalization transform, showing the z-score step with the ε term and the final affine transformation]

# Simulate the BN layer computation
import torch
from torch import nn
import numpy as np
# Define a feature map N*C*H*W: N samples, C channels per sample, height H, width W.
# Below: 10 samples, 3 channels, height 32, width 32.
# np.random.rand draws uniformly from [0, 1); scale it up by 1000.
X = np.random.rand(10, 3, 32, 32) * 1000
# First transpose (N, C, H, W) to (C, N, H, W), then flatten everything except the
# channel dimension, so X1 has shape (3, 10*32*32) = (3, 10240).
X1 = X.transpose((1, 0, 2, 3)).reshape(3, -1)
# Per-channel mean: axis=1 averages over each row, giving a (3,) array.
# Reshape it to (1, 3, 1, 1) so it broadcasts against X for the z-score computation.
mu = X1.mean(axis=1).reshape(1, 3, 1, 1)
print(mu)
# Per-channel variance
var = X1.var(axis=1).reshape(1, 3, 1, 1)
# Stability constant
epsilon = 1e-5
# z-score normalization; BatchNorm adds epsilon inside the square root
bn = (X - mu) / np.sqrt(var + epsilon)
# torch's BN module can additionally apply a learned affine transform (disabled here);
# num_features must equal the number of channels (3)
model = nn.BatchNorm2d(num_features=3, eps=epsilon, affine=False, track_running_stats=False)
X2 = torch.from_numpy(X)
t_bn = model(X2)
# Difference from the official implementation
print('diff:{}'.format((t_bn.numpy() - bn).sum()))

2.1.4 Basic Block
For a reading of the code see ResNet详解. The article notes that training a neural network with gradient descent requires tuning many hyperparameters such as the learning rate; the choice of parameters strongly affects the result and costs a lot of time. The BN algorithm was designed precisely to reduce this manual tuning.
2.1.4.1 ResNet18
The ResNet18 network contains 17 convolutional layers and 1 fully connected layer; the "18" does not count pooling or BN layers. See resnet源码介绍; the official torchvision package already provides several ResNet model variants.
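As a quick check of the layer count, the sketch below (illustrative only, not part of the PSENet code) loads torchvision's resnet18 and counts its Conv2d and Linear modules; note that the three 1x1 downsample convolutions are counted on top of the 17 main-path convolutions:

# Count the convolutional and fully connected layers of torchvision's resnet18.
import torch
from torchvision import models

model = models.resnet18()
convs = [m for m in model.modules() if isinstance(m, torch.nn.Conv2d)]
fcs = [m for m in model.modules() if isinstance(m, torch.nn.Linear)]
# 17 main-path convolutions + 3 one-by-one downsample convolutions = 20
print(len(convs), len(fcs))   # 20 1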

if stride != 1:
    self.downsample = Sequential()
    self.downsample.add(layers.Conv2D(filter_num, (1, 1), strides=stride))
else:
    self.downsample = lambda x: x

whereas in torch:

# The snippet below lives in BasicBlock.forward
if self.downsample is not None:
    residual = self.downsample(x)

# The snippet below lives in _make_layer
downsample = None
if stride != 1 or self.inplanes != planes * block.expansion:
    downsample = nn.Sequential(
        nn.Conv2d(self.inplanes, planes * block.expansion,
                  kernel_size=1, stride=stride, bias=False),
        nn.BatchNorm2d(planes * block.expansion),
    )
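The skip connection can only be added element-wise when the identity branch and the residual branch have the same shape; whenever the stride or the channel count changes, the 1x1 convolution (plus BN) projects the identity to the matching shape. A minimal shape check (illustrative, made-up sizes):

# Illustrative shape check for the downsample branch (made-up sizes).
import torch
from torch import nn

x = torch.randn(1, 64, 56, 56)            # identity branch: 64 channels, 56x56
conv = nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1, bias=False)
out = conv(x)                             # residual branch: 128 channels, 28x28
downsample = nn.Sequential(
    nn.Conv2d(64, 128, kernel_size=1, stride=2, bias=False),
    nn.BatchNorm2d(128),
)
residual = downsample(x)                  # now also 128 channels, 28x28
print(out.shape, residual.shape)          # both torch.Size([1, 128, 28, 28])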
  • Two-layer residual block (BasicBlock)
class BasicBlock(nn.Module):
    expansion = 1

    def __init__(self, inplanes, planes, stride=1, downsample=None):
        super(BasicBlock, self).__init__()
        self.conv1 = conv3x3(inplanes, planes, stride)
        self.bn1 = nn.BatchNorm2d(planes)
        self.relu = nn.ReLU(inplace=True)
        self.conv2 = conv3x3(planes, planes)
        self.bn2 = nn.BatchNorm2d(planes)
        self.downsample = downsample
        self.stride = stride

    def forward(self, x):
        residual = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)

        if self.downsample is not None:
            residual = self.downsample(x)

        out += residual
        out = self.relu(out)

        return out
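A quick smoke test of the block (illustrative; torchvision's BasicBlock has the same structure as the code above):

# Smoke test of a two-layer residual block using torchvision's BasicBlock.
import torch
from torchvision.models.resnet import BasicBlock

block = BasicBlock(inplanes=64, planes=64)   # stride 1, no downsample needed
x = torch.randn(2, 64, 56, 56)
print(block(x).shape)                        # torch.Size([2, 64, 56, 56])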

2.1.5 Bottleneck Block

import math

import torch
import torch.nn as nn
import torch.nn.functional as F


class Bottleneck(nn.Module):
    expansion = 4

    def __init__(self, inplanes, planes, stride=1, downsample=None):
        super(Bottleneck, self).__init__()
        self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, bias=False)
        self.bn1 = nn.BatchNorm2d(planes)
        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride,
                               padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(planes)
        self.conv3 = nn.Conv2d(planes, planes * 4, kernel_size=1, bias=False)
        self.bn3 = nn.BatchNorm2d(planes * 4)
        self.relu = nn.ReLU(inplace=True)
        self.downsample = downsample
        self.stride = stride

    def forward(self, x):
        residual = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)
        out = self.relu(out)

        out = self.conv3(out)
        out = self.bn3(out)

        if self.downsample is not None:
            residual = self.downsample(x)

        out += residual
        out = self.relu(out)

        return out
class ResNet(nn.Module):

    def __init__(self, block, layers, num_classes=7, scale=1):
        self.inplanes = 64
        super(ResNet, self).__init__()
        self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3,
                               bias=False)
        # nn.BatchNorm2d helps prevent vanishing/exploding gradients;
        # its argument equals the number of output channels of the preceding convolution
        self.bn1 = nn.BatchNorm2d(64)
        # inplace=True modifies the tensor coming from the Conv2d above in place,
        # which saves memory because no extra tensor needs to be stored
        self.relu1 = nn.ReLU(inplace=True)
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.layer1 = self._make_layer(block, 64, layers[0])
        self.layer2 = self._make_layer(block, 128, layers[1], stride=2)
        self.layer3 = self._make_layer(block, 256, layers[2], stride=2)
        self.layer4 = self._make_layer(block, 512, layers[3], stride=2)
        # self.avgpool = nn.AvgPool2d(7, stride=1)
        # self.fc = nn.Linear(512 * block.expansion, num_classes)

        # Top layer
        self.toplayer = nn.Conv2d(2048, 256, kernel_size=1, stride=1, padding=0)  # Reduce channels
        self.toplayer_bn = nn.BatchNorm2d(256)
        self.toplayer_relu = nn.ReLU(inplace=True)

        # Smooth layers
        self.smooth1 = nn.Conv2d(256, 256, kernel_size=3, stride=1, padding=1)
        self.smooth1_bn = nn.BatchNorm2d(256)
        self.smooth1_relu = nn.ReLU(inplace=True)

        self.smooth2 = nn.Conv2d(256, 256, kernel_size=3, stride=1, padding=1)
        self.smooth2_bn = nn.BatchNorm2d(256)
        self.smooth2_relu = nn.ReLU(inplace=True)

        self.smooth3 = nn.Conv2d(256, 256, kernel_size=3, stride=1, padding=1)
        self.smooth3_bn = nn.BatchNorm2d(256)
        self.smooth3_relu = nn.ReLU(inplace=True)

        # Lateral layers
        self.latlayer1 = nn.Conv2d(1024, 256, kernel_size=1, stride=1, padding=0)
        self.latlayer1_bn = nn.BatchNorm2d(256)
        self.latlayer1_relu = nn.ReLU(inplace=True)

        self.latlayer2 = nn.Conv2d(512,  256, kernel_size=1, stride=1, padding=0)
        self.latlayer2_bn = nn.BatchNorm2d(256)
        self.latlayer2_relu = nn.ReLU(inplace=True)

        self.latlayer3 = nn.Conv2d(256,  256, kernel_size=1, stride=1, padding=0)
        self.latlayer3_bn = nn.BatchNorm2d(256)
        self.latlayer3_relu = nn.ReLU(inplace=True)

        self.conv2 = nn.Conv2d(1024, 256, kernel_size=3, stride=1, padding=1)
        self.bn2 = nn.BatchNorm2d(256)
        self.relu2 = nn.ReLU(inplace=True)
        self.conv3 = nn.Conv2d(256, num_classes, kernel_size=1, stride=1, padding=0)

        self.scale = scale
        
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
                m.weight.data.normal_(0, math.sqrt(2. / n))
            elif isinstance(m, nn.BatchNorm2d):
                m.weight.data.fill_(1)
                m.bias.data.zero_()

    def _make_layer(self, block, planes, blocks, stride=1):
        downsample = None
        if stride != 1 or self.inplanes != planes * block.expansion:
            downsample = nn.Sequential(
                nn.Conv2d(self.inplanes, planes * block.expansion,
                          kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(planes * block.expansion),
            )

        layers = []
        layers.append(block(self.inplanes, planes, stride, downsample))
        self.inplanes = planes * block.expansion
        for i in range(1, blocks):
            layers.append(block(self.inplanes, planes))

        return nn.Sequential(*layers)

    def _upsample(self, x, y, scale=1):
        _, _, H, W = y.size()
        return F.upsample(x, size=(H // scale, W // scale), mode='bilinear')

    def _upsample_add(self, x, y):
        _, _, H, W = y.size()
        return F.upsample(x, size=(H, W), mode='bilinear') + y

    def forward(self, x):
        h = x
        h = self.conv1(h)
        h = self.bn1(h)
        h = self.relu1(h)
        h = self.maxpool(h)

        h = self.layer1(h)
        c2 = h
        h = self.layer2(h)
        c3 = h
        h = self.layer3(h)
        c4 = h
        h = self.layer4(h)
        c5 = h

        # Top-down
        p5 = self.toplayer(c5)
        p5 = self.toplayer_relu(self.toplayer_bn(p5))

        c4 = self.latlayer1(c4)
        c4 = self.latlayer1_relu(self.latlayer1_bn(c4))
        p4 = self._upsample_add(p5, c4)
        p4 = self.smooth1(p4)
        p4 = self.smooth1_relu(self.smooth1_bn(p4))

        c3 = self.latlayer2(c3)
        c3 = self.latlayer2_relu(self.latlayer2_bn(c3))
        p3 = self._upsample_add(p4, c3)
        p3 = self.smooth2(p3)
        p3 = self.smooth2_relu(self.smooth2_bn(p3))        

        c2 = self.latlayer3(c2)
        c2 = self.latlayer3_relu(self.latlayer3_bn(c2))
        p2 = self._upsample_add(p3, c2)
        p2 = self.smooth3(p2)
        p2 = self.smooth3_relu(self.smooth3_bn(p2))

        p3 = self._upsample(p3, p2)
        p4 = self._upsample(p4, p2)
        p5 = self._upsample(p5, p2)

        out = torch.cat((p2, p3, p4, p5), 1)
        out = self.conv2(out)
        out = self.relu2(self.bn2(out))
        out = self.conv3(out)
        out = self._upsample(out, x, scale=self.scale)

        return out
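A minimal sketch of how this backbone might be instantiated (illustrative; it assumes the usual PSENet-style constructor ResNet(Bottleneck, [3, 4, 6, 3]) for a ResNet50 backbone, and num_classes here is the number of output kernel maps, not image classes):

# Illustrative usage of the FPN-style ResNet backbone defined above.
import torch

def resnet50_backbone(**kwargs):
    # ResNet50-style stage configuration built from Bottleneck blocks
    return ResNet(Bottleneck, [3, 4, 6, 3], **kwargs)

model = resnet50_backbone(num_classes=7, scale=1)
x = torch.randn(1, 3, 224, 224)   # dummy RGB image
out = model(x)
print(out.shape)                  # torch.Size([1, 7, 224, 224]): one map per kernel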

3 Progressive Scale Expansion Algorithm (PSE)
See 最全的曲文检测整理 (a comprehensive roundup of curved text detection).

4 ResNet Based on TensorFlow 2.0
For visualizing the ResNet, refer to the TensorBoard notes in TensorFlow学习笔记.
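As a reminder of the basic TF2 TensorBoard API (a minimal sketch, not taken from the referenced notes; the log directory name is arbitrary), scalars can be written like this and inspected with tensorboard --logdir logs:

# Minimal TensorBoard scalar logging sketch for TF2.
import tensorflow as tf

writer = tf.summary.create_file_writer('logs/resnet')
with writer.as_default():
    for step in range(3):
        # In the training loop below, `loss` would be the real training loss value.
        tf.summary.scalar('train/loss', 1.0 / (step + 1), step=step)
writer.flush()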

import os
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, optimizers, Sequential,datasets

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
# os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
tf.random.set_seed(2311)

gpus = tf.config.experimental.list_physical_devices(device_type='GPU')

# tf.config.experimental.set_virtual_device_configuration(
#     gpus[0],
#     [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=3072)]
# )

for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)

def conv3x3(out_dims, stride=1):
    '''
    3x3 convolution kernel
    '''
    return layers.Conv2D(filters=out_dims, kernel_size=3, strides=stride, padding='same')


def conv1x1(out_dims, stride=1):
    '''
    1x1 convolution kernel
    '''
    return layers.Conv2D(filters=out_dims, kernel_size=1, strides=stride)


class BasicBlock(layers.Layer):
    '''
    Two-layer residual block, used in resnet18/34
    A Res Block (stage) is composed of multiple BasicBlocks
    '''
    #
    expansion = 1

    def __init__(self, out_dims, stride=1, downsample=None):
        '''
        @param out_dims: number of output channels of the block's convolutions
        @param stride: stride, default 1, set according to the actual argument passed in
        '''
        super(BasicBlock, self).__init__()
        # First layer of the residual block
        self.conv1 = conv3x3(out_dims, stride)
        self.bn1 = layers.BatchNormalization()
        self.relu = layers.Activation('relu')
        # Second layer of the residual block
        self.conv2 = conv3x3(out_dims, stride=1)
        self.bn2 = layers.BatchNormalization()
        # Downsample branch
        self.downsample = downsample
        self.stride = stride

    def call(self, x):
        identity = x
        #
        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)
        #
        out = self.conv2(out)
        out = self.bn2(out)
        # When the two branches have different shapes, a 1x1 convolution projects the identity to the matching shape so they can be added
        if self.downsample:
            identity = self.downsample(x)
        # Implements H(x) = F(x) + x
        out = layers.add([out, identity])
        out = self.relu(out)
        return out


class Bottleneck(layers.Layer):
    '''
    Three-layer residual block, used in resnet50/101/152
    '''
    #
    expansion = 4

    def __init__(self, out_dims, stride=1, downsample=None):
        super(Bottleneck, self).__init__()
        # First layer of the residual block
        self.conv1 = conv1x1(out_dims)
        self.bn1 = layers.BatchNormalization()
        # Second layer of the residual block
        self.conv2 = conv3x3(out_dims, stride)
        self.bn2 = layers.BatchNormalization()
        # Third layer of the residual block
        self.conv3 = conv1x1(out_dims * self.expansion)
        self.bn3 = layers.BatchNormalization()
        self.relu = layers.Activation('relu')
        #
        self.downsample = downsample
        self.stride = stride

    def call(self, x):
        identity = x
        #
        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)
        #
        out = self.conv2(out)
        out = self.bn2(out)
        out = self.relu(out)
        #
        out = self.conv3(out)
        out = self.bn3(out)
        #
        if self.downsample:
            identity = self.downsample(x)
        out = layers.add([out, identity])
        out = self.relu(out)
        #
        return out


class ResNet(keras.Model):
    '''
    Residual network
    '''

    def __init__(self, block, block_cn, num_classes=10):
        '''
        @param block: the residual block type (two-layer BasicBlock or three-layer Bottleneck)
        @param block_cn: number of residual blocks in each stage
        @param num_classes: number of classes
        '''
        super(ResNet, self).__init__()
        self.in_dims = 64
        # Stem (pre-processing) layers
        # Convert the 3-channel RGB input to 64 channels using a 7x7 convolution
        # @see https://www.cnblogs.com/wanghui-garcia/p/10775860.html
        self.stem = Sequential([
            layers.Conv2D(filters=64, kernel_size=7, strides=2, padding='same'),
            layers.BatchNormalization(),
            layers.Activation('relu'),
            layers.MaxPool2D(pool_size=3, strides=2, padding='same')
        ])
        # Note: the widths do not have to grow by a factor of 2 at every stage; these are empirical values
        self.layer1 = self._make_layer(block, 64, block_cn[0])
        self.layer2 = self._make_layer(block, 128, block_cn[1], stride=2)
        self.layer3 = self._make_layer(block, 256, block_cn[2], stride=2)
        self.layer4 = self._make_layer(block, 512, block_cn[3], stride=2)
        # top layers
        self.avgpool=layers.GlobalAveragePooling2D()
        self.fc=layers.Dense(num_classes)

    def _make_layer(self, block, out_dims, block_cn, stride=1):
        '''
        @param block_cn: how many residual blocks this stage contains
        '''
        downsample = None
        layer = Sequential()
        # Downsample branch for the skip connection
        if stride != 1 or self.in_dims != out_dims * block.expansion:
            downsample = Sequential([
                conv1x1(out_dims * block.expansion, stride),
                layers.BatchNormalization()
            ])
        #
        layer.add(block(out_dims, stride, downsample))
        self.in_dims = out_dims * block.expansion
        for i in range(1, block_cn):
            layer.add(block(out_dims))
        return layer

    def _up_sample(self, x, y):
        '''
        Upsampling
        [b,h,w,c]
        '''
        _, H, W, _ = y.shape
        _, h, w, _ = x.shape
        return layers.UpSampling2D(size=(H // h, W // w))(x)

    def _upsample_add(self, x, y):
        _, H, W, _ = y.shape
        _, h, w, _ = x.shape
        return layers.UpSampling2D(size=(H // h, W // w),interpolation='bilinear')(x) + y

    def call(self, x):
        # Stem (pre-processing)
        out = self.stem(x)
        #
        c2 = self.layer1(out)
        c3 = self.layer2(c2)
        c4 = self.layer3(c3)
        c5 = self.layer4(c4)
        out=self.avgpool(c5)
        out=self.fc(out)
        return out


def resnet18(**kwargs):
    model = ResNet(BasicBlock, [2, 2, 2, 2], **kwargs)
    return model


def resnet34(**kwargs):
    model = ResNet(BasicBlock, [3, 4, 6, 3], **kwargs)
    return model


def resnet50(**kwargs):
    model = ResNet(Bottleneck, [3, 4, 6, 3], **kwargs)
    return model


def resnet101(**kwargs):
    model = ResNet(Bottleneck, [3, 4, 23, 3], **kwargs)
    return model


def resnet152(**kwargs):
    model = ResNet(Bottleneck, [3, 8, 36, 3], **kwargs)
    return model

# Training
def preprocess(x,y):
    x=2*tf.cast(x,dtype=tf.float32)/255.-1
    y=tf.cast(y,dtype=tf.int32)
    return x,y
(x_train,y_train),(x_test,y_test) = datasets.cifar10.load_data()
y_train = tf.squeeze(y_train,axis=1)
y_test = tf.squeeze(y_test,axis=1)
train_data=tf.data.Dataset.from_tensor_slices((x_train,y_train))
train_data=train_data.shuffle(1000).map(preprocess).batch(64)
test_data=tf.data.Dataset.from_tensor_slices((x_test,y_test))
test_data=test_data.map(preprocess).batch(64)
sample=next(iter(train_data))
print('sample:',sample[0].shape,sample[1].shape,
      tf.reduce_min(sample[0]),tf.reduce_max(sample[0]))

def main():
    model=resnet18()
    model.build(input_shape=(None,32,32,3))
    model.summary()
    optimizer=optimizers.Adam(lr=1e-3)
    for epoch in range(50):
        for step,(x,y) in enumerate(train_data):
            with tf.GradientTape() as tape:
                logits=model(x)
                y_onehot=tf.one_hot(y,depth=10)
                loss=tf.losses.categorical_crossentropy(y_onehot,logits,from_logits=True)
                loss=tf.reduce_mean(loss)
            grads=tape.gradient(loss,model.trainable_variables)
            optimizer.apply_gradients(zip(grads,model.trainable_variables))
            if step%100==0:
                print(epoch,step,'loss',float(loss))
        total_num=0
        total_correct=0
        for x,y in test_data:
            logits=model(x)
            prob=tf.nn.softmax(logits,axis=1)
            pred=tf.argmax(prob,axis=1)
            pred=tf.cast(pred,dtype=tf.int32)
            correct=tf.cast(tf.equal(pred,y),dtype=tf.int32)
            correct=tf.reduce_sum(correct)
            total_num+=x.shape[0]
            total_correct+=int(correct)
        acc=total_correct/total_num
        print(epoch,'acc:',acc)
    print('Training finished')

if __name__ == '__main__':
    main()