PaddlePaddle Image Segmentation 7-Day Bootcamp: FCN Study Notes
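I. Convolutional Neural Networks
A convolutional neural network usually contains the following kinds of layers:
• Convolutional layer: each convolutional layer consists of several convolution units, and the parameters of every unit are optimized with the backpropagation algorithm. The purpose of convolution is to extract different features from the input. The first convolutional layer may only extract low-level features such as edges, lines, and corners; deeper networks iteratively build more complex features from these low-level ones.
• Rectified linear units layer (ReLU layer): the activation function of this layer is the rectified linear unit (ReLU), f(x) = max(0, x).
• Pooling layer: the features produced by a convolutional layer usually have a very high dimension. Pooling splits the feature map into regions and takes the maximum or the average of each region, which gives a new feature map with a smaller dimension.
• Fully connected layer: combines all local features into global features, which are used to compute the final score for each class.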
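To make the four layer types above concrete, here is a minimal NumPy sketch that pushes a toy image through one of each. The function names, the hand-written `edge_kernel`, and the random `fc_weights` are illustrative assumptions; in a real network the weights are learned by backpropagation.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and sum the products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=image.dtype)
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """f(x) = max(0, x), applied element-wise."""
    return np.maximum(0, x)

def max_pool_2x2(x):
    """Split the map into non-overlapping 2x2 regions and keep each maximum."""
    h, w = x.shape
    x = x[:h - h % 2, :w - w % 2]  # drop an odd trailing row or column
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

# Toy 5x5 image with a vertical edge between columns 1 and 2
image = np.array([[0, 0, 1, 1, 1]] * 5, dtype=np.float32)
# Hypothetical hand-written kernel that responds to a dark-to-bright step
edge_kernel = np.array([[-1, 1]], dtype=np.float32)

features = conv2d_valid(image, edge_kernel)  # convolutional layer, shape (5, 4)
activated = relu(features)                   # ReLU layer
pooled = max_pool_2x2(activated)             # pooling layer, shape (2, 2)

# Fully connected layer: flatten the features, one weight row per class
rng = np.random.default_rng(0)
fc_weights = rng.standard_normal((3, pooled.size)).astype(np.float32)
class_scores = fc_weights @ pooled.reshape(-1)  # one score per class, shape (3,)
```

The pooled map keeps a response only in the column where the edge sits, which is the "smaller but still informative" feature the pooling bullet describes.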
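II. FCN
1. What is an FCN?
Note: an FCN (fully convolutional network) turns an image classification network into an image segmentation network.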
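One way to see the step from classification to segmentation: a fully connected layer over a 7x7 feature map computes the same numbers as a 7x7 convolution with the same weights, and the convolution also accepts larger inputs, where it produces a map of scores instead of a single vector. The sketch below checks this in NumPy with toy sizes (4 channels, 5 outputs); all names and sizes here are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
channels, size, num_outputs = 4, 7, 5  # toy sizes chosen for this example

# Weights of a fully connected layer, and the same weights viewed as conv kernels
fc_weights = rng.standard_normal((num_outputs, channels * size * size))
conv_kernels = fc_weights.reshape(num_outputs, channels, size, size)

def conv_valid(feature_map, kernels):
    """Multi-channel convolution with stride 1 and no padding."""
    n_out, _, kh, kw = kernels.shape
    _, h, w = feature_map.shape
    out = np.zeros((n_out, h - kh + 1, w - kw + 1))
    for i in range(out.shape[1]):
        for j in range(out.shape[2]):
            window = feature_map[:, i:i + kh, j:j + kw]
            # Sum over channel, row and column for every output kernel
            out[:, i, j] = np.tensordot(kernels, window, axes=3)
    return out

# On a 7x7 input both layers give the same scores
small = rng.standard_normal((channels, size, size))
fc_out = fc_weights @ small.reshape(-1)      # shape (5,)
conv_out = conv_valid(small, conv_kernels)   # shape (5, 1, 1)

# On a larger input only the convolution works, and it yields a score map
large = rng.standard_normal((channels, 10, 10))
score_map = conv_valid(large, conv_kernels)  # shape (5, 4, 4)
```

The coarse score map is what the upsampling steps in the next subsection enlarge back to image size.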
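2. How to implement an FCN
The feature map is enlarged back toward the image size by upsampling, transposed convolution (deconvolution), or unpooling.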
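The two parameter-free options can be sketched in a few lines of NumPy. This is only an illustration on a single 4x4 map: nearest-neighbour repetition stands in for upsampling (other interpolation schemes exist), and unpooling is shown in the variant that remembers where each maximum came from. The function names are assumptions for this example.

```python
import numpy as np

def upsample_nearest(x, scale=2):
    """Upsampling: repeat every pixel `scale` times along both axes."""
    return x.repeat(scale, axis=0).repeat(scale, axis=1)

def max_pool_with_indices(x):
    """2x2 max pooling that also records the position of each maximum in its window."""
    h, w = x.shape
    windows = (x.reshape(h // 2, 2, w // 2, 2)
                .transpose(0, 2, 1, 3)
                .reshape(h // 2, w // 2, 4))
    return windows.max(axis=2), windows.argmax(axis=2)

def max_unpool(pooled, indices):
    """Unpooling: put each value back at its recorded position, zeros elsewhere."""
    ph, pw = pooled.shape
    windows = np.zeros((ph, pw, 4), dtype=pooled.dtype)
    rows, cols = np.indices((ph, pw))
    windows[rows, cols, indices] = pooled
    return (windows.reshape(ph, pw, 2, 2)
                   .transpose(0, 2, 1, 3)
                   .reshape(ph * 2, pw * 2))

x = np.array([[1, 2, 0, 0],
              [3, 4, 0, 5],
              [0, 0, 7, 0],
              [6, 0, 0, 0]], dtype=np.float32)

pooled, indices = max_pool_with_indices(x)  # 4x4 -> 2x2
upsampled = upsample_nearest(pooled)        # 2x2 -> 4x4, values repeated
unpooled = max_unpool(pooled, indices)      # 2x2 -> 4x4, values at original spots
```

Both results are 4x4 again, but upsampling spreads each value over its whole window while unpooling keeps it at one location.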
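Transposed convolution: the sliding-window computation of a convolution is rewritten as a matrix multiplication, and the transposed convolution multiplies by the transpose of that matrix.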
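A small 1-D example may make the matrix view easier to follow. The helper `conv_matrix` below is a hypothetical name for this sketch: it builds a matrix C so that C @ x equals a strided sliding-window convolution, and then uses C.T to map a short vector back to the longer length.

```python
import numpy as np

def conv_matrix(kernel, input_len, stride):
    """Build C so that C @ x is a 1-D strided convolution of x (no padding)."""
    k = len(kernel)
    output_len = (input_len - k) // stride + 1
    C = np.zeros((output_len, input_len))
    for row in range(output_len):
        start = row * stride
        C[row, start:start + k] = kernel  # one shifted copy of the kernel per row
    return C

kernel = np.array([1.0, 2.0, 3.0])
C = conv_matrix(kernel, input_len=7, stride=2)  # shape (3, 7)

# Convolution: the sliding window and the matrix product agree
x = np.arange(7, dtype=np.float64)
sliding = np.array([np.dot(kernel, x[i:i + 3]) for i in range(0, 5, 2)])
y = C @ x                                        # length 7 -> length 3

# Transposed convolution: multiply by C.T to go from length 3 back to length 7
small = np.array([1.0, 0.0, 2.0])
upsampled = C.T @ small                          # (3 - 1) * 2 + 3 = 7 values
```

Each input value stamps a scaled copy of the kernel into the output, `stride` positions apart, so the output length is (input_len - 1) * stride + kernel_size. C.T is not the inverse of C: it restores the shape, not the original values.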
3. FCN network structure
III. FCN Implementation Code
import numpy as np
import paddle.fluid as fluid
from paddle.fluid.dygraph import to_variable
from paddle.fluid.dygraph import Conv2D
from paddle.fluid.dygraph import Conv2DTranspose
from paddle.fluid.dygraph import Dropout
from paddle.fluid.dygraph import BatchNorm
from paddle.fluid.dygraph import Pool2D
from paddle.fluid.dygraph import Linear
from vgg import VGG16BN  # assumed to be the local vgg.py module from the course materials
class FCN8s(fluid.dygraph.Layer):
    # FCN-8s on a VGG16-BN backbone: fuses class scores from pool3, pool4 and fc7
    def __init__(self, num_classes=59):
        super(FCN8s, self).__init__()
        backbone = VGG16BN(pretrained=False)

        # VGG16 stages; the first conv is padded by 100 pixels so that the
        # crops in forward() always stay inside the feature maps
        self.layer1 = backbone.layer1
        self.layer1[0].conv._padding = [100, 100]
        self.pool1 = Pool2D(pool_size=2, pool_stride=2, ceil_mode=True)
        self.layer2 = backbone.layer2
        self.pool2 = Pool2D(pool_size=2, pool_stride=2, ceil_mode=True)
        self.layer3 = backbone.layer3
        self.pool3 = Pool2D(pool_size=2, pool_stride=2, ceil_mode=True)
        self.layer4 = backbone.layer4
        self.pool4 = Pool2D(pool_size=2, pool_stride=2, ceil_mode=True)
        self.layer5 = backbone.layer5
        self.pool5 = Pool2D(pool_size=2, pool_stride=2, ceil_mode=True)

        # The fully connected layers of VGG, rewritten as convolutions
        self.fc6 = Conv2D(512, 4096, 7, act='relu')
        self.fc7 = Conv2D(4096, 4096, 1, act='relu')
        self.drop6 = Dropout()
        self.drop7 = Dropout()

        # 1x1 convolutions that turn features into per-class scores
        self.score = Conv2D(4096, num_classes, 1)
        self.score_pool3 = Conv2D(256, num_classes, 1)
        self.score_pool4 = Conv2D(512, num_classes, 1)

        # Transposed convolutions: 2x, 2x, then 8x upsampling
        self.up_output = Conv2DTranspose(num_channels=num_classes,
                                         num_filters=num_classes,
                                         filter_size=4,
                                         stride=2,
                                         bias_attr=False)
        self.up_pool4 = Conv2DTranspose(num_channels=num_classes,
                                        num_filters=num_classes,
                                        filter_size=4,
                                        stride=2,
                                        bias_attr=False)
        self.up_final = Conv2DTranspose(num_channels=num_classes,
                                        num_filters=num_classes,
                                        filter_size=16,
                                        stride=8,
                                        bias_attr=False)

    def forward(self, inputs):
        # Encoder: keep the pool3 (1/8) and pool4 (1/16) feature maps
        x = self.layer1(inputs)
        x = self.pool1(x)
        x = self.layer2(x)
        x = self.pool2(x)
        x = self.layer3(x)
        x = self.pool3(x)
        pool3 = x
        x = self.layer4(x)
        x = self.pool4(x)
        pool4 = x
        x = self.layer5(x)
        x = self.pool5(x)

        # Classifier head
        x = self.fc6(x)
        x = self.drop6(x)
        x = self.fc7(x)
        x = self.drop7(x)
        x = self.score(x)

        # 2x upsampling, then fuse with the cropped pool4 scores
        x = self.up_output(x)
        up_output = x
        x = self.score_pool4(pool4)
        x = x[:, :, 5:5 + up_output.shape[2], 5:5 + up_output.shape[3]]
        up_pool4 = x
        x = up_pool4 + up_output

        # 2x upsampling again, then fuse with the cropped pool3 scores
        x = self.up_pool4(x)
        up_pool4 = x
        x = self.score_pool3(pool3)
        x = x[:, :, 9:9 + up_pool4.shape[2], 9:9 + up_pool4.shape[3]]
        up_pool3 = x
        x = up_pool3 + up_pool4

        # 8x upsampling, then crop back to the input size
        x = self.up_final(x)
        x = x[:, :, 31:31 + inputs.shape[2], 31:31 + inputs.shape[3]]
        return x

def main():
    with fluid.dygraph.guard():
        # Random batch: 2 RGB images of 512x512
        x_data = np.random.rand(2, 3, 512, 512).astype(np.float32)
        x = to_variable(x_data)
        model = FCN8s(num_classes=59)
        model.eval()
        pred = model(x)
        # The final crop in forward() uses the input's height and width
        print(pred.shape)


if __name__ == '__main__':
    main()
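The crop offsets 5, 9 and 31 are easier to trust after tracing the feature-map sizes by hand. The sketch below does that in plain Python for the usual FCN-8s layout (2x, 2x, then 8x transposed convolutions with kernel sizes 4, 4 and 16). It assumes every VGG convolution other than the first is 3x3 with padding 1 and so keeps the size; that detail lives in vgg.py, which is not shown here. The helper names are made up for this example.

```python
import math

def conv_out(size, kernel, stride=1, padding=0):
    """Output size of a convolution."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, kernel=2, stride=2):
    """Output size of pooling with ceil_mode=True."""
    return math.ceil((size - kernel) / stride) + 1

def deconv_out(size, kernel, stride):
    """Output size of a transposed convolution without padding."""
    return (size - 1) * stride + kernel

def trace_fcn8s(input_size):
    sizes = {}
    size = conv_out(input_size, kernel=3, padding=100)  # first conv, padding 100
    for name in ("pool1", "pool2", "pool3", "pool4", "pool5"):
        # Assumption: the remaining 3x3 convs in each stage keep the size
        size = pool_out(size)
        sizes[name] = size
    sizes["fc6"] = conv_out(sizes["pool5"], kernel=7)           # 7x7 conv, no padding
    sizes["up_output"] = deconv_out(sizes["fc6"], kernel=4, stride=2)
    sizes["up_pool4"] = deconv_out(sizes["up_output"], kernel=4, stride=2)
    sizes["up_final"] = deconv_out(sizes["up_pool4"], kernel=16, stride=8)
    # Each crop must fit inside the map it is taken from
    sizes["crops_fit"] = (
        5 + sizes["up_output"] <= sizes["pool4"]
        and 9 + sizes["up_pool4"] <= sizes["pool3"]
        and 31 + input_size <= sizes["up_final"]
    )
    return sizes

sizes = trace_fcn8s(512)
for name, value in sizes.items():
    print(name, value)
```

For a 512x512 input this gives 89 at pool3, 45 at pool4, 17 after fc6 and 600 after the last upsampling, so all three crops stay inside their maps and the cropped output is 512x512 again.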
IV. Course Link
Course link: https://aistudio.baidu.com/aistudio/course/introduce/1767