Python Algorithms Series (11): Building a ResNet Network to Extract Image Semantic Vectors


References:
1. Introduction to ResNet
2. ResNet50 structure diagram

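Two corrections to the references above:
1. In the table of the introduction article, the average pooling at the bottom should be 7*7; otherwise the output comes out wrong during verification.
2. In the ResNet50 structure diagram, the feature map of the first layer should be 56*56.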
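
Both corrections follow from simple size arithmetic. The snippet below is a minimal sketch, assuming a 224*224 input and "same" padding, so that every stride-2 layer outputs ceil(size / 2). The helper name `same_out` is made up for illustration and is not part of the original code.

```python
import math

def same_out(size, stride):
    """Output length of a conv/pool layer that uses 'same' padding."""
    return math.ceil(size / stride)

size = 224
size = same_out(size, 2)   # first convolution, stride 2
size = same_out(size, 2)   # max pooling, stride 2: input of the first residual stage

stage_sizes = [size]
for _ in range(3):         # each of the three later stages halves the feature map once
    size = same_out(size, 2)
    stage_sizes.append(size)

print(stage_sizes)         # [56, 28, 14, 7]
```

The first residual stage therefore works on 56*56 feature maps, and the last one leaves a 7*7 map, which is why the final average pooling window has to be 7*7 to reduce it to a single vector.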

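ResNet comes in five main variants: resnet18, resnet34, resnet50, resnet101, and resnet152. ResNet50 is the most commonly used, but the other variants are worth trying too. The network below is built following the descriptions in those two blog posts; the code is as follows: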

# -*- coding: utf-8 -*-
# Author  :  ZPH, reproducing Teacher Fang's code
# Created  :  2021/4/10  21:46
# Project Name  :  my_resnet.PY
import tensorflow as tf  # written against the TensorFlow 1.x API (tf.layers, tf.variable_scope)

MODE_RESNET18 = "resnet18"#2,2,2,2
MODE_RESNET34 = "resnet34"#3,4,6,3   building_block

MODE_RESNET50 = "resnet50"#3,4,6,3   bottleneck
MODE_RESNET101 = "resnet101"#3,4,23,3
MODE_RESNET152 = "resnet152"#3,8,36,3
def building_block(input,init_filters,resize,name):
    with tf.variable_scope(name):
        with tf.variable_scope("left"):
            left = tf.layers.conv2d(input,init_filters,3,2 if resize else 1,"same",name="Conv1")
            left = tf.layers.batch_normalization(left,training=True)
            left = tf.nn.relu(left)

            left = tf.layers.conv2d(left,init_filters,3,1,"same",name="Conv2")
            left = tf.layers.batch_normalization(left,training=True)
        with tf.variable_scope("right"):
            if resize or input.shape[3].value != init_filters:
                right = tf.layers.conv2d(input,init_filters,3,2 if resize else 1,"same",name="Conv1")
                right = tf.layers.batch_normalization(right,training=True)
            else:
                right = input

    return tf.nn.relu(left+right)



def bottlenck(input,init_filters,resize,name):
    with tf.variable_scope(name):
        with tf.variable_scope("left"):
            #1*1
            left = tf.layers.conv2d(input, init_filters, 1, 2 if resize else 1, "same", name="Conv1")
            left = tf.layers.batch_normalization(left, training=True)
            left = tf.nn.relu(left)
            #3*3
            left = tf.layers.conv2d(left, init_filters, 3, 1, "same", name="Conv2")
            left = tf.layers.batch_normalization(left, training=True)
            left = tf.nn.relu(left)
            # 1*1; the last layer expands the channel count 4x
            init_filters *= 4
            left = tf.layers.conv2d(left, init_filters, 1, 1, "same", name="Conv3")
            left = tf.layers.batch_normalization(left, training=True)

        with tf.variable_scope("right"):
            if resize or input.shape[3].value != init_filters:
                right = tf.layers.conv2d(input, init_filters, 3, 2 if resize else 1, "same", name="Conv1")
                right = tf.layers.batch_normalization(right, training=True)
            else:
                right = input

    return tf.nn.relu(left + right)

_setting = {
    MODE_RESNET18:((2,2,2,2),building_block),
    MODE_RESNET34:((3,4,6,3),building_block),
    MODE_RESNET50:((3,4,6,3),bottlenck),
    MODE_RESNET101:((3,4,23,3),bottlenck),
    MODE_RESNET152:((3,8,36,3),bottlenck),
}
def _check(input):  # input uses the standard convolution layout [-1,224,224,channel]
    _,height,width,_ = input.shape
    height = height.value
    if height %32 !=0:
        raise Exception("The height of the input must be a multiple of 32")
    width = width.value
    if width %32 !=0:
        raise Exception("The width of the input must be a multiple of 32")
    return height,width

_name_id = 0
def resnet(input,mode,logit_size,name=None):
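    '''
    Extract a feature vector with ResNet.
    :param input: input tensor in the standard convolution layout [-1,224,224,channel]; the sample below uses 224*224 images
    :param mode: one of MODE_RESNET152, MODE_RESNET101, MODE_RESNET50, MODE_RESNET34, MODE_RESNET18
    :param logit_size: length of the output vector, user-defined
    :param name: variable scope name; generated automatically when None
    :return: tensor of shape [-1,logit_size]
    '''
    height,width = _check(input)  # validate the input shape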
    if name is None:
        global _name_id
        name = "resnet_%d"%_name_id
        _name_id +=1

    with tf.variable_scope(name):
        base_size = (height//32,width//32)#(7,7)
        input = tf.layers.conv2d(input,64,base_size,2,"same",activation=tf.nn.relu,name="conv1")
        input = tf.layers.max_pooling2d(input,3,2,"same")#
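        init_filter = 64  # initial number of channels
        module = _setting[mode][1]  # block function for the selected ResNet variant
        for ord,repeats in enumerate(_setting[mode][0]):  # e.g. (3,4,6,3)
            for i in range(repeats):  # build every block of the current stage
                resize = i == 0 and ord != 0  # bool: downsample in the first block of every stage except the first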
                input = module(input,init_filter,resize,"Conv%d_%d"%(ord,i+1))
            init_filter *= 2
        input = tf.layers.average_pooling2d(input,base_size,1,"valid")
        semantics = tf.reshape(input,[-1,input.shape[3].value])
        logit = tf.layers.dense(semantics,logit_size,name="dense")#[-1,logit_size]
        return logit

if __name__ == '__main__':
    a = tf.random_normal([10,224,224,3])  # simulate 10 random samples: 224*224 images with 3 channels
    b = resnet(a,MODE_RESNET50,100)
    print(b.shape)  # the semantic vector should have shape [10,100]
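
The stage repeat counts in `_setting` can be sanity-checked against the variant names. The sketch below rests on the usual way of counting ResNet depth: two convolutions per building block or three per bottleneck, plus the first convolution and the final fully connected layer. The helpers `depth` and `final_channels` are illustrative only and not part of the code above.

```python
# Stage repeat counts and convolutions per block for each variant.
configs = {
    "resnet18":  ((2, 2, 2, 2), 2),    # building_block: two 3x3 convolutions
    "resnet34":  ((3, 4, 6, 3), 2),
    "resnet50":  ((3, 4, 6, 3), 3),    # bottleneck: 1x1, 3x3, 1x1 convolutions
    "resnet101": ((3, 4, 23, 3), 3),
    "resnet152": ((3, 8, 36, 3), 3),
}

def depth(repeats, convs_per_block):
    # +2 accounts for the first convolution and the final dense layer
    return sum(repeats) * convs_per_block + 2

def final_channels(convs_per_block):
    # filters start at 64 and double three times; bottlenecks expand them by 4
    base = 64 * 2 ** 3
    return base * 4 if convs_per_block == 3 else base

depths = {name: depth(r, c) for name, (r, c) in configs.items()}
print(depths)
print(final_channels(3), final_channels(2))   # 2048 512
```

Each count matches its name (18, 34, 50, 101, 152). A (3,4,36,3) configuration for resnet152 would add up to only 140 layers, so the second stage needs 8 blocks. The second print gives the length of the pooled vector that feeds the dense layer: 2048 for the bottleneck variants and 512 for the building-block variants.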

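There is still plenty of room for optimization worth testing, for example:
1. The convolution currently downsamples with a stride of 2. It could instead use a stride of 1 so that no positions are skipped in the computation, with the downsampling moved to max pooling by raising its stride from 2 to 4. (In theory this gives the same overall 4x downsampling as a stride-2 convolution followed by stride-2 max pooling, and the results should be better, but that conclusion needs experimental verification.)
2. The max pooling window could be set to 2*2 or 5*5, and so on; test according to the specific case.
3. In theory any input whose height and width are multiples of 32 should work, though the parameters need adjusting at the final flattening step (this is my guess and also needs verification).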
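
Items 1 and 3 can be checked on paper before running any experiment. The sketch below assumes "same" padding and a 7*7 first convolution from 3 to 64 channels, as for a 224*224 input in the code above. The function `stem` and its parameters are hypothetical names used only for this illustration. Note that the two layouts in item 1 produce the same output size, not the same values, and the stride-1 convolution costs more computation.

```python
import math

def stem(size, conv_stride, pool_stride, kernel=7, c_in=3, c_out=64):
    """Return (output size, multiply-accumulates of the first conv) with 'same' padding."""
    conv_out = math.ceil(size / conv_stride)
    macs = conv_out * conv_out * kernel * kernel * c_in * c_out
    pool_out = math.ceil(conv_out / pool_stride)
    return pool_out, macs

out_a, macs_a = stem(224, conv_stride=2, pool_stride=2)  # layout used in the code above
out_b, macs_b = stem(224, conv_stride=1, pool_stride=4)  # variant proposed in item 1

print(out_a, out_b)      # 56 56
print(macs_b / macs_a)   # 4.0

# Item 3: side length of the final feature map for other multiples of 32
final_sides = {side: side // 32 for side in (224, 256, 320)}
print(final_sides)       # {224: 7, 256: 8, 320: 10}
```

Both layouts hand a 56*56 map to the first residual stage, but the stride-1 first convolution needs four times the multiply-accumulates. For item 3, the final feature map side is simply the input side divided by 32, which is the value the code already uses for `base_size` in the last average pooling.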
