Resnet&API

Author: 人工智能有点

Category: AI之旅. Tags: deep learning, keras

Article link: https://blog.csdn.net/weixin_44417441/article/details/126077235

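ResNet

VGG is a purely sequential network.
ResNet is currently one of the most widely used neural networks.

Residual net (residual network): the output of an earlier layer skips over several layers and is fed directly into a later layer as part of its input.

Residual unit: let the input be x and the desired output be H(x). The input x is passed straight through to the output and added there, so the stacked layers only have to learn F(x) = H(x) - x. The learning target therefore changes: the layers learn F(x), the residual, and the unit outputs F(x) + x.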
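
The relation F(x) = H(x) - x can be illustrated with a minimal pure-Python sketch. The function names (`residual_unit`, `zero_residual`, `small_residual`) are hypothetical and stand in for real layers; the point is only that when the learned residual is zero, the unit reduces to the identity mapping.

```python
def residual_unit(x, residual_fn):
    """Return H(x) = F(x) + x, where F is the residual branch."""
    fx = residual_fn(x)
    return [f + xi for f, xi in zip(fx, x)]


def zero_residual(x):
    """A residual branch that has learned nothing: F(x) = 0."""
    return [0.0 for _ in x]


def small_residual(x):
    """A toy residual branch: F(x) = 0.5 * x."""
    return [0.5 * xi for xi in x]


x = [1.0, -2.0, 3.0]
identity_out = residual_unit(x, zero_residual)   # H(x) == x
scaled_out = residual_unit(x, small_residual)    # H(x) == 1.5 * x
print(identity_out, scaled_out)
```

With a zero residual the input passes through unchanged, which is why the shortcut makes an identity mapping easy to represent.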

[Figure: residual unit]

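Traditional networks such as VGG and AlexNet are sequential, and a sequential network inevitably loses information from layer to layer. Most of the discarded information is useless, but useful information may also be lost (for example, when h and w are reduced). By passing the input through to the output, ResNet alleviates this information loss to some extent.

The residual structure that alleviates this problem is called a shortcut or skip connection, as shown in the figure below. When the convolutions in the sequential branch on the left downsample, the shortcut on the right must downsample to the same shape, because the outputs of the two branches are added together.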

[Figure: sequential convolution branch on the left, shortcut branch on the right]

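ResNet50

ResNet has two basic blocks: the Conv Block and the Identity Block. A Conv Block's input and output dimensions differ, so identical Conv Blocks cannot be chained one after another. An Identity Block's input and output dimensions are the same, so Identity Blocks can be chained. In practice, one Conv Block followed by several Identity Blocks extracts features while reducing information loss.

In a Conv Block, the left side is a sequential branch that extracts features. It is combined with the right side (the input x passed through a 1x1 convolution and downsampled where needed, so that the two can be added) to form the residual structure.

In an Identity Block, only the left side is a sequential branch; the right side has no layers and passes the input through unchanged. This guarantees that the input and output have the same dimensions.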
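
The chaining rule for the two blocks can be sketched by tracking shapes only. This is a hedged illustration rather than the Keras implementation: `conv_block_shape` and `identity_block_shape` are hypothetical helpers that assume channels_last shapes (h, w, c) and a 1x1 strided convolution without padding at the entry of the Conv Block.

```python
def conv_block_shape(shape, filters, stride=2):
    """Output shape of a Conv Block: spatial size may shrink, channels change."""
    h, w, _ = shape
    out_h = (h - 1) // stride + 1   # 1x1 convolution, no padding
    out_w = (w - 1) // stride + 1
    return (out_h, out_w, filters[2])


def identity_block_shape(shape, filters):
    """Output shape of an Identity Block: the input must already match."""
    if shape[2] != filters[2]:
        raise ValueError("identity shortcut needs %d channels, got %d"
                         % (filters[2], shape[2]))
    return shape


shape = (55, 55, 64)
shape = conv_block_shape(shape, (64, 64, 256), stride=1)
for _ in range(2):                              # Identity Blocks chain freely
    shape = identity_block_shape(shape, (64, 64, 256))
stage2_shape = shape
stage3_shape = conv_block_shape(stage2_shape, (128, 128, 512))
print(stage2_shape, stage3_shape)
```

Feeding a 64-channel tensor straight into an Identity Block with 256 output channels fails the check, which is the reason each stage opens with a Conv Block.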

[Figure: Conv Block and Identity Block]

ResNet50 code

Input

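Input() is used to instantiate a Keras tensor.

A Keras tensor is a tensor object from the underlying backend (Theano, TensorFlow, or CNTK), augmented with certain attributes that allow a Keras model to be built just by knowing the inputs and outputs of the model.

For example, if a, b and c are Keras tensors, the following is possible: model = Model(inputs=[a, b], outputs=c)

The added Keras attributes are:

_keras_shape: integer shape tuple propagated via Keras-side shape inference.
_keras_history: the last layer applied to the tensor. The entire layer graph can be retrieved recursively from that layer.

Parameters

shape: a shape tuple (integers), not including the batch size. For example, shape=(32,) indicates that the expected input will be batches of 32-dimensional vectors.
batch_shape: a shape tuple (integers), including the batch size. For example, batch_shape=(10, 32) indicates that the expected input will be batches of 10 32-dimensional vectors. batch_shape=(None, 32) indicates batches of an arbitrary number of 32-dimensional vectors.
name: an optional name string for the layer. It should be unique in a model (do not reuse the same name twice). It will be autogenerated if not provided.
dtype: the data type expected by the input, as a string (float32, float64, int32...).
sparse: a boolean specifying whether the placeholder to be created is sparse.
tensor: an optional existing tensor to wrap into the Input layer. If set, the layer will not create a placeholder tensor.

Returns

A tensor.

Example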

from keras.layers import Input
from keras.layers import Dense
from keras.models import Model
x = Input(shape=(32,))
y = Dense(16, activation='softmax')(x)
model = Model(x, y)

ZeroPadding2D

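Zero-padding layer for 2D input (e.g. images).

This layer can add rows and columns of zeros at the top, bottom, left and right side of an image tensor.

Parameters

padding: an integer, or a tuple of 2 integers, or a tuple of 2 tuples of 2 integers.
If an integer: the same symmetric zero padding is applied to the top, bottom, left and right.
If a tuple of 2 integers: the first integer is the symmetric padding for top and bottom; the second integer is the symmetric padding for left and right.
If a tuple of 2 tuples of 2 integers: interpreted as ((top_pad, bottom_pad), (left_pad, right_pad)).
data_format: a string, one of channels_last (default) or channels_first, giving the ordering of the dimensions in the input. channels_last corresponds to inputs with shape (batch, height, width, channels), while channels_first corresponds to inputs with shape (batch, channels, height, width). It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json. If you never set it, it will be "channels_last".

Input shape

If data_format is "channels_last": 4D tensor with shape (batch, rows, cols, channels).
If data_format is "channels_first": 4D tensor with shape (batch, channels, rows, cols).

Output shape

If data_format is "channels_last": 4D tensor with shape (batch, padded_rows, padded_cols, channels).
If data_format is "channels_first": 4D tensor with shape (batch, channels, padded_rows, padded_cols).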
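
The three forms of the padding argument can be mimicked on a single-channel image stored as a list of lists. This is a rough sketch of the behaviour described above, not Keras code; `zero_pad_2d` is a hypothetical helper.

```python
def zero_pad_2d(image, padding):
    """Zero-pad a 2D list. padding is an int, (pad_h, pad_w),
    or ((top, bottom), (left, right))."""
    if isinstance(padding, int):
        top = bottom = left = right = padding
    elif isinstance(padding[0], int):
        top = bottom = padding[0]
        left = right = padding[1]
    else:
        (top, bottom), (left, right) = padding
    width = len(image[0]) + left + right
    padded = [[0] * width for _ in range(top)]
    for row in image:
        padded.append([0] * left + list(row) + [0] * right)
    padded.extend([0] * width for _ in range(bottom))
    return padded


image = [[1, 2], [3, 4]]
symmetric = zero_pad_2d(image, 1)
asymmetric = zero_pad_2d(image, ((1, 0), (0, 2)))
big = zero_pad_2d([[0] * 224 for _ in range(224)], (3, 3))
print(symmetric)
print(asymmetric)
print(len(big), len(big[0]))
```

The last call corresponds to padding a 224x224 image by 3 on every side, which gives the 230x230 size that appears at the start of ResNet50.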

Conv2D

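2D convolution layer (e.g. spatial convolution over images).

This layer creates a convolution kernel that is convolved with the layer input to produce a tensor of outputs. If use_bias is True, a bias vector is created and added to the outputs. Finally, if activation is not None, it is applied to the outputs as well.

When using this layer as the first layer in a model, provide the input_shape argument (a tuple of integers that does not include the sample axis), e.g. input_shape=(128, 128, 3) for 128x128 RGB images with data_format="channels_last".

Parameters

filters: integer, the dimensionality of the output space (i.e. the number of output filters in the convolution).
kernel_size: an integer, or a tuple or list of 2 integers, specifying the height and width of the 2D convolution window. Can be a single integer to specify the same value for all spatial dimensions.
strides: an integer, or a tuple or list of 2 integers, specifying the strides of the convolution along the height and width. Can be a single integer to specify the same value for all spatial dimensions. Specifying any stride value != 1 is incompatible with specifying any dilation_rate value != 1.
padding: "valid" or "same" (case-sensitive).
data_format: a string, one of channels_last (default) or channels_first, giving the ordering of the dimensions in the input. channels_last corresponds to inputs with shape (batch, height, width, channels), while channels_first corresponds to inputs with shape (batch, channels, height, width). It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json. If you never set it, it will be channels_last.
dilation_rate: an integer, or a tuple or list of 2 integers, specifying the dilation rate to use for dilated convolution. Can be a single integer to specify the same value for all spatial dimensions. Currently, specifying any dilation_rate value != 1 is incompatible with specifying any stride value != 1.
activation: the activation function to use (see activations). If you do not specify anything, no activation is applied (i.e. linear activation: a(x) = x).
use_bias: boolean, whether the layer uses a bias vector.
kernel_initializer: initializer for the kernel weights matrix (see initializers).
bias_initializer: initializer for the bias vector (see initializers).
kernel_regularizer: regularizer function applied to the kernel weights matrix (see regularizer).
bias_regularizer: regularizer function applied to the bias vector (see regularizer).
activity_regularizer: regularizer function applied to the output of the layer, i.e. its activation (see regularizer).
kernel_constraint: constraint function applied to the kernel weights matrix (see constraints).
bias_constraint: constraint function applied to the bias vector (see constraints).

Input shape

If data_format="channels_first": 4D tensor with shape (samples, channels, rows, cols).
If data_format="channels_last": 4D tensor with shape (samples, rows, cols, channels).

Output shape

If data_format="channels_first": 4D tensor with shape (samples, filters, new_rows, new_cols).
If data_format="channels_last": 4D tensor with shape (samples, new_rows, new_cols, filters).
The rows and cols values may have changed due to padding.

Code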
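
How rows and cols change can be worked out with the usual output-size arithmetic for "valid" and "same" padding. The helper below is a sketch under that standard convention; `conv_output_size` is a hypothetical name, not a Keras function.

```python
import math


def conv_output_size(size, kernel, stride=1, padding="valid", dilation=1):
    """Length of one spatial axis after a convolution."""
    effective_kernel = (kernel - 1) * dilation + 1
    if padding == "same":
        return math.ceil(size / stride)
    if padding == "valid":
        return (size - effective_kernel) // stride + 1
    raise ValueError("padding must be 'valid' or 'same'")


stem = conv_output_size(230, 7, stride=2)               # 7x7, stride 2, no padding
same_3x3 = conv_output_size(128, 3, padding="same")     # size preserved
valid_3x3 = conv_output_size(128, 3, padding="valid")   # loses kernel - 1
print(stem, same_3x3, valid_3x3)
```

The first call is the 7x7, stride-2 convolution applied to a 230x230 padded image in ResNet50.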

from keras.layers import Conv2D

BatchNormalization

from keras.layers import BatchNormalization

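Batch normalization layer (Ioffe and Szegedy, 2015).

Normalizes the activations of the previous layer at each batch, i.e. applies a transformation that keeps the mean activation close to 0 and the activation standard deviation close to 1.

Parameters

axis: integer, the axis that should be normalized (typically the features axis). For instance, after a Conv2D layer with data_format="channels_first", set axis=1 in BatchNormalization.
momentum: momentum for the moving mean and the moving variance.
epsilon: small float added to the variance to avoid dividing by zero.
center: if True, add the offset beta to the normalized tensor. If False, beta is ignored.
scale: if True, multiply by gamma. If False, gamma is not used. When the next layer is linear (also e.g. nn.relu), this can be disabled since the scaling will be done by the next layer.
beta_initializer: initializer for the beta weight.
gamma_initializer: initializer for the gamma weight.
moving_mean_initializer: initializer for the moving mean.
moving_variance_initializer: initializer for the moving variance.
beta_regularizer: optional regularizer for the beta weight.
gamma_regularizer: optional regularizer for the gamma weight.
beta_constraint: optional constraint for the beta weight.
gamma_constraint: optional constraint for the gamma weight.

Input shape

Arbitrary. When using this layer as the first layer in a model, specify the input_shape argument (a tuple of integers that does not include the samples axis).

Output shape

Same as the input.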
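
The normalization step can be sketched for a single feature over one batch. This is an illustrative training-time computation only; at inference the layer uses the moving mean and variance instead of the batch statistics. The name `batch_normalize` and the default values chosen here are assumptions for the example.

```python
import math


def batch_normalize(values, gamma=1.0, beta=0.0, epsilon=1e-3):
    """Normalize one feature across a batch, then scale by gamma and shift by beta."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    scale = gamma / math.sqrt(variance + epsilon)   # epsilon guards against division by zero
    return [(v - mean) * scale + beta for v in values]


batch = [2.0, 4.0, 6.0, 8.0]
normalized = batch_normalize(batch)
norm_mean = sum(normalized) / len(normalized)
norm_var = sum((v - norm_mean) ** 2 for v in normalized) / len(normalized)
print(normalized, norm_mean, norm_var)
```

The result has a mean of about 0 and a variance slightly below 1, the small gap coming from epsilon.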

Activation

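Applies an activation function to an output.

Parameters

activation: name of the activation function to use (see activations), or alternatively a Theano or TensorFlow operation.

Input shape

Arbitrary. When using this layer as the first layer in a model, use the input_shape argument (a tuple of integers that does not include the samples axis).

Output shape

Same as the input.

Code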

from keras.layers import Activation
x = Activation('relu')(x)

MaxPooling2D

from keras.layers import MaxPooling2D

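Max pooling operation for spatial data.

Parameters

pool_size: an integer or a tuple of 2 integers, the factors by which to downscale (vertical, horizontal). (2, 2) will halve the input in both spatial dimensions. If only one integer is specified, the same window length is used for both dimensions.
strides: an integer, a tuple of 2 integers, or None. The stride values. If None, it defaults to pool_size.
padding: "valid" or "same" (case-sensitive).
data_format: a string, one of channels_last (default) or channels_first, giving the ordering of the dimensions in the input. channels_last corresponds to inputs with shape (batch, height, width, channels), while channels_first corresponds to inputs with shape (batch, channels, height, width). It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json. If you never set it, it will be "channels_last".

Input shape

If data_format="channels_last": 4D tensor with shape (batch_size, rows, cols, channels).
If data_format="channels_first": 4D tensor with shape (batch_size, channels, rows, cols).

Output shape

If data_format="channels_last": 4D tensor with shape (batch_size, pooled_rows, pooled_cols, channels).
If data_format="channels_first": 4D tensor with shape (batch_size, channels, pooled_rows, pooled_cols).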
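
To make pool_size and strides concrete, here is a small pure-Python sketch of max pooling with "valid" padding on a single-channel image. `max_pool_2d` is a hypothetical helper written for this illustration, not the Keras layer.

```python
def max_pool_2d(image, pool_size=2, stride=None):
    """Max pooling over a 2D list with 'valid' padding."""
    if stride is None:
        stride = pool_size              # same default as described above
    rows, cols = len(image), len(image[0])
    pooled = []
    for r in range(0, rows - pool_size + 1, stride):
        pooled_row = []
        for c in range(0, cols - pool_size + 1, stride):
            window = [image[r + i][c + j]
                      for i in range(pool_size)
                      for j in range(pool_size)]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


image = [[1, 3, 2, 4],
         [5, 6, 1, 2],
         [7, 2, 9, 0],
         [1, 8, 3, 4]]
pooled = max_pool_2d(image)                                   # 2x2 window, stride 2
stem = max_pool_2d([[0] * 112 for _ in range(112)], 3, 2)     # 3x3 window, stride 2
print(pooled, len(stem), len(stem[0]))
```

The second call uses the same window and stride as the pooling layer that follows the first convolution in ResNet50, reducing 112x112 to 55x55.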

AveragePooling2D

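Average pooling operation for spatial data.

Parameters

pool_size: an integer or a tuple of 2 integers, the factors by which to downscale (vertical, horizontal). (2, 2) will halve the input in both spatial dimensions. If only one integer is specified, the same window length is used for both dimensions.
strides: an integer, a tuple of 2 integers, or None. The stride values. If None, it defaults to pool_size.
padding: "valid" or "same" (case-sensitive).
data_format: a string, one of channels_last (default) or channels_first, giving the ordering of the dimensions in the input. channels_last corresponds to inputs with shape (batch, height, width, channels), while channels_first corresponds to inputs with shape (batch, channels, height, width). It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json. If you never set it, it will be "channels_last".

Input shape

If data_format="channels_last": 4D tensor with shape (batch_size, rows, cols, channels).
If data_format="channels_first": 4D tensor with shape (batch_size, channels, rows, cols).

Output shape

If data_format="channels_last": 4D tensor with shape (batch_size, pooled_rows, pooled_cols, channels).
If data_format="channels_first": 4D tensor with shape (batch_size, channels, pooled_rows, pooled_cols).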

Flatten

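Flattens the input. Does not affect the batch size.

Parameters

data_format: a string, one of channels_last (default) or channels_first, giving the ordering of the dimensions in the input. The purpose of this argument is to preserve weight ordering when switching a model from one data format to another. channels_last corresponds to inputs with shape (batch, ..., channels), while channels_first corresponds to inputs with shape (batch, channels, ...). It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json. If you never set it, it will be channels_last.

Example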

from keras.layers import Conv2D
from keras.layers import Flatten
from keras.models import Sequential

model = Sequential()
model.add(Conv2D(64, (3, 3),
                 input_shape=(32, 32, 3), padding='same'))
# now: model.output_shape == (None, 32, 32, 64) with the default channels_last

model.add(Flatten())
# now: model.output_shape == (None, 65536)
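
The flattened size is just the product of all non-batch dimensions, so the 65536 above can be checked by hand. A minimal sketch, with `flatten_shape` as a hypothetical helper:

```python
from functools import reduce
from operator import mul


def flatten_shape(shape):
    """Collapse every axis except the batch axis into one."""
    batch, *rest = shape
    return (batch, reduce(mul, rest, 1))


channels_last = flatten_shape((None, 32, 32, 64))
channels_first = flatten_shape((None, 64, 32, 32))
print(channels_last, channels_first)
```

Both dimension orderings give the same flattened length for a 32x32 feature map with 64 channels; only the order of the elements differs, which is what the data_format argument is meant to keep consistent.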

Dense

from keras.layers import Dense

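The regular fully connected layer.

Dense implements the operation: output = activation(dot(input, kernel) + bias), where activation is the element-wise activation function, kernel is a weights matrix created by the layer, and bias is a bias vector created by the layer (only applicable if use_bias is True).

Note: if the input to the layer has a rank greater than 2, Dense computes the dot product with kernel along the last axis of the input; the input is not flattened first, which is consistent with the output shape given below.

Example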

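from keras.models import Sequential

# as the first layer in a Sequential model
model = Sequential()
model.add(Dense(32, input_shape=(16,)))
# the model now takes arrays of shape (*, 16) as input
# and outputs arrays of shape (*, 32)

# after the first layer, the input shape no longer needs to be specified:
model.add(Dense(32))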

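Parameters

units: positive integer, the dimensionality of the output space.
activation: the activation function to use (see activations). If you do not specify anything, no activation is applied (i.e. "linear" activation: a(x) = x).
use_bias: boolean, whether the layer uses a bias vector.
kernel_initializer: initializer for the kernel weights matrix (see initializers).
bias_initializer: initializer for the bias vector (see initializers).
kernel_regularizer: regularizer function applied to the kernel weights matrix (see regularizer).
bias_regularizer: regularizer function applied to the bias vector (see regularizer).
activity_regularizer: regularizer function applied to the output of the layer, i.e. its activation (see regularizer).
kernel_constraint: constraint function applied to the kernel weights matrix (see constraints).
bias_constraint: constraint function applied to the bias vector (see constraints).

Input shape

nD tensor with shape (batch_size, ..., input_dim). The most common situation is a 2D input with shape (batch_size, input_dim).

Output shape

nD tensor with shape (batch_size, ..., units). For instance, for a 2D input with shape (batch_size, input_dim), the output has shape (batch_size, units).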
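
The operation output = activation(dot(input, kernel) + bias) can be written out for a single sample in plain Python. The helper names and the toy weights below are assumptions made for the illustration.

```python
import math


def dense(inputs, kernel, bias, activation=None):
    """inputs: input_dim floats; kernel: input_dim x units; bias: units floats."""
    units = len(bias)
    outputs = [sum(x * kernel[i][j] for i, x in enumerate(inputs)) + bias[j]
               for j in range(units)]
    return activation(outputs) if activation else outputs


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


x = [1.0, 2.0]                                   # input_dim = 2
kernel = [[1.0, 0.0, -1.0], [0.5, 1.0, 0.0]]     # 2 x 3, so units = 3
bias = [0.0, 1.0, 0.5]
linear = dense(x, kernel, bias)                  # no activation: a(x) = x
probs = dense(x, kernel, bias, activation=softmax)
print(linear, probs)
```

With softmax as the activation the three outputs sum to 1, which is how the final Dense layer of a classifier turns scores into class probabilities.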

Model

In the functional API, given some input tensors and output tensors, a Model can be instantiated as follows:

from keras.models import Model
from keras.layers import Input, Dense

a = Input(shape=(32,))
b = Dense(32)(a)
model = Model(inputs=a, outputs=b)

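This model will include all layers required in the computation of b given a.

In the case of multi-input or multi-output models, you can use lists as well:

model = Model(inputs=[a1, a2], outputs=[b1, b2, b3])

For a detailed introduction to Model, read the guide to the Keras functional API.

Code

Load the pretrained model directly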

# -------------------------------------------------------------#
#   ResNet50 network definition
# -------------------------------------------------------------#
from __future__ import print_function

import numpy as np
from keras import layers

from keras.layers import Input
from keras.layers import Dense, Conv2D, MaxPooling2D, ZeroPadding2D, AveragePooling2D
from keras.layers import Activation, BatchNormalization, Flatten
from keras.models import Model

from keras.preprocessing import image
from keras.applications.imagenet_utils import decode_predictions
from keras.applications.imagenet_utils import preprocess_input


def identity_block(input_tensor, kernel_size, filters, stage, block):
    filters1, filters2, filters3 = filters

    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'

    x = Conv2D(filters1, (1, 1), name=conv_name_base + '2a')(input_tensor)
    x = BatchNormalization(name=bn_name_base + '2a')(x)
    x = Activation('relu')(x)

    x = Conv2D(filters2, kernel_size, padding='same', name=conv_name_base + '2b')(x)

    x = BatchNormalization(name=bn_name_base + '2b')(x)
    x = Activation('relu')(x)

    x = Conv2D(filters3, (1, 1), name=conv_name_base + '2c')(x)
    x = BatchNormalization(name=bn_name_base + '2c')(x)

    x = layers.add([x, input_tensor])
    x = Activation('relu')(x)
    return x


def conv_block(input_tensor, kernel_size, filters, stage, block, strides=(2, 2)):
    filters1, filters2, filters3 = filters

    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'

    x = Conv2D(filters1, (1, 1), strides=strides, name=conv_name_base + '2a')(input_tensor)
    x = BatchNormalization(name=bn_name_base + '2a')(x)
    x = Activation('relu')(x)

    x = Conv2D(filters2, kernel_size, padding='same', name=conv_name_base + '2b')(x)
    x = BatchNormalization(name=bn_name_base + '2b')(x)
    x = Activation('relu')(x)

    x = Conv2D(filters3, (1, 1), name=conv_name_base + '2c')(x)
    x = BatchNormalization(name=bn_name_base + '2c')(x)

    shortcut = Conv2D(filters3, (1, 1), strides=strides, name=conv_name_base + '1')(input_tensor)
    shortcut = BatchNormalization(name=bn_name_base + '1')(shortcut)

    x = layers.add([x, shortcut])       # element-wise sum of the tensors in the list
    x = Activation('relu')(x)
    return x


def ResNet50(input_shape=(224, 224, 3), classes=1000):
    img_input = Input(shape=input_shape)        # instantiate a Keras tensor; shape excludes the batch dimension
    x = ZeroPadding2D((3, 3))(img_input)        # pad 3 rows top/bottom and 3 columns left/right, x==>(230,230,3)

    x = Conv2D(64, (7, 7), strides=(2, 2), name='conv1')(x)     # x==>(112,112,64)  floor((230-7)/2)+1=112
    x = BatchNormalization(name='bn_conv1')(x)
    x = Activation('relu')(x)       # ReLU activation
    x = MaxPooling2D((3, 3), strides=(2, 2))(x)     # x==>(55,55,64)   floor((112-3)/2)+1=55

    x = conv_block(x, 3, [64, 64, 256], stage=2, block='a', strides=(1, 1))
    x = identity_block(x, 3, [64, 64, 256], stage=2, block='b')
    x = identity_block(x, 3, [64, 64, 256], stage=2, block='c')

    x = conv_block(x, 3, [128, 128, 512], stage=3, block='a')
    x = identity_block(x, 3, [128, 128, 512], stage=3, block='b')
    x = identity_block(x, 3, [128, 128, 512], stage=3, block='c')
    x = identity_block(x, 3, [128, 128, 512], stage=3, block='d')

    x = conv_block(x, 3, [256, 256, 1024], stage=4, block='a')
    x = identity_block(x, 3, [256, 256, 1024], stage=4, block='b')
    x = identity_block(x, 3, [256, 256, 1024], stage=4, block='c')
    x = identity_block(x, 3, [256, 256, 1024], stage=4, block='d')
    x = identity_block(x, 3, [256, 256, 1024], stage=4, block='e')
    x = identity_block(x, 3, [256, 256, 1024], stage=4, block='f')

    x = conv_block(x, 3, [512, 512, 2048], stage=5, block='a')
    x = identity_block(x, 3, [512, 512, 2048], stage=5, block='b')
    x = identity_block(x, 3, [512, 512, 2048], stage=5, block='c')

    x = AveragePooling2D((7, 7), name='avg_pool')(x)

    x = Flatten()(x)
    x = Dense(classes, activation='softmax', name='fc1000')(x)

    model = Model(img_input, x, name='resnet50')

    # expects the pretrained weights file to be present in the working directory
    model.load_weights("resnet50_weights_tf_dim_ordering_tf_kernels.h5")

    return model


if __name__ == '__main__':
    model = ResNet50()
    model.summary()
    img_path = 'elephant.jpg'
    # img_path = 'bike.jpg'
    img = image.load_img(img_path, target_size=(224, 224))
    print(img)      # <PIL.Image.Image image mode=RGB size=224x224 at 0x2BDEE51E320>
    print(type(img))        # <class 'PIL.Image.Image'>
    print(img.format, img.size, img.mode)       # None (224, 224) RGB
    x = image.img_to_array(img)


    print(type(x))      # <class 'numpy.ndarray'>
    print(x.shape)      ## (224, 224, 3)
    x = np.expand_dims(x, axis=0)
    print(x.shape)      ## (1, 224, 224, 3)
    x = preprocess_input(x)
    # print(x)

    print('Input image shape:', x.shape)
    preds = model.predict(x)
    # print("preds:", preds)      # (None,1000)
    print(preds.size, preds.shape)      # 1000 (1, 1000)
    print(type(preds))      # <class 'numpy.ndarray'>
    print('Predicted:', decode_predictions(preds))      # decode the probabilities into the top (class_name, class_description, score) entries; the VGG example defines a similar helper
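
The shapes produced by the network above can be checked without Keras by repeating the output-size arithmetic stage by stage. The sketch below is a hedged trace, not part of the original script: `valid_size` and `trace_resnet50` are hypothetical helpers, and the stage table simply restates the filters, strides and block counts used in ResNet50() above.

```python
def valid_size(size, kernel, stride):
    """Spatial size after a convolution or pooling with 'valid' padding."""
    return (size - kernel) // stride + 1


# (stage, output channels, stride of the Conv Block, number of blocks)
STAGES = [(2, 256, 1, 3), (3, 512, 2, 4), (4, 1024, 2, 6), (5, 2048, 2, 3)]


def trace_resnet50(input_size=224):
    """Return (name, spatial size, channels) after each major step."""
    trace = []
    size = input_size + 2 * 3                    # ZeroPadding2D((3, 3))
    trace.append(("zero_pad", size, 3))
    size = valid_size(size, 7, 2)                # conv1: 7x7, stride 2
    trace.append(("conv1", size, 64))
    size = valid_size(size, 3, 2)                # max pool: 3x3, stride 2
    trace.append(("max_pool", size, 64))
    for stage, channels, stride, _ in STAGES:
        size = valid_size(size, 1, stride)       # strided 1x1 conv in the Conv Block
        trace.append(("stage%d" % stage, size, channels))
    size = valid_size(size, 7, 7)                # AveragePooling2D((7, 7))
    trace.append(("avg_pool", size, 2048))
    return trace


trace = trace_resnet50()
sizes = [size for _, size, _ in trace]
# conv1 + three convolutions per block + the final Dense layer
weighted_layers = 1 + 3 * sum(blocks for _, _, _, blocks in STAGES) + 1
print(trace)
print(weighted_layers)
```

The count of weighted layers follows the usual convention of leaving out the 1x1 convolutions on the shortcut branches, which is where the 50 in the name comes from.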

Related link: https://keras.io/zh/