FLOPs measure the number of operations in a forward pass of a neural network; the fewer the FLOPs, the faster the computation.
The parameter count measures how many parameters the network contains; the fewer the parameters, the smaller the model and the easier it is to deploy. The two quantities are computed as follows:
The parameter count is the easier of the two: simply count how many trainable parameters the network has. Taking a convolutional network as an example, suppose the kernel size is k × k, the input has i channels, the output has o channels, and the output feature map is t × t. Then:
Each convolution layer requires the following number of parameters:
k × k × i × o + o
where the trailing + o accounts for the biases. See conv1 in the figure below as an example:
Note that conv1 uses 11 × 11 kernels; you can verify the same result yourself.
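As a concrete check, the formula above can be evaluated in a couple of lines of Python. The numbers below assume an AlexNet-style conv1 (11 × 11 kernels, 3 input channels, 96 output channels); adjust them to match the layer in the figure if it differs.

```python
# Parameters of a conv layer: k*k*i*o weights plus o biases
k, i, o = 11, 3, 96  # assumed AlexNet-style conv1

params = k * k * i * o + o
print(params)  # 34944
```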
For FLOPs, again taking convolution as an example, the computation can be carried out as in the following code:
```python
input_shape = (3, 300, 300)  # Format: (channels, rows, cols)
conv_filter = (64, 3, 3, 3)  # Format: (num_filters, channels, rows, cols)
stride = 1
padding = 1
activation = 'relu'

n = conv_filter[1] * conv_filter[2] * conv_filter[3]  # length of one dot product
flops_per_instance = n + (n - 1)  # n multiplications and n - 1 additions per output element

num_instances_per_filter = ((input_shape[1] - conv_filter[2] + 2 * padding) // stride) + 1  # rows
num_instances_per_filter *= ((input_shape[2] - conv_filter[3] + 2 * padding) // stride) + 1  # multiply by cols

flops_per_filter = num_instances_per_filter * flops_per_instance
total_flops_per_layer = flops_per_filter * conv_filter[0]  # multiply by the number of filters

if activation == 'relu':
    # ReLU takes one comparison per element; count one FLOP
    # per element of the output feature map
    total_flops_per_layer += conv_filter[0] * input_shape[1] * input_shape[2]

print(total_flops_per_layer)
```
The code above is equivalent to the following closed-form expression. For each convolution layer, the number of operations is:
k × k × i × t × t × o + (k × k × i − 1) × t × t × o
where the first term counts multiplications and the second counts additions.
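To confirm the equivalence, here is a quick sketch using the same numbers as the code above (k = 3, i = 3, o = 64; with stride 1 and padding 1 the output stays at t = 300), comparing the closed form against the step-by-step computation, with the ReLU term excluded from both:

```python
k, i, o, t = 3, 3, 64, 300  # kernel size, in-channels, out-channels, output size

# Closed form: multiplications + additions over all t*t*o output elements
closed_form = k * k * i * t * t * o + (k * k * i - 1) * t * t * o

# Step-by-step form from the code above (without the ReLU term)
n = k * k * i
stepwise = (n + (n - 1)) * t * t * o

print(closed_form, stepwise)  # both 305280000
```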