A Simple NumPy Implementation of TensorFlow's conv2d

This post shows what tf.nn.conv2d in TensorFlow actually computes.
First, the definition of conv2d:

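conv2d(
    input, # a 4-D tensor; data_format defines what each dimension means
    filter, # the kernel, a 4-D tensor laid out as [filter_height, filter_width, in_channels, out_channels]
    strides, # step size per move along each input dimension, a list of 4 ints; for NHWC: [1, stride_y, stride_x, 1] => [1, vertical stride, horizontal stride, 1]
    padding, # either "SAME" or "VALID"; "VALID" adds no padding, so the kernel only slides inside the image; "SAME" zero-pads the borders so the kernel can also extend past the edges of the image
    use_cudnn_on_gpu=True, 
    data_format='NHWC', # layout of input; the default NHWC means [batch, height, width, channels]
    dilations=[1, 1, 1, 1], # introduced after 1.5; ignored here
    name=None 
)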

The output of conv2d is a 4-D tensor whose dimensions are [batch, height_after, width_after, out_channels].
height_after and width_after are related to the input size as follows:

| padding | height_after | width_after |
| --- | --- | --- |
| SAME | $\lceil \mathit{height} / \mathit{stride}_y \rceil$ | $\lceil \mathit{width} / \mathit{stride}_x \rceil$ |
| VALID | $\lfloor (\mathit{height} - \mathit{filter\_height}) / \mathit{stride}_y \rfloor + 1$ | $\lfloor (\mathit{width} - \mathit{filter\_width}) / \mathit{stride}_x \rfloor + 1$ |
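The size rules can be checked with a few lines of Python. This is a minimal sketch, assuming no dilation and the rounding given in the TensorFlow documentation (ceiling for SAME, floor division for VALID); the helper name `conv_output_size` is hypothetical and not part of TensorFlow.

```python
import math

def conv_output_size(in_size, filter_size, stride, padding):
    """Output length along one spatial axis (hypothetical helper, no dilation)."""
    if padding == "SAME":
        # SAME: the input is zero-padded so only the stride shrinks the output.
        return math.ceil(in_size / stride)
    if padding == "VALID":
        # VALID: the kernel must fit entirely inside the input.
        return (in_size - filter_size) // stride + 1
    raise ValueError("padding must be 'SAME' or 'VALID'")

# A 5x5 input with a 3x3 kernel and stride 2:
print(conv_output_size(5, 3, 2, "VALID"))  # 2
print(conv_output_size(5, 3, 2, "SAME"))   # 3
```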

The simple implementation in this post only handles padding=VALID, and many other cases are not covered.
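For readers who want SAME as well, one option is to zero-pad the input first and then run a VALID-only convolution on the result. The following is a sketch under assumptions, not part of the original implementation: the helper name `pad_same` is hypothetical, the layout is NHWC, and the padding is split the way the TensorFlow documentation describes (when the total is odd, the extra row or column goes to the bottom or right).

```python
import numpy as np

def pad_same(x, filter_shape, strides):
    """Zero-pad an NHWC array so that a VALID convolution yields the SAME output size."""
    x = np.asarray(x)
    _, in_h, in_w, _ = x.shape
    f_h, f_w = filter_shape[0], filter_shape[1]
    s_h, s_w = strides[1], strides[2]
    # SAME output size is ceil(in / stride); -(-a // b) is integer ceiling division.
    out_h = -(-in_h // s_h)
    out_w = -(-in_w // s_w)
    # Total padding needed so that the last window still fits.
    pad_h = max((out_h - 1) * s_h + f_h - in_h, 0)
    pad_w = max((out_w - 1) * s_w + f_w - in_w, 0)
    top, left = pad_h // 2, pad_w // 2
    return np.pad(
        x,
        ((0, 0), (top, pad_h - top), (left, pad_w - left), (0, 0)),
        mode="constant",
    )

padded = pad_same(np.ones((1, 5, 5, 3)), (3, 3, 3, 1), [1, 2, 2, 1])
print(padded.shape)  # (1, 7, 7, 3)
```

A VALID convolution over the 7x7 padded input with a 3x3 kernel and stride 2 then gives a 3x3 output, which matches ceil(5 / 2).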
The computation is roughly:
$$\mathrm{output}[b, i, j, k] = \sum_{di,\, dj,\, q} \mathrm{input}[b,\ \mathrm{strides}[1] \cdot i + di,\ \mathrm{strides}[2] \cdot j + dj,\ q] \cdot \mathrm{filter}[di, dj, q, k]$$
where

| Symbol | Meaning |
| --- | --- |
| b | index of the sample within the batch |
| i | row index in the output (height) |
| j | column index in the output (width) |
| di, dj | row and column offsets inside the kernel |
| q | input channel |
| k | output channel |
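To make the indices concrete, a single output element can be evaluated term by term straight from the sum. This is only an illustrative sketch: the function name `output_element` is hypothetical, and it assumes NHWC input with VALID padding.

```python
import numpy as np

def output_element(x, w, strides, b, i, j, k):
    """Evaluate output[b, i, j, k] by looping over di, dj and q (VALID padding)."""
    f_h, f_w, in_c, _ = w.shape
    total = 0.0
    for di in range(f_h):          # kernel row offset
        for dj in range(f_w):      # kernel column offset
            for q in range(in_c):  # input channel
                total += (
                    x[b, strides[1] * i + di, strides[2] * j + dj, q]
                    * w[di, dj, q, k]
                )
    return total

# 3x3 single-channel input and kernel, both holding the values 0..8.
x = np.arange(9, dtype=float).reshape(1, 3, 3, 1)
w = np.arange(9, dtype=float).reshape(3, 3, 1, 1)
print(output_element(x, w, [1, 1, 1, 1], 0, 0, 0, 0))  # 204.0
```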

The implementation looks roughly like this:

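import numpy as np 

def conv2d(
    input,
    filter,
    strides,
    padding=None
):
    # Only padding='VALID' is implemented; the padding argument is ignored.
    # Accept nested lists as well as arrays.
    input = np.asarray(input, dtype=float)
    filter = np.asarray(filter, dtype=float)
    ish = input.shape    # [batch, height, width, in_channels]
    fsh = filter.shape   # [filter_height, filter_width, in_channels, out_channels]
    output = np.zeros([ish[0],(ish[1]-fsh[0])//strides[1]+1,(ish[2]-fsh[1])//strides[2]+1,fsh[3]])
    osh = output.shape
    for m in range(osh[0]):                   # sample in the batch
        for i in range(osh[1]):               # output row
            for j in range(osh[2]):           # output column
                for di in range(fsh[0]):      # kernel row
                    for dj in range(fsh[1]):  # kernel column
                        # Contract over input channels:
                        # [in_channels] x [in_channels, out_channels] -> [out_channels]
                        t = np.dot(
                                input[m,strides[1]*i+di,strides[2]*j+dj,:],
                                filter[di,dj,:,:]
                            )
                        # Accumulate this kernel position's contribution.
                        output[m,i,j] += t
    return output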
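The two innermost loops and the channel dot product can also be folded into a single contraction per output position. The variant below is a sketch, not the author's code: the name `conv2d_valid_einsum` is hypothetical, and it assumes NHWC input, VALID padding and no dilation.

```python
import numpy as np

def conv2d_valid_einsum(x, w, strides):
    """VALID convolution over NHWC input, one einsum per output position."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    batch, in_h, in_w, _ = x.shape
    f_h, f_w, _, out_c = w.shape
    s_h, s_w = strides[1], strides[2]
    out_h = (in_h - f_h) // s_h + 1
    out_w = (in_w - f_w) // s_w + 1
    out = np.zeros((batch, out_h, out_w, out_c))
    for i in range(out_h):
        for j in range(out_w):
            # Window of shape [batch, f_h, f_w, in_channels] under the kernel.
            window = x[:, s_h * i:s_h * i + f_h, s_w * j:s_w * j + f_w, :]
            # Sum over kernel rows (h), kernel columns (w) and input channels (q).
            out[:, i, j, :] = np.einsum("bhwq,hwqk->bk", window, w)
    return out

x = np.arange(75).reshape(1, 5, 5, 3)
w = np.arange(27).reshape(3, 3, 3, 1)
out = conv2d_valid_einsum(x, w, [1, 2, 2, 1])
print(out.shape)  # (1, 2, 2, 1)
```

Slicing the whole batch at once also removes the explicit loop over samples.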

An example:

# start with a simple case
# input.shape = [1,3,3,1]
input = \
[[[[0],  [1],  [2]],
  [[3],  [4],  [5]],
  [[6],  [7],  [8]]]]
# filter.shape = [3,3,1,1]
filter = \
[[[[0]],  [[1]],  [[2]]],
 [[[3]],  [[4]],  [[5]]],
 [[[6]],  [[7]],  [[8]]]]
# strides = [1,1,1,1]
# with this configuration the kernel fits in exactly one position,
# and the region it covers is the whole input
# the computation is
#   np.dot(input[0,0,0,:], filter[0,0,:,:])   # 0*0
# + np.dot(input[0,0,1,:], filter[0,1,:,:])   # 1*1
# + ...
# + np.dot(input[0,2,2,:], filter[2,2,:,:])   # 8*8
# = 204
# the result is
output = [[[[204.]]]]
# output.shape = [1,1,1,1]
# now a more complex case
# input.shape = [1,5,5,3]
input = \
[[[[ 0, 1, 2],  [ 3, 4, 5],  [ 6, 7, 8],  [ 9,10,11],  [12,13,14]],
  [[15,16,17],  [18,19,20],  [21,22,23],  [24,25,26],  [27,28,29]],
  [[30,31,32],  [33,34,35],  [36,37,38],  [39,40,41],  [42,43,44]],
  [[45,46,47],  [48,49,50],  [51,52,53],  [54,55,56],  [57,58,59]],
  [[60,61,62],  [63,64,65],  [66,67,68],  [69,70,71],  [72,73,74]]]]
# filter.shape = [3,3,3,1]
filter = \
[[[[ 0],   [ 1],   [ 2]],
  [[ 3],   [ 4],   [ 5]],
  [[ 6],   [ 7],   [ 8]]],
 [[[ 9],   [10],   [11]],
  [[12],   [13],   [14]],
  [[15],   [16],   [17]]],
 [[[18],   [19],   [20]],
  [[21],   [22],   [23]],
  [[24],   [25],   [26]]]]
# strides = [1,2,2,1]
# padding = 'VALID'
# with this configuration the kernel fits in exactly 4 positions; the input regions it covers are
input_filter0 = \
[[[ 0, 1, 2],  [ 3, 4, 5],  [ 6, 7, 8]],
 [[15,16,17],  [18,19,20],  [21,22,23]],
 [[30,31,32],  [33,34,35],  [36,37,38]]]
input_filter1 = \
[[[ 6, 7, 8],  [ 9,10,11],  [12,13,14]],
 [[21,22,23],  [24,25,26],  [27,28,29]],
 [[36,37,38],  [39,40,41],  [42,43,44]]]
input_filter2 = \
[[[30,31,32],  [33,34,35],  [36,37,38]],
 [[45,46,47],  [48,49,50],  [51,52,53]],
 [[60,61,62],  [63,64,65],  [66,67,68]]]
input_filter3 = \
[[[36,37,38],  [39,40,41],  [42,43,44]],
 [[51,52,53],  [54,55,56],  [57,58,59]],
 [[66,67,68],  [69,70,71],  [72,73,74]]]
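# each of these four windows is then multiplied element-wise with filter and summed
# taking input_filter0 as an example:
#   np.dot(input_filter0[0,0,:], filter[0,0,:,:])   # 0*0+1*1+2*2
# + np.dot(input_filter0[0,1,:], filter[0,1,:,:])   # 3*3+4*4+5*5
# + ...
# + np.dot(input_filter0[2,2,:], filter[2,2,:,:])   # 36*24+37*25+38*26
# = 9279
# doing the same for the other three windows gives the full result
output = \
[[[[ 9279.], [11385.]],
  [[19809.], [21915.]]]]
# output.shape = [1,2,2,1]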
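The four window sums can be reproduced directly with NumPy slicing. This is a small check rather than a full implementation: it assumes the input and filter are the consecutive integers shown above (so they can be built with `np.arange`), and the names `x`, `w` and `results` are chosen here for illustration.

```python
import numpy as np

# Same values as the 5x5x3 input and the 3x3x3x1 filter in the example.
x = np.arange(75, dtype=float).reshape(1, 5, 5, 3)
w = np.arange(27, dtype=float).reshape(3, 3, 3, 1)
stride = 2

results = []
for i in range(2):        # output rows
    row = []
    for j in range(2):    # output columns
        # 3x3x3 window whose top-left corner is at (stride*i, stride*j).
        window = x[0, stride * i:stride * i + 3, stride * j:stride * j + 3, :]
        # Element-wise product with the single output channel, then sum.
        row.append(float(np.sum(window * w[:, :, :, 0])))
    results.append(row)

print(results)  # [[9279.0, 11385.0], [19809.0, 21915.0]]
```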

Code:
https://github.com/xo1988/numpy_deeplearning/blob/master/conv2d.py
