This post walks through what TensorFlow's tf.nn.conv2d actually computes.
First, the definition of conv2d:
conv2d(
    input,   # a 4-D tensor; the meaning of each dimension is given by data_format
    filter,  # the convolution kernel, a 4-D tensor of shape [filter_height, filter_width, in_channels, out_channels]
    strides, # the step of the sliding window per input dimension, [1, stride_h, stride_w, 1] => [1, vertical step, horizontal step, 1]
    padding, # either "SAME" or "VALID". "VALID" means the kernel only slides inside the image (no padding); "SAME" zero-pads the border so the kernel can also cover positions near the edge.
    use_cudnn_on_gpu=True,
    data_format='NHWC', # layout of input; the default NHWC means [batch, height, width, channels]
    dilations=[1, 1, 1, 1], # introduced in TF 1.5; ignored here
    name=None
)
The output of conv2d is another 4-D tensor whose dimensions mean [batch, height_after, width_after, out_channels],
where height_after and width_after are roughly:
padding | height_after | width_after |
---|---|---|
SAME | ⌈height / stride_h⌉ | ⌈width / stride_w⌉ |
VALID | ⌊(height − filter_height) / stride_h⌋ + 1 | ⌊(width − filter_width) / stride_w⌋ + 1 |
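As a quick sanity check, the table above can be written directly as a helper (the function name conv2d_output_size is made up for illustration):

```python
import math

def conv2d_output_size(size, filter_size, stride, padding):
    # output size along one spatial dimension, per the table above
    if padding == 'SAME':
        return math.ceil(size / stride)            # border is zero-padded
    if padding == 'VALID':
        return (size - filter_size) // stride + 1  # kernel stays inside
    raise ValueError(padding)

# a 5-wide input, 3-wide kernel, stride 2:
conv2d_output_size(5, 3, 2, 'VALID')  # -> 2
conv2d_output_size(5, 3, 2, 'SAME')   # -> 3
```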
The simple implementation below handles only padding=VALID, and ignores many corner cases.
The computation is roughly:
output[b,i,j,k] = ∑_{di=0}^{filter_height−1} ∑_{dj=0}^{filter_width−1} ∑_{q=0}^{in_channels−1} input[b, strides[1]*i+di, strides[2]*j+dj, q] * filter[di, dj, q, k]
where
symbol | meaning |
---|---|
b | batch index |
i / j | output row / column index |
di / dj | filter row / column index |
q | input channel index |
k | output channel index |
A NumPy implementation looks roughly like this:
import numpy as np

def conv2d(input, filter, strides, padding=None):  # only padding='VALID' is handled
    ish = input.shape   # [batch, height, width, in_channels]
    fsh = filter.shape  # [filter_height, filter_width, in_channels, out_channels]
    # output shape for padding='VALID'
    output = np.zeros([ish[0],
                       (ish[1] - fsh[0]) // strides[1] + 1,
                       (ish[2] - fsh[1]) // strides[2] + 1,
                       fsh[3]])
    osh = output.shape
    for m in range(osh[0]):          # batch
        for i in range(osh[1]):      # output row
            for j in range(osh[2]):  # output column
                for di in range(fsh[0]):      # filter row
                    for dj in range(fsh[1]):  # filter column
                        # dot product over in_channels, one value per out_channel
                        output[m, i, j] += np.dot(
                            input[m, strides[1]*i + di, strides[2]*j + dj, :],
                            filter[di, dj, :, :]
                        )
    return output
An example:
# a simple case first
# input.shape = [1,3,3,1]
input = \
[[[0], [1], [2]],
[[3], [4], [5]],
[[6], [7], [8]]]
# filter.shape = [3,3,1,1]
filter = \
[[[[0]], [[1]], [[2]]],
[[[3]], [[4]], [[5]]],
[[[6]], [[7]], [[8]]]]
# strides = [1,1,1,1]
# with this configuration the kernel fits exactly once, and the input region it covers is the whole input
# the computation is
input[0,0,:]*filter[0,0,:,:]+\ # 0*0
input[0,1,:]*filter[0,1,:,:]+\ # 1*1
...+ \
input[2,2,:]*filter[2,2,:,:] # 8*8
= 204
# the result is
output = [[[[204.]]]]
output.shape = [1,1,1,1]
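The 204 above can be cross-checked with np.tensordot, which contracts the 3x3x1 input against the kernel over all three leading axes (this is just a verification, not the implementation itself):

```python
import numpy as np

inp = np.arange(9).reshape(3, 3, 1)     # the 3x3x1 input above
ker = np.arange(9).reshape(3, 3, 1, 1)  # the 3x3x1x1 filter above
# contract over (height, width, in_channels); what remains is out_channels
result = np.tensordot(inp, ker, axes=3)
print(result)  # [204]
```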
# now a more complex case
# input.shape = [1,5,5,3]
input = \
[[[ 0, 1, 2], [ 3, 4, 5], [ 6, 7, 8], [ 9,10,11], [12,13,14]],
[[15,16,17], [18,19,20], [21,22,23], [24,25,26], [27,28,29]],
[[30,31,32], [33,34,35], [36,37,38], [39,40,41], [42,43,44]],
[[45,46,47], [48,49,50], [51,52,53], [54,55,56], [57,58,59]],
[[60,61,62], [63,64,65], [66,67,68], [69,70,71], [72,73,74]]]
# filter.shape = [3,3,3,1]
filter = \
[[[[ 0], [ 1], [ 2]],
[[ 3], [ 4], [ 5]],
[[ 6], [ 7], [ 8]]],
[[[ 9], [10], [11]],
[[12], [13], [14]],
[[15], [16], [17]]],
[[[18], [19], [20]],
[[21], [22], [23]],
[[24], [25], [26]]]]
# strides = [1,2,2,1]
# padding = 'VALID'
# with this configuration the kernel fits exactly 4 times; the input regions it covers are
input_filter0 = \
[[[ 0, 1, 2], [ 3, 4, 5], [ 6, 7, 8]],
[[15,16,17], [18,19,20], [21,22,23]],
[[30,31,32], [33,34,35], [36,37,38]]]
input_filter1 = \
[[[ 6, 7, 8], [ 9,10,11], [12,13,14]],
[[21,22,23], [24,25,26], [27,28,29]],
[[36,37,38], [39,40,41], [42,43,44]]]
input_filter2 = \
[[[30,31,32], [33,34,35], [36,37,38]],
[[45,46,47], [48,49,50], [51,52,53]],
[[60,61,62], [63,64,65], [66,67,68]]]
input_filter3 = \
[[[36,37,38], [39,40,41], [42,43,44]],
[[51,52,53], [54,55,56], [57,58,59]],
[[66,67,68], [69,70,71], [72,73,74]]]
# each of these four patches is then multiplied with filter and summed
# taking input_filter0 as an example
input_filter0[0,0,:]*filter[0,0,:,:]+\ # 0*0+1*1+2*2
input_filter0[0,1,:]*filter[0,1,:,:]+\ # 3*3+4*4+5*5
...+ \
input_filter0[2,2,:]*filter[2,2,:,:] # 36*24+37*25+38*26
= 9279
# doing the same for all four patches gives the overall result
[[[[ 9279.][11385.]]
[[19809.][21915.]]]]
output.shape = [1,2,2,1]
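The full 1x2x2x1 result can likewise be cross-checked in a few lines of NumPy, extracting each stride-2 patch and contracting it against the kernel:

```python
import numpy as np

inp = np.arange(75).reshape(1, 5, 5, 3)  # the 1x5x5x3 input above
ker = np.arange(27).reshape(3, 3, 3, 1)  # the 3x3x3x1 filter above
out = np.zeros((1, 2, 2, 1))
for i in range(2):
    for j in range(2):
        patch = inp[0, 2*i:2*i+3, 2*j:2*j+3, :]  # stride 2 in both directions
        out[0, i, j] = np.tensordot(patch, ker, axes=3)
print(out[0, :, :, 0])
```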
Code:
https://github.com/xo1988/numpy_deeplearning/blob/master/conv2d.py