[Deep Learning] Using torch.nn.Conv2d, and the difference between filter and kernel

CrazyCoder1992

Article link: https://blog.csdn.net/codeman_cdb/article/details/101383830

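While learning PyTorch over the past few days, I noticed that torch.nn.Conv2d requires out_channels to be specified, which puzzled me. In TensorFlow you only pass input and filter. I had assumed that a filter is two-dimensional and produces one new feature map for every input feature map, which would give out_channels = in_channels * filter_count. PyTorch, however, takes in_channels, out_channels and kernel_size together, and nothing guarantees that out_channels is an integer multiple of in_channels, which is what confused me.

Reading the TensorFlow source, I found that filter is not two-dimensional. The source describes filter as follows: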

filter: A `Tensor`. Must have the same type as `input`.
      A 4-D tensor of shape
      `[filter_height, filter_width, in_channels, out_channels]`

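The filter argument of tensorflow.nn.conv2d is four-dimensional, and it also carries both in_channels and out_channels, which did not match my earlier understanding of filters in convolutional networks. After going back over some material on convolutional networks, I realized that the filter taking part in the convolution is not 2-D: if the input is 3-D, the filter must also be 3-D, and each filter produces exactly one feature map from the input. It is not the case, as I had assumed, that a single 2-D filter produces one feature map for every channel of the input.

So the correct reading of tensorflow.nn.conv2d is:

input: [batch_size, feature_height, feature_width, in_channels]

batch_size is the size of one training batch, that is, how many images are fed in at once; the middle two entries are the height and width of the feature map; in_channels is the number of channels, e.g. 1 for a grayscale image and 3 for an RGB image.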

filter: [filter_height, filter_width, in_channels, out_channels]

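The first two entries are the filter's height and width (not the feature map's); in_channels is the number of channels and must match that of input, otherwise execution raises an error; out_channels is the number of output channels, which is simply the number of filters taking part in the convolution.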
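To make the shapes concrete, here is a minimal NumPy sketch of the same computation. It is not TensorFlow's implementation: the helper name conv2d_nhwc is made up for illustration, and it assumes stride 1 and no padding. It only shows that each slice filters[:, :, :, k] is a 3-D filter that yields exactly one output feature map.

```python
import numpy as np


def conv2d_nhwc(inputs, filters):
    """Naive stride-1, no-padding 2-D convolution (illustrative helper, not a library API).

    inputs:  [batch_size, feature_height, feature_width, in_channels]
    filters: [filter_height, filter_width, in_channels, out_channels]
    """
    batch, in_h, in_w, in_c = inputs.shape
    f_h, f_w, f_in_c, out_c = filters.shape
    if in_c != f_in_c:
        raise ValueError("in_channels of input and filter must match")

    out_h = in_h - f_h + 1
    out_w = in_w - f_w + 1
    outputs = np.zeros((batch, out_h, out_w, out_c))

    for k in range(out_c):
        # One filter is 3-D: [filter_height, filter_width, in_channels].
        single_filter = filters[:, :, :, k]
        for i in range(out_h):
            for j in range(out_w):
                # Patch covers all input channels at once: [batch, f_h, f_w, in_c].
                patch = inputs[:, i:i + f_h, j:j + f_w, :]
                # Summing over height, width AND channels gives one value per image,
                # so filter k contributes exactly one feature map.
                outputs[:, i, j, k] = np.sum(patch * single_filter, axis=(1, 2, 3))
    return outputs


rng = np.random.default_rng(0)
images = rng.standard_normal((2, 8, 8, 3))    # 2 RGB images of size 8x8
filters = rng.standard_normal((3, 3, 3, 5))   # 5 filters, each 3x3 over 3 channels
feature_maps = conv2d_nhwc(images, filters)
print(feature_maps.shape)  # (2, 6, 6, 5)
```

The last dimension of the output is 5, the number of filters, and is unrelated to in_channels, so out_channels does not have to be a multiple of in_channels.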

With tensorflow.nn.conv2d understood, look again at the definition of torch.nn.Conv2d:

'''
	...
	kernel_size (int or tuple): Size of the convolving kernel
	...
'''
def __init__(self, in_channels, out_channels, kernel_size, stride=1,
					 padding=0, dilation=1, groups=1,
					 bias=True, padding_mode='zeros')

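kernel_size is in fact the filter's height and width, i.e. [filter_height, filter_width]; in_channels is the number of input channels; out_channels is the number of output channels, i.e. the number of filters. The meaning is the same as in TensorFlow.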
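A quick way to check this is to build a layer and inspect its weight tensor. The sizes below (3 input channels, 8 filters, 5x5 kernel, 32x32 images) are arbitrary values chosen for illustration, and the layer keeps the default groups=1.

```python
import torch
import torch.nn as nn

# 8 filters, each spanning all 3 input channels with a 5x5 window.
conv = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=5)

# PyTorch stores the weight as [out_channels, in_channels, kernel_h, kernel_w].
print(conv.weight.shape)  # torch.Size([8, 3, 5, 5])
print(conv.bias.shape)    # torch.Size([8])

# PyTorch inputs are [batch_size, in_channels, height, width].
images = torch.randn(2, 3, 32, 32)
feature_maps = conv(images)
print(feature_maps.shape)  # torch.Size([2, 8, 28, 28])

# Reordering the axes gives the TensorFlow filter layout
# [filter_height, filter_width, in_channels, out_channels].
tf_style_filter = conv.weight.permute(2, 3, 1, 0)
print(tf_style_filter.shape)  # torch.Size([5, 5, 3, 8])
```

The two frameworks hold the same four numbers and differ only in axis order: conv.weight[k] is the k-th 3-D filter, and kernel_size names only its two spatial dimensions.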
