torch.nn.Conv3d
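Conv3d constructor parameters:
- in_channels (int) – Number of channels in the input image
- out_channels (int) – Number of channels produced by the convolution
- kernel_size (int or tuple) – Size of the convolution kernel, e.g. (3, 7, 7)
- stride (int or tuple, optional) – Stride of the convolution. Default: 1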
- padding (int, tuple or str, optional) – Padding added to all six sides of the input. Default: 0
- padding_mode (string, optional) – ‘zeros’, ‘reflect’, ‘replicate’ or ‘circular’. Default: ‘zeros’
- dilation (int or tuple, optional) – Spacing between kernel elements. Default: 1
- groups (int, optional) – Number of blocked connections from input channels to output channels. Default: 1
- bias (bool, optional) – If True, adds a learnable bias to the output. Default: True
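The parameters above can be illustrated with a short sketch. The channel counts and sizes below are arbitrary values chosen for illustration; they are not taken from the original post. It shows how an int argument is expanded to all three dimensions, and how `groups` and `bias` affect the learnable parameters.

```python
import torch.nn as nn

# Illustrative values only (assumed for this sketch).
# An int applies the same value to depth, height and width;
# a tuple sets (depth, height, width) separately.
conv_int = nn.Conv3d(in_channels=4, out_channels=8, kernel_size=3)
conv_tuple = nn.Conv3d(in_channels=4, out_channels=8, kernel_size=(3, 7, 7),
                       stride=(1, 2, 2), padding=(1, 3, 3), groups=2, bias=False)

# The weight has shape (out_channels, in_channels // groups, kD, kH, kW).
print(conv_int.kernel_size)     # (3, 3, 3)
print(conv_int.weight.shape)    # torch.Size([8, 4, 3, 3, 3])
print(conv_tuple.weight.shape)  # torch.Size([8, 2, 3, 7, 7])
print(conv_tuple.bias)          # None, because bias=False
```

Note that `in_channels` and `out_channels` must both be divisible by `groups`.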
Input/output shapes:
- Input: $(N, C_{in}, D_{in}, H_{in}, W_{in})$
$N$ - batch size
$C_{in}$ - number of channels in the input image
$D_{in}$ - number of input frames (depth)
$H_{in}$ - height of the input image
$W_{in}$ - width of the input image
- Output: $(N, C_{out}, D_{out}, H_{out}, W_{out})$
Recall the basic formula for the output size of a convolution:
$$H_{out} = \left\lfloor \frac{H_{in} + 2 \times \text{padding} - \text{kernel\_size}}{\text{stride}} + 1 \right\rfloor$$
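As a quick worked example of the formula above, take an input height of 60, a kernel size of 7 and no padding. The first line uses stride 1. The second uses stride 2, a value assumed here only to show the effect of the floor.

```latex
H_{out} = \left\lfloor \frac{60 + 2 \times 0 - 7}{1} + 1 \right\rfloor = \lfloor 54 \rfloor = 54

H_{out} = \left\lfloor \frac{60 + 2 \times 0 - 7}{2} + 1 \right\rfloor = \lfloor 27.5 \rfloor = 27
```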
The output length along each dimension is then:
$$D_{out} = \left\lfloor \frac{D_{in} + 2 \times \text{padding}[0] - \text{dilation}[0] \times (\text{kernel\_size}[0] - 1) - 1}{\text{stride}[0]} + 1 \right\rfloor$$
$$H_{out} = \left\lfloor \frac{H_{in} + 2 \times \text{padding}[1] - \text{dilation}[1] \times (\text{kernel\_size}[1] - 1) - 1}{\text{stride}[1]} + 1 \right\rfloor$$
$$W_{out} = \left\lfloor \frac{W_{in} + 2 \times \text{padding}[2] - \text{dilation}[2] \times (\text{kernel\_size}[2] - 1) - 1}{\text{stride}[2]} + 1 \right\rfloor$$
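The three per-dimension formulas can be sketched as a small pure-Python helper. The names `conv3d_output_shape` and `_triple` are hypothetical, introduced only for this illustration; they are not part of PyTorch's public API. The sketch assumes integer padding, so string values such as 'same' are not handled.

```python
def _triple(value):
    """Expand an int to a (depth, height, width) tuple; pass tuples through."""
    return (value,) * 3 if isinstance(value, int) else tuple(value)


def conv3d_output_shape(in_size, kernel_size, stride=1, padding=0, dilation=1):
    """Return (D_out, H_out, W_out) for an input of size (D_in, H_in, W_in).

    Hypothetical helper: applies the per-dimension formula with integer
    floor division, which matches the floor in the equations.
    """
    kernel_size, stride, padding, dilation = map(
        _triple, (kernel_size, stride, padding, dilation)
    )
    return tuple(
        (n + 2 * p - d * (k - 1) - 1) // s + 1
        for n, k, s, p, d in zip(in_size, kernel_size, stride, padding, dilation)
    )


# 7 frames of 60x40 with a (3, 7, 7) kernel, stride 1, no padding.
print(conv3d_output_shape((7, 60, 40), (3, 7, 7)))  # (5, 54, 34)
# Same input with spatial stride 2 and padding (1, 3, 3).
print(conv3d_output_shape((7, 60, 40), (3, 7, 7), stride=(1, 2, 2), padding=(1, 3, 3)))  # (7, 30, 20)
```

With dilation 1, the term `dilation × (kernel_size − 1) + 1` reduces to `kernel_size`, so the result agrees with the basic formula.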
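Demo example
import torch
import torch.nn as nn

# The first value of kernel_size is the number of frames covered by the kernel
# at each position (temporal depth); the other two are the spatial kernel size.
# Arguments: in_channels, out_channels, kernel_size, stride, padding
m = nn.Conv3d(3, 3, (3, 7, 7), stride=1, padding=0)

# Input shape: (batch_size, C_in, number of frames, height, width).
# torch.randn already returns a Tensor, so autograd.Variable is not needed.
x = torch.randn(1, 3, 7, 60, 40)

# Output size per dimension (with dilation = 1):
# out = (in + 2 * padding - kernel_size) // stride + 1
print(x.shape)
output = m(x)
print(output.size())
# Output: torch.Size([1, 3, 5, 54, 34])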