Model structure of the action recognition model R(2+1)D

Contents
- R(2+1)D code link and paper
- R(2+1)D model structure
- SpatioTemporalConv module structure
- SpatioTemporalResLayer module structure
- SpatioTemporalResBlock module structure

R(2+1)D code link and paper

Code: https://github.com/jfzhang95/pytorch-video-recognition
Paper: "A Closer Look at Spatiotemporal Convolutions for Action Recognition"

It is best to read this post alongside the code.

R(2+1)D model structure

R(2+1)D model structure diagram, with block_type=SpatioTemporalResBlock and layer_size=[2,2,2,2]. N is the batch size, and every shape is given as (N, channels, frames, height, width).

inputs: (N,3,16,112,112)
1. SpatioTemporalConv(3,64,(1,7,7),stride=(1,2,2),padding=(0,3,3),first_conv=True)
   outputs: (N,64,16,56,56)
2. SpatioTemporalResLayer(64,64,3,layer_size[0],block_type=block_type)
   outputs: (N,64,16,56,56)
3. SpatioTemporalResLayer(64,128,3,layer_size[1],block_type=block_type,downsample=True)
   outputs: (N,128,8,28,28)
4. SpatioTemporalResLayer(128,256,3,layer_size[2],block_type=block_type,downsample=True)
   outputs: (N,256,4,14,14)
5. SpatioTemporalResLayer(256,512,3,layer_size[3],block_type=block_type,downsample=True)
   outputs: (N,512,2,7,7)
6. AdaptiveAvgPool3d
   outputs: (N,512,1,1,1)
7. View(-1,512)
   outputs: (N,512)
8. Softmax
9. Max Index
   outputs: (N,1)
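
The output shapes listed in the structure diagram can be checked by hand with the standard convolution size formula, floor((size + 2*padding - kernel) / stride) + 1. The following is a minimal sketch in plain Python, not the repository's code. The helper names `conv_out` and `trace_shapes` are hypothetical, and it assumes that each downsampling layer uses kernel size 3, padding 1 and stride 2 on all three axes, which is consistent with the shapes in the diagram.

```python
def conv_out(size, kernel, stride, padding):
    """Output length along one axis of a convolution (dilation 1, floor rounding)."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_shapes(t=16, h=112, w=112):
    """Return the (C, T, H, W) shape after each stage, up to the pooling layer."""
    shapes = []

    # Stem: kernel (1, 7, 7), stride (1, 2, 2), padding (0, 3, 3), as in the diagram.
    t = conv_out(t, 1, 1, 0)
    h = conv_out(h, 7, 2, 3)
    w = conv_out(w, 7, 2, 3)
    shapes.append((64, t, h, w))

    # Four residual layers; only the last three have downsample=True.
    stages = [(64, False), (128, True), (256, True), (512, True)]
    for channels, downsample in stages:
        if downsample:
            # Assumption: kernel 3, padding 1, stride 2 on T, H and W.
            t, h, w = (conv_out(d, 3, 2, 1) for d in (t, h, w))
        shapes.append((channels, t, h, w))

    # Adaptive average pooling to output size 1 collapses T, H and W.
    shapes.append((512, 1, 1, 1))
    return shapes


if __name__ == "__main__":
    for shape in trace_shapes():
        print(shape)
```

Each tuple omits the batch dimension N, since no layer changes it. Halving frames, height and width together at each downsampling layer is what takes the 16-frame, 112x112 clip down to 2x7x7 before pooling.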