Code (link): the forward method of class ResNet (source code).
Note that self.maxpool uses ks=3.
ResNet34
(224, 224, 3)
→【self.conv1, Cout=64, ks=7, s=2, p=3】→(112, 112, 64)
→【self.bn1】→【self.relu】→(112, 112, 64)
→【self.maxpool, ks=3, s=2, p=1】→(56, 56, 64)
→【self.layer1】→(56, 56, 64)
→【self.layer2】→(28, 28, 128)
→【self.layer3】→(14, 14, 256)
→【self.layer4】→(7, 7, 512)
→【self.avgpool, Adaptive, output_size=(1, 1)】→(1, 1, 512)
→【reshape】→(512,)→【self.fc】→(1000,)
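The spatial sizes in the flow above follow from the usual output-size formula for convolution and pooling layers, out = floor((in + 2p - ks) / s) + 1. Below is a minimal sketch in plain Python that reproduces them; the helper name out_size is hypothetical and not part of any library.

```python
def out_size(n, ks, s=1, p=0):
    """Spatial output size of a conv/pool layer (kernel ks, stride s, padding p)."""
    return (n + 2 * p - ks) // s + 1

after_conv1 = out_size(224, ks=7, s=2, p=3)            # self.conv1
after_maxpool = out_size(after_conv1, ks=3, s=2, p=1)  # self.maxpool

# layer1 keeps the size; layer2 to layer4 each start with a 3x3 conv, s=2, p=1.
stages = [after_maxpool]
for _ in range(3):
    stages.append(out_size(stages[-1], ks=3, s=2, p=1))

print(after_conv1, after_maxpool, stages)  # 112 56 [56, 28, 14, 7]
```

With ks=3, s=2, p=1 the max pool halves 112 exactly to 56, which is why the kernel size of self.maxpool is worth noting.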
self.layer1 contains 3 Identity Blocks; each Identity Block has the following structure
(56, 56, 64)
Branch 1: →【Conv, Cout=64, ks=3, s=1, p=1】→【bn】→【relu】→(56, 56, 64)
→【Conv, Cout=64, ks=3, s=1, p=1】→【bn】→(56, 56, 64)
Branch 2: →【identity mapping】→(56, 56, 64)
Merge (add): →(56, 56, 64)→【relu】→(56, 56, 64)
self.layer2 contains 1 Conv Block and 3 Identity Blocks; the Conv Block has the following structure
(56, 56, 64)
Branch 1: →【Conv, Cout=128, ks=3, s=2, p=1】→【bn】→【relu】→(28, 28, 128)
→【Conv, Cout=128, ks=3, s=1, p=1】→【bn】→(28, 28, 128)
Branch 2 (downsample): →【Conv, Cout=128, ks=1, s=2】→【bn】→(28, 28, 128)
Merge (add): →(28, 28, 128)→【relu】→(28, 28, 128)
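The add at the end of the Conv Block only works if both branches produce the same shape. The sketch below traces (H, W, C) through each branch in plain Python; the helper conv is a hypothetical stand-in for a conv layer followed by bn/relu, which do not change the shape.

```python
def conv(shape, cout, ks, s=1, p=0):
    """Return the (H, W, C) shape after a conv layer; bn and relu keep the shape."""
    h, w, _ = shape
    return ((h + 2 * p - ks) // s + 1, (w + 2 * p - ks) // s + 1, cout)

x = (56, 56, 64)
# Branch 1: 3x3 conv with stride 2, then 3x3 conv with stride 1.
branch1 = conv(conv(x, 128, ks=3, s=2, p=1), 128, ks=3, s=1, p=1)
# Branch 2 (downsample): 1x1 conv with stride 2 and no padding.
branch2 = conv(x, 128, ks=1, s=2)

print(branch1, branch2)  # (28, 28, 128) (28, 28, 128)
```

An identity shortcut would leave the input at (56, 56, 64), which cannot be added to (28, 28, 128); this is why the shortcut needs its own strided 1x1 convolution here.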
ResNet50
(224, 224, 3)
→【self.conv1, Cout=64, ks=7, s=2, p=3】→(112, 112, 64)
→【self.bn1】→【self.relu】→(112, 112, 64)
→【self.maxpool, ks=3, s=2, p=1】→(56, 56, 64)
→【self.layer1】→(56, 56, 256)
→【self.layer2】→(28, 28, 512)
→【self.layer3】→(14, 14, 1024)
→【self.layer4】→(7, 7, 2048)
→【self.avgpool, Adaptive, output_size=(1, 1)】→(1, 1, 2048)
→【reshape】→(2048,)→【self.fc】→(1000,)
The first Block of self.layer1 is special: the spatial dimensions stay the same, while the channel count increases from 64 to 256
(56, 56, 64)
Branch 1: →【Conv, Cout=64, ks=1, s=1】→【bn】→【relu】→(56, 56, 64)
→【Conv, Cout=64, ks=3, s=1, p=1】→【bn】→【relu】→(56, 56, 64)
→【Conv, Cout=256, ks=1, s=1】→【bn】→(56, 56, 256)
Branch 2: →【Conv, Cout=256, ks=1, s=1】→【bn】→(56, 56, 256)
Merge (add): →(56, 56, 256)→【relu】→(56, 56, 256)
It is followed by 2 Bottleneck-type Identity Blocks with the following structure
(56, 56, 256)
Branch 1: →【Conv, Cout=64, ks=1, s=1】→【bn】→【relu】→(56, 56, 64)
→【Conv, Cout=64, ks=3, s=1, p=1】→【bn】→【relu】→(56, 56, 64)
→【Conv, Cout=256, ks=1, s=1】→【bn】→(56, 56, 256)
Branch 2: →【identity mapping】→(56, 56, 256)
Merge (add): →(56, 56, 256)→【relu】→(56, 56, 256)
self.layer2 contains 1 Bottleneck-type Conv Block and 3 Bottleneck-type Identity Blocks
The Bottleneck-type Conv Block has the following structure
(56, 56, 256)
Branch 1: →【Conv, Cout=128, ks=1, s=1】→【bn】→【relu】→(56, 56, 128)
→【Conv, Cout=128, ks=3, s=2, p=1】→【bn】→【relu】→(28, 28, 128)
→【Conv, Cout=512, ks=1, s=1】→【bn】→(28, 28, 512)
Branch 2 (downsample): →【Conv, Cout=512, ks=1, s=2】→【bn】→(28, 28, 512)
Merge (add): →(28, 28, 512)→【relu】→(28, 28, 512)
The Bottleneck-type Identity Block has the following structure
(28, 28, 512)
Branch 1: →【Conv, Cout=128, ks=1, s=1】→【bn】→【relu】→(28, 28, 128)
→【Conv, Cout=128, ks=3, s=1, p=1】→【bn】→【relu】→(28, 28, 128)
→【Conv, Cout=512, ks=1, s=1】→【bn】→(28, 28, 512)
Branch 2: →【identity mapping】→(28, 28, 512)
Merge (add): →(28, 28, 512)→【relu】→(28, 28, 512)
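In every Bottleneck block listed above, the first 1x1 conv and the 3x3 conv run at a narrow width, and only the last 1x1 conv expands to the block's output width. A small sketch of that pattern, assuming an expansion factor of 4 as in the blocks above (the function name bottleneck_channels is hypothetical):

```python
def bottleneck_channels(planes, expansion=4):
    """Output channels of the 1x1, 3x3 and 1x1 convs in one Bottleneck block."""
    return (planes, planes, planes * expansion)

layer1_block = bottleneck_channels(64)
layer2_block = bottleneck_channels(128)

print(layer1_block, layer2_block)  # (64, 64, 256) (128, 128, 512)
```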
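The second argument of _make_layer specifies the channel count; the actual number of output channels is that value multiplied by block.expansion (BasicBlock.expansion=1, Bottleneck.expansion=4)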
stride=2 means that the first Block in the stack is a Conv Block (the downsampling block described above)
self.layer1 = self._make_layer(block, 64, layers[0])
self.layer2 = self._make_layer(block, 128, layers[1], stride=2)
self.layer3 = self._make_layer(block, 256, layers[2], stride=2)
self.layer4 = self._make_layer(block, 512, layers[3], stride=2)
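The bookkeeping behind these four calls can be sketched without any tensors. The function plan_layers below is a hypothetical helper, not the library's code. It assumes the block counts [3, 4, 6, 3] used by both ResNet34 and ResNet50, and it assumes the first block of a stage gets a projection shortcut whenever the stride is not 1 or the input channels differ from planes * expansion.

```python
def plan_layers(expansion, layers, stem_channels=64):
    """Return (num_blocks, out_channels, first_block_needs_downsample) per stage."""
    inplanes = stem_channels
    plan = []
    for planes, blocks, stride in zip((64, 128, 256, 512), layers, (1, 2, 2, 2)):
        out_channels = planes * expansion
        # Assumed rule: project the shortcut when the output shape changes.
        needs_downsample = stride != 1 or inplanes != out_channels
        plan.append((blocks, out_channels, needs_downsample))
        inplanes = out_channels  # input width of the next stage
    return plan

resnet34 = plan_layers(expansion=1, layers=[3, 4, 6, 3])  # BasicBlock
resnet50 = plan_layers(expansion=4, layers=[3, 4, 6, 3])  # Bottleneck

print(resnet34)
print(resnet50)
```

Under these assumptions, layer1 of ResNet34 needs no projection (64 in, 64 out, stride 1), while layer1 of ResNet50 does (64 in, 256 out), which matches the special first Block described for ResNet50.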