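### FPN (Feature Pyramid Network) Model: Example Implementation
Below is a simple PyTorch implementation of a Feature Pyramid Network (FPN). The code shows how to build a multi-scale feature pyramid whose outputs can be passed to an object detection head.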
```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as models


class FPN(nn.Module):
    def __init__(self, out_channels=256):
        super().__init__()
        # Use ResNet-50 as the backbone to extract base features.
        # torchvision >= 0.13 takes `weights=`; older releases used `pretrained=True`.
        resnet = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
        self.layer1 = nn.Sequential(resnet.conv1, resnet.bn1, resnet.relu, resnet.maxpool, resnet.layer1)
        self.layer2 = resnet.layer2
        self.layer3 = resnet.layer3
        self.layer4 = resnet.layer4

        # Lateral 1x1 convolutions reduce each backbone stage to `out_channels`.
        self.latlayer1 = nn.Conv2d(2048, out_channels, kernel_size=1, stride=1, padding=0)
        self.latlayer2 = nn.Conv2d(1024, out_channels, kernel_size=1, stride=1, padding=0)
        self.latlayer3 = nn.Conv2d(512, out_channels, kernel_size=1, stride=1, padding=0)

        # 3x3 convolutions that smooth each pyramid level after merging.
        self.toplayer = nn.Conv2d(out_channels, out_channels, kernel_size=3, stride=1, padding=1)
        self.smooth1 = nn.Conv2d(out_channels, out_channels, kernel_size=3, stride=1, padding=1)
        self.smooth2 = nn.Conv2d(out_channels, out_channels, kernel_size=3, stride=1, padding=1)

    def _upsample_add(self, x, y):
        """Upsample the coarser map `x` to the spatial size of `y`, then add them."""
        _, _, H, W = y.size()
        return F.interpolate(x, size=(H, W), mode="bilinear", align_corners=False) + y

    def forward(self, x):
        # Bottom-up pathway (backbone stages).
        c1 = self.layer1(x)   # [B, 256, H/4, W/4]
        c2 = self.layer2(c1)  # [B, 512, H/8, W/8]
        c3 = self.layer3(c2)  # [B, 1024, H/16, W/16]
        c4 = self.layer4(c3)  # [B, 2048, H/32, W/32]

        # Top-down pathway with lateral connections.
        p4 = self.latlayer1(c4)                          # lateral connection on the coarsest stage
        p3 = self._upsample_add(p4, self.latlayer2(c3))  # upsample p4, add the lateral of c3
        p2 = self._upsample_add(p3, self.latlayer3(c2))  # upsample p3, add the lateral of c2

        # Smooth each merged map with a 3x3 convolution.
        p4 = self.toplayer(p4)
        p3 = self.smooth1(p3)
        p2 = self.smooth2(p2)
        return p2, p3, p4  # finest to coarsest resolution


# Example usage
if __name__ == "__main__":
    # Fall back to the CPU when no GPU is available.
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model = FPN(out_channels=256).to(device).eval()
    input_tensor = torch.randn(1, 3, 224, 224, device=device)  # dummy image batch
    with torch.no_grad():
        features = model(input_tensor)  # tuple of three feature maps
    # For a 224x224 input: [1, 256, 28, 28], [1, 256, 14, 14], [1, 256, 7, 7]
    print([f.shape for f in features])
```
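The code above implements a basic FPN structure that uses ResNet-50 as the backbone network to extract multi-level feature maps[^1]. Through lateral connections and a top-down pathway, the architecture produces three feature maps (`p2`, `p3`, `p4`) that share the same channel count (`out_channels`) but differ in resolution (strides 8, 16, and 32 relative to the input), which supports multi-scale object detection[^4].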
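Because every pyramid level has the same number of channels, one set of head weights can be applied to all of them. The following is a minimal sketch of that idea, not part of the FPN code above: `SharedHead` and `num_outputs` are hypothetical names chosen for illustration, and the dummy tensors stand in for the FPN outputs of a 224×224 input.

```python
import torch
import torch.nn as nn


class SharedHead(nn.Module):
    """Hypothetical prediction head reused across all pyramid levels."""

    def __init__(self, in_channels=256, num_outputs=4):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, in_channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU(inplace=True)
        self.pred = nn.Conv2d(in_channels, num_outputs, kernel_size=1)

    def forward(self, feature_maps):
        # The same weights process every level; only the spatial size differs.
        return [self.pred(self.relu(self.conv(f))) for f in feature_maps]


# Dummy pyramid levels with the shapes an FPN yields for a 224x224 input.
pyramid = [torch.randn(1, 256, s, s) for s in (28, 14, 7)]
head = SharedHead(in_channels=256, num_outputs=4)
outputs = head(pyramid)
shapes = [tuple(o.shape) for o in outputs]
print(shapes)
```

Each output keeps the spatial size of its input level, so predictions on the finer maps cover smaller objects and those on the coarser maps cover larger ones.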
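#### Key Points
- **Backbone**: A pretrained ResNet-50 serves as the backbone network and extracts the base features of the input image[^3].
- **Lateral Connections**: 1×1 convolutions reduce the channel count of each backbone feature map to a common value (`out_channels`) so that maps from different stages can be fused.
- **Top-Down Pathway**: Low-resolution, semantically strong feature maps are upsampled by interpolation step by step and added to the higher-resolution, lower-level maps to form the final outputs.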
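A single merge step of the lateral connection and top-down pathway can be sketched in isolation. This is a minimal illustration under stated assumptions: the random tensors `c3` and `c4` stand in for two backbone stages with the shapes ResNet-50 would give for a 224×224 input, and the layer names are chosen for this example only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-ins for two backbone outputs (assumed shapes for a 224x224 input).
c3 = torch.randn(1, 1024, 14, 14)  # higher resolution, weaker semantics
c4 = torch.randn(1, 2048, 7, 7)    # lower resolution, stronger semantics

# Lateral 1x1 convolutions bring both maps to 256 channels.
lateral_c3 = nn.Conv2d(1024, 256, kernel_size=1)
lateral_c4 = nn.Conv2d(2048, 256, kernel_size=1)

p4 = lateral_c4(c4)  # [1, 256, 7, 7]

# Top-down step: upsample the coarse map to the finer map's size, then add.
p4_up = F.interpolate(p4, size=c3.shape[-2:], mode="bilinear", align_corners=False)
p3 = lateral_c3(c3) + p4_up  # [1, 256, 14, 14]

print(p4.shape, p3.shape)
```

The element-wise addition only works because the lateral convolutions have already equalized the channel counts and the interpolation has equalized the spatial sizes.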