Classic Neural Networks -- VGG: Design Principles and PyTorch Implementation

Principles

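       VGG network structure (the 16-layer configuration): the input is a 224×224 RGB image, which passes through two 3×3 convolutional layers → a max-pooling layer → two 3×3 convolutional layers → a max-pooling layer → three 3×3 convolutional layers → a max-pooling layer → three 3×3 convolutional layers → a max-pooling layer → three 3×3 convolutional layers → a max-pooling layer → three fully connected layers → softmax, which produces the class probability distribution.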
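The layer sequence above can be traced numerically. The sketch below is only an illustration: the channel widths (64, 128, 256, 512, 512), the padding of 1 on each 3×3 convolution, and the 2×2 stride-2 pooling are taken from the VGG paper's 16-layer configuration rather than from the text above, and `VGG16_LAYOUT` and `trace_shapes` are hypothetical names.

```python
# Numbers are output channels of 3x3 convolutions (padding 1, stride 1,
# so the spatial size is unchanged); "M" is a 2x2 max-pool with stride 2
# (the spatial size is halved).
VGG16_LAYOUT = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
                512, 512, 512, "M", 512, 512, 512, "M"]


def trace_shapes(layout, size=224, channels=3):
    """Return the (channels, height, width) shape after every layer."""
    shapes = []
    for item in layout:
        if item == "M":
            size //= 2          # pooling halves height and width
        else:
            channels = item     # convolution changes only the channel count
        shapes.append((channels, size, size))
    return shapes


shapes = trace_shapes(VGG16_LAYOUT)
final_channels, final_h, final_w = shapes[-1]
flattened = final_channels * final_h * final_w
print(shapes[-1], flattened)  # (512, 7, 7) 25088
```

Five pooling layers reduce 224 to 7, so under these assumptions the first fully connected layer receives 512 × 7 × 7 = 25088 inputs.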

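       Key idea of the network: stacking several 3×3 convolution kernels in place of one large kernel yields the same receptive field while requiring fewer parameters.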
 
       The paper notes that two stacked 3×3 convolutions can replace one 5×5 convolution, and three stacked 3×3 convolutions can replace one 7×7 convolution.
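The arithmetic behind this substitution can be checked with a few lines of Python. This is a minimal sketch that assumes stride-1 convolutions, equal input and output channel counts, and no bias terms; the function names and the channel count `C = 64` are chosen here for illustration only.

```python
def stacked_receptive_field(num_layers, kernel=3):
    # Each additional stride-1 layer widens the receptive field by (kernel - 1).
    return 1 + num_layers * (kernel - 1)


def conv_weights(kernel, channels, num_layers=1):
    # Weight count (biases ignored) for num_layers convolutions that each
    # map `channels` input channels to `channels` output channels.
    return num_layers * kernel * kernel * channels * channels


C = 64  # illustrative channel count
for n, big in [(2, 5), (3, 7)]:
    rf = stacked_receptive_field(n)
    small = conv_weights(3, C, n)
    large = conv_weights(big, C)
    print(f"{n} x (3x3): receptive field {rf}, weights {small}; "
          f"one {big}x{big}: weights {large}")
```

Under these assumptions, two 3×3 layers cover a 5×5 field with 18C² weights instead of 25C², and three 3×3 layers cover a 7×7 field with 27C² weights instead of 49C².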

Code Implementation

# VGGNet Visual Geometry Group

import torch.nn as nn
import torch

# official pretrain weights
model_urls = {
    'vgg11': 'https://download.pytorch.org/models/vgg11-bbd30ac9.pth',
    'vgg13': 'https://download.pytorch.org/models/vgg13-c768596a.pth',
    'vgg16': 'https://download.pytorch.org/models/vgg16-397923af.pth',
    'vgg19': 'https://download.pytorch.org/models/vgg19-dcbb9e9d.pth'
}


class VGG(nn.Module):
    def __init__(self, features, num_classes=1000, init_w
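The class definition above breaks off in the source, so the original body is not recoverable. As a stand-in, here is a minimal sketch of how a VGG class with a `features` argument is commonly structured in PyTorch. Everything in it is an assumption rather than the author's code: the `cfgs` table, the `make_features` helper, the class name `VGGSketch`, the `init_weights` flag standing in for the truncated third parameter, and the Xavier initialization.

```python
import torch
import torch.nn as nn

# Layer layouts: numbers are conv output channels, "M" is a 2x2 max-pool.
cfgs = {
    "vgg11": [64, "M", 128, "M", 256, 256, "M", 512, 512, "M", 512, 512, "M"],
    "vgg16": [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
              512, 512, 512, "M", 512, 512, 512, "M"],
}


def make_features(cfg):
    """Build the convolutional part from a layout list (hypothetical helper)."""
    layers, in_channels = [], 3
    for v in cfg:
        if v == "M":
            layers.append(nn.MaxPool2d(kernel_size=2, stride=2))
        else:
            layers += [nn.Conv2d(in_channels, v, kernel_size=3, padding=1),
                       nn.ReLU(inplace=True)]
            in_channels = v
    return nn.Sequential(*layers)


class VGGSketch(nn.Module):
    def __init__(self, features, num_classes=1000, init_weights=False):
        super().__init__()
        self.features = features
        # Three fully connected layers; assumes a 224x224 input, which the
        # five pooling layers reduce to a 512 x 7 x 7 feature map.
        self.classifier = nn.Sequential(
            nn.Linear(512 * 7 * 7, 4096), nn.ReLU(inplace=True), nn.Dropout(p=0.5),
            nn.Linear(4096, 4096), nn.ReLU(inplace=True), nn.Dropout(p=0.5),
            nn.Linear(4096, num_classes),
        )
        if init_weights:
            self._initialize_weights()

    def forward(self, x):
        x = self.features(x)               # N x 512 x 7 x 7
        x = torch.flatten(x, start_dim=1)  # N x 25088
        return self.classifier(x)          # N x num_classes (logits)

    def _initialize_weights(self):
        # Xavier initialization is an assumption, not taken from the source.
        for m in self.modules():
            if isinstance(m, (nn.Conv2d, nn.Linear)):
                nn.init.xavier_uniform_(m.weight)
                nn.init.zeros_(m.bias)
```

A model could then be built with, for example, `VGGSketch(make_features(cfgs["vgg16"]), num_classes=5, init_weights=True)`. The forward pass returns raw logits; softmax is usually applied by the loss function or at inference time.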
Below is a Grad-CAM implementation in PyTorch based on the VGG16 model:

```python
import cv2
import numpy as np
import torch
from PIL import Image
from torchvision import models, transforms


class GradCAM:
    def __init__(self, model, target_layer):
        self.model = model.eval()
        self.feature_maps = None
        self.gradients = None
        # Capture the target layer's output during the forward pass
        target_layer.register_forward_hook(self._save_feature_maps)

    def _save_feature_maps(self, module, inputs, output):
        self.feature_maps = output
        # Capture the gradient w.r.t. that output during the backward pass
        output.register_hook(self._save_gradients)

    def _save_gradients(self, grad):
        self.gradients = grad

    def generate(self, x, idx=None):
        # Forward pass; default to the predicted class
        logits = self.model(x)
        if idx is None:
            idx = logits.argmax(dim=1).item()
        # Backward pass from the chosen class score
        self.model.zero_grad()
        logits[0, idx].backward()
        # Average the gradients over space to get one weight per channel
        weights = self.gradients.mean(dim=(2, 3), keepdim=True)
        # Weighted sum of the feature maps, keeping positive evidence only
        cam = torch.relu((weights * self.feature_maps).sum(dim=1)).squeeze(0)
        cam = cam.detach().cpu().numpy()
        cam /= cam.max() + 1e-8  # normalize to [0, 1]
        return cam, idx


# Load the pre-trained VGG16 model
# (torchvision >= 0.13; older versions use pretrained=True)
model = models.vgg16(weights=models.VGG16_Weights.DEFAULT)

# Create the GradCAM object on the output of the convolutional part
gradcam = GradCAM(model, model.features)

# Load the input image and preprocess it
img = Image.open('input.jpg').convert('RGB').resize((224, 224))
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
input_tensor = transform(img).unsqueeze(0)

# Generate the Grad-CAM heatmap for the predicted class
cam, predicted_idx = gradcam.generate(input_tensor)

# Resize the heatmap to the image size and convert it to an RGB color map
heatmap = cv2.resize(cam, (224, 224))
heatmap = cv2.applyColorMap(np.uint8(255 * heatmap), cv2.COLORMAP_JET)
heatmap = cv2.cvtColor(heatmap, cv2.COLOR_BGR2RGB)

# Superimpose the heatmap on the (un-normalized) input image and save it
superimposed_img = np.uint8(0.5 * np.asarray(img) + 0.5 * heatmap)
Image.fromarray(superimposed_img).save('output.jpg')
```

This code covers loading the pre-trained VGG16 model, the Grad-CAM computation, preprocessing of the input image, and saving the result. Replace `input.jpg` with your own input image and run the code to obtain the Grad-CAM visualization `output.jpg`.
