Faster RCNN-4 (Training Process and Network Construction)

1. The Training Process of Faster RCNN

4-Step Alternating Training. In this paper, we adopt a pragmatic 4-step training algorithm to learn shared features via alternating optimization. In the first step, we train the RPN as described in Section 3.1.3. This network is initialized with an ImageNet-pre-trained model and fine-tuned end-to-end for the region proposal task. In the second step, we train a separate detection network by Fast R-CNN using the proposals generated by the step-1 RPN. This detection network is also initialized by the ImageNet-pre-trained model. At this point the two networks do not share convolutional layers. In the third step, we use the detector network to initialize RPN training, but we fix the shared convolutional layers and only fine-tune the layers unique to RPN. Now the two networks share convolutional layers. Finally, keeping the shared convolutional layers fixed, we fine-tune the unique layers of Fast R-CNN. As such, both networks share the same convolutional layers and form a unified network. A similar alternating training can be run for more iterations, but we have observed negligible improvements.

Faster R-CNN training continues from an already pretrained model. In practice the training process is divided into 4 steps:

  1. Train the RPN network on the pretrained model, corresponding to stage1_rpn_train.pt.
    Use the RPN trained in step 1 to collect proposals, corresponding to rpn_test.pt.
  2. Train the Fast RCNN network for the first time, corresponding to stage1_fast_rcnn_train.pt.
  3. Train the RPN network for the second time, corresponding to stage2_rpn_train.pt.
    Use the RPN trained in step 3 to collect proposals again, corresponding to rpn_test.pt.
  4. Train the Fast RCNN network for the second time, corresponding to stage2_fast_rcnn_train.pt.
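The alternation above can be sketched as plain Python. The stage functions below are hypothetical placeholders standing in for the real .pt stage definitions; here they only record the order in which the stages run.

```python
# Sketch of the 4-step alternating training schedule.
# train_rpn / collect_proposals / train_fast_rcnn are stubs that
# just log the call order; the real work lives in the .pt files.
calls = []

def train_rpn(stage):            # stage{1,2}_rpn_train.pt
    calls.append(f"train_rpn_{stage}")

def collect_proposals(stage):    # rpn_test.pt
    calls.append(f"collect_proposals_{stage}")

def train_fast_rcnn(stage):      # stage{1,2}_fast_rcnn_train.pt
    calls.append(f"train_fast_rcnn_{stage}")

# Step 1: train the RPN from the ImageNet-pretrained model,
# then use it to collect proposals.
train_rpn(1)
collect_proposals(1)
# Step 2: train Fast R-CNN on those proposals (no shared convs yet).
train_fast_rcnn(1)
# Step 3: retrain the RPN with the shared conv layers frozen,
# then collect proposals again.
train_rpn(2)
collect_proposals(2)
# Step 4: fine-tune only Fast R-CNN's unique layers; convs stay frozen.
train_fast_rcnn(2)

print(calls)
```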

Each of the four steps above corresponds to a .pt file, and each .pt file contains all of the convolutional layers. Here is a flowchart drawn by someone else:
[figure: flowchart of the 4-step alternating training process]
There is also an end-to-end training mode, in which there is only one network, written in train.prototxt (Caffe).

2. Building the Faster RCNN Network

Let's look at how the Faster R-CNN network is built in PyTorch.
[figure: overall Faster R-CNN network structure]
The following code, located in trainval_net.py, initializes the network structure.

 # initialize the network here.
 if args.net == 'vgg16':
   fasterRCNN = vgg16(imdb.classes, pretrained=True, class_agnostic=args.class_agnostic)
 elif args.net == 'res101':
   fasterRCNN = resnet(imdb.classes, 101, pretrained=True, class_agnostic=args.class_agnostic)
 elif args.net == 'res50':
   fasterRCNN = resnet(imdb.classes, 50, pretrained=True, class_agnostic=args.class_agnostic)
 elif args.net == 'res152':
   fasterRCNN = resnet(imdb.classes, 152, pretrained=True, class_agnostic=args.class_agnostic)
 else:
   print("network is not defined")
   pdb.set_trace()
   
 fasterRCNN.create_architecture()

Taking VGG as an example, vgg16 is initialized first:

class vgg16(_fasterRCNN):
  def __init__(self, classes, pretrained=False, class_agnostic=False):
    self.model_path = 'data/pretrained_model/vgg16_caffe.pth'
    self.dout_base_model = 512
    self.pretrained = pretrained
    self.class_agnostic = class_agnostic

    _fasterRCNN.__init__(self, classes, class_agnostic)

vgg16 inherits from the _fasterRCNN class and also calls the parent class's constructor:

class _fasterRCNN(nn.Module):
    """ faster RCNN """
    def __init__(self, classes, class_agnostic):
        super(_fasterRCNN, self).__init__()
        self.classes = classes
        self.n_classes = len(classes)
        self.class_agnostic = class_agnostic
        # loss
        self.RCNN_loss_cls = 0
        self.RCNN_loss_bbox = 0

        # define rpn
        self.RCNN_rpn = _RPN(self.dout_base_model)
        self.RCNN_proposal_target = _ProposalTargetLayer(self.n_classes)
        self.RCNN_roi_pool = _RoIPooling(cfg.POOLING_SIZE, cfg.POOLING_SIZE, 1.0/16.0)
        self.RCNN_roi_align = RoIAlignAvg(cfg.POOLING_SIZE, cfg.POOLING_SIZE, 1.0/16.0)

        self.grid_size = cfg.POOLING_SIZE * 2 if cfg.CROP_RESIZE_WITH_MAX_POOL else cfg.POOLING_SIZE
        self.RCNN_roi_crop = _RoICrop()
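Note that vgg16 sets self.dout_base_model = 512 before invoking the parent constructor, because _fasterRCNN.__init__ reads that attribute when building _RPN(self.dout_base_model). The ordering matters; a minimal sketch of the pattern (Parent/Child are illustrative names, not from the repo):

```python
class Parent:
    def __init__(self):
        # Like _fasterRCNN.__init__ building _RPN(self.dout_base_model),
        # the parent reads an attribute the child must set beforehand.
        self.rpn_in_channels = self.dout_base_model

class Child(Parent):
    def __init__(self):
        self.dout_base_model = 512  # must be set before the parent ctor runs
        Parent.__init__(self)       # same explicit call style as the repo

print(Child().rpn_in_channels)  # 512
```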

The parent class constructor defines the RPN and the RoI pooling layers.
After that, fasterRCNN.create_architecture() is executed; this function is in faster_rcnn.py:

    def create_architecture(self):
        self._init_modules()
        self._init_weights()
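The 1.0/16.0 passed to RCNN_roi_pool and RCNN_roi_align earlier is the feature stride of the VGG16 backbone: four 2x2 max-pools before conv5_3 shrink the image by 2**4 = 16 in each dimension, so an RoI given in image pixels must be scaled by 1/16 to land on the feature map. A small illustration (the helper function is mine, not from the repo; RoI Pool and RoI Align round differently, so plain rounding here is only for intuition):

```python
# spatial_scale maps image-pixel coordinates onto feature-map cells.
SPATIAL_SCALE = 1.0 / 16.0  # VGG16: 4 max-pools -> stride 16

def roi_to_feature_coords(x1, y1, x2, y2, scale=SPATIAL_SCALE):
    """Map an RoI (image pixels) to feature-map coordinates."""
    return tuple(round(v * scale) for v in (x1, y1, x2, y2))

print(roi_to_feature_coords(32, 64, 352, 224))  # (2, 4, 22, 14)
```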

The _init_modules called here is in vgg16.py. This code defines the front convolutional layers and the rear fully connected layers, i.e. parts 1 and 2 in the figure below (part 3 is the RPN, which was defined earlier):
[figure: Faster R-CNN modules (1: convolutional backbone, 2: fully connected head, 3: RPN)]

  def _init_modules(self):
    vgg = models.vgg16()
    if self.pretrained:
        print("Loading pretrained weights from %s" %(self.model_path))
        state_dict = torch.load(self.model_path)
        vgg.load_state_dict({k:v for k,v in state_dict.items() if k in vgg.state_dict()})

    vgg.classifier = nn.Sequential(*list(vgg.classifier._modules.values())[:-1])

    # not using the last maxpool layer
    self.RCNN_base = nn.Sequential(*list(vgg.features._modules.values())[:-1])

    # Fix the layers before conv3:
    for layer in range(10):
      for p in self.RCNN_base[layer].parameters(): p.requires_grad = False

    # self.RCNN_base = _RCNN_base(vgg.features, self.classes, self.dout_base_model)

    self.RCNN_top = vgg.classifier

    self.RCNN_cls_score = nn.Linear(4096, self.n_classes)

    if self.class_agnostic:
      self.RCNN_bbox_pred = nn.Linear(4096, 4)
    else:
      self.RCNN_bbox_pred = nn.Linear(4096, 4 * self.n_classes)   
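The width of RCNN_bbox_pred follows directly from the branch above: 4 box-regression deltas in total when class_agnostic is set, otherwise 4 deltas per class. For example, PASCAL VOC's 21 classes (20 objects plus background) give 84 outputs. A quick check of the arithmetic (the helper function is mine):

```python
# Output width of the bbox regression head.
def bbox_pred_out_features(n_classes, class_agnostic):
    return 4 if class_agnostic else 4 * n_classes

# PASCAL VOC: 20 object classes + background = 21
print(bbox_pred_out_features(21, class_agnostic=False))  # 84
print(bbox_pred_out_features(21, class_agnostic=True))   # 4
```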

After that, the learning rate, the optimization method, whether to resume from a checkpoint, whether to use multiple GPUs, whether to use TensorBoard, and so on are configured, and training can begin.

for epoch in range(args.start_epoch, args.max_epochs + 1):
    # setting to train mode
    fasterRCNN.train()
    loss_temp = 0
    start = time.time()

    if epoch % (args.lr_decay_step + 1) == 0:
        adjust_learning_rate(optimizer, args.lr_decay_gamma)
        lr *= args.lr_decay_gamma

    data_iter = iter(dataloader)
    for step in range(iters_per_epoch):
      data = next(data_iter)
      im_data.data.resize_(data[0].size()).copy_(data[0])
      im_info.data.resize_(data[1].size()).copy_(data[1])
      gt_boxes.data.resize_(data[2].size()).copy_(data[2])
      num_boxes.data.resize_(data[3].size()).copy_(data[3])

      fasterRCNN.zero_grad()
      rois, cls_prob, bbox_pred, \
      rpn_loss_cls, rpn_loss_box, \
      RCNN_loss_cls, RCNN_loss_bbox, \
      rois_label = fasterRCNN(im_data, im_info, gt_boxes, num_boxes)

      loss = rpn_loss_cls.mean() + rpn_loss_box.mean() \
           + RCNN_loss_cls.mean() + RCNN_loss_bbox.mean()
      loss_temp += loss.item()

      # backward
      optimizer.zero_grad()
      loss.backward()
      if args.net == "vgg16":
          clip_gradient(fasterRCNN, 10.)
      optimizer.step()
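The learning-rate decay at the top of this loop fires whenever epoch % (lr_decay_step + 1) == 0. With values commonly used for this repo (assumed here: lr=0.001, lr_decay_step=5, lr_decay_gamma=0.1), the first decay therefore happens at epoch 6, not epoch 5:

```python
# Reproduce the epoch-based decay condition from the training loop.
lr, lr_decay_step, lr_decay_gamma = 0.001, 5, 0.1

history = {}
for epoch in range(1, 11):
    if epoch % (lr_decay_step + 1) == 0:  # true at epoch 6 here
        lr *= lr_decay_gamma
    history[epoch] = lr

print(history[5], history[6])  # lr drops between epochs 5 and 6
```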

Each iteration computes the four losses, calls loss.backward() to backpropagate, and lets the optimizer update the weights. This is very similar to the MNIST training loop written earlier:

# begin training
for epoch in range(EPOCH):
  for step, (b_x, b_y) in enumerate(train_loader):
      output = net(b_x)[0]
      # print(output)
      loss = loss_func(output, b_y)
      optimizer.zero_grad()
      loss.backward()
      optimizer.step()

However, the paper describes Faster R-CNN training as four steps, while the code here appears to follow the end-to-end network structure from py-faster-rcnn.

