SSD project

object detection algorithm


Last time, we found that a flame cannot be identified reliably from its RGB and HSI features alone, even when combined with the background segmentation algorithm and the filter we designed for real-time fire detection in video.
So today let's go through the SSD object detection algorithm, which aims to combine the high accuracy of the Faster R-CNN algorithm with the fast detection speed of the YOLO algorithm.

SSD

First, the overall framework of SSD is as follows:
[Figure: overall framework of SSD]
As the figure above shows, SSD uses its backbone network, VGG16, plus some extra convolutional layers to extract features from the input image.
Backbone: VGG16
[Figure: VGG16 backbone]
The implementation of the overall structure is as follows:

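import torch
import torch.nn as nn

# Layer configurations for SSD300.
# In base, a number is the output channel count of a 3x3 convolution,
# 'M' is a 2x2 max pooling, and 'C' is a 2x2 max pooling with ceil_mode=True
# (so that the 75x75 feature map becomes 38x38 instead of 37x37).
base = {
    '300': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'C', 512, 512, 512, 'M',
            512, 512, 512],
}

# In extras, a number is an output channel count, and 'S' marks a stride-2
# convolution whose output channel count is the next entry.
extras = {
    '300': [256, 'S', 512, 128, 'S', 256, 128, 256, 128, 256],
}


def vgg(cfg, i):
    # Build the VGG16 backbone; i is the number of input channels (3 for RGB)
    layers = []
    in_channels = i
    for v in cfg:
        if v == 'M':
            layers += [nn.MaxPool2d(kernel_size=2, stride=2)]
        elif v == 'C':
            layers += [nn.MaxPool2d(kernel_size=2, stride=2, ceil_mode=True)]
        else:
            conv2d = nn.Conv2d(in_channels, v, kernel_size=3, padding=1)
            layers += [conv2d, nn.ReLU(inplace=True)]
            in_channels = v
    # pool5 keeps the spatial size; conv6 (dilated) and conv7 take the place
    # of the fully connected layers fc6 and fc7 of VGG16
    pool5 = nn.MaxPool2d(kernel_size=3, stride=1, padding=1)
    conv6 = nn.Conv2d(512, 1024, kernel_size=3, padding=6, dilation=6)
    conv7 = nn.Conv2d(1024, 1024, kernel_size=1)
    layers += [pool5, conv6,
               nn.ReLU(inplace=True), conv7, nn.ReLU(inplace=True)]
    return layers


def add_extras(cfg, i, batch_norm=False):
    # Extra layers added to VGG for feature scaling
    layers = []
    in_channels = i
    # flag alternates the kernel size: False -> 1x1, True -> 3x3
    flag = False
    for k, v in enumerate(cfg):
        if in_channels != 'S':
            if v == 'S':
                # Stride-2 convolution; its output channels are cfg[k + 1]
                layers += [nn.Conv2d(in_channels, cfg[k + 1],
                           kernel_size=(1, 3)[flag], stride=2, padding=1)]
            else:
                layers += [nn.Conv2d(in_channels, v, kernel_size=(1, 3)[flag])]
            flag = not flag
        in_channels = v
    return layers


# Build the layer lists for SSD300 (3 input channels; conv7 outputs 1024 channels)
vgg_layers = vgg(base['300'], 3)
extra_layers = add_extras(extras['300'], 1024)
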

If you are unsure how to work out the size of the feature map after a convolutional layer, you can refer to the nn.Conv2d article listed in the references.
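The feature map sizes can also be worked out by hand. The sketch below is a minimal, pure-Python illustration (it does not use PyTorch) of the output-size formula documented for nn.Conv2d and nn.MaxPool2d, applied to the layers defined above. The helper names out_size and ssd300_feature_map_sizes are made up for this example, and the ceil_mode handling is simplified. It assumes a 300x300 input and the usual SSD300 choice of predicting from conv4_3, conv7, and the output of each 3x3 extra convolution.

```python
import math


def out_size(size, kernel, stride=1, padding=0, dilation=1, ceil_mode=False):
    """Spatial output size of a conv or max-pool layer along one dimension."""
    value = (size + 2 * padding - dilation * (kernel - 1) - 1) / stride + 1
    return math.ceil(value) if ceil_mode else math.floor(value)


def ssd300_feature_map_sizes(size=300):
    """Trace a square input through the layers and collect the sizes of the
    six feature maps that SSD300 is assumed to predict from."""
    sizes = []
    # VGG16 part: 3x3 convolutions with padding=1 keep the size,
    # so only the pooling layers change it.
    size = out_size(size, 2, stride=2)                  # 'M'
    size = out_size(size, 2, stride=2)                  # 'M'
    size = out_size(size, 2, stride=2, ceil_mode=True)  # 'C'
    sizes.append(size)                                  # conv4_3 output
    size = out_size(size, 2, stride=2)                  # 'M'
    size = out_size(size, 3, stride=1, padding=1)       # pool5
    size = out_size(size, 3, padding=6, dilation=6)     # conv6 (dilated)
    size = out_size(size, 1)                            # conv7
    sizes.append(size)
    # Extra layers: each pair is a 1x1 convolution followed by a 3x3 one.
    # The first two 3x3 convolutions use stride 2 and padding 1 ('S').
    for stride, padding in [(2, 1), (2, 1), (1, 0), (1, 0)]:
        size = out_size(size, 1)
        size = out_size(size, 3, stride=stride, padding=padding)
        sizes.append(size)
    return sizes


print(ssd300_feature_map_sizes())
```

For a 300x300 input this prints [38, 19, 10, 5, 3, 1], which shows why the 'C' pooling uses ceil_mode=True: without it the first feature map would be 37x37.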
After constructing the overall framework of SSD, let's try detecting objects with the open-source project https://github.com/bubbliiiing/ssd-pytorch.
Below are some tips on how to run this project to perform object detection.
Before training, put the annotation (XML) files in the Annotations folder under VOC2007 (inside the VOCdevkit folder), and place the image files in the JPEGImages folder under VOC2007 (inside the VOCdevkit folder).
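Before going further it can help to check that every image has an annotation. The following is a minimal sketch, not part of the project: the function name find_unpaired_images is hypothetical, and it assumes the folders are named Annotations and JPEGImages and that the images are .jpg files (both folder names are parameters in case your layout differs).

```python
from pathlib import Path


def find_unpaired_images(voc_root, annotation_dir="Annotations", image_dir="JPEGImages"):
    """Return the file stems of images that have no matching XML annotation."""
    root = Path(voc_root)
    xml_stems = {p.stem for p in (root / annotation_dir).glob("*.xml")}
    image_stems = {p.stem for p in (root / image_dir).glob("*.jpg")}
    return sorted(image_stems - xml_stems)


if __name__ == "__main__":
    # Assumed location of the dataset relative to the project root
    missing = find_unpaired_images("VOCdevkit/VOC2007")
    print("images without annotation:", missing)
```

An empty list means every image in JPEGImages has an XML file with the same name.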
Before training, run voc2ssd.py to generate the corresponding txt files.
Before running voc_annotation.py, change classes to the classes of your own dataset.
Running it generates the corresponding 2007_train.txt, in which each line holds the path of an image and the positions of its ground-truth boxes.
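To make the line format concrete, here is a small sketch of how one such line could be built from a VOC-style XML annotation. It is an illustration only, not the project's voc_annotation.py: the function name annotation_line, the class list, the sample XML, and the exact layout (image path, then xmin,ymin,xmax,ymax,class_id for each box, separated by spaces) are assumptions.

```python
import xml.etree.ElementTree as ET

CLASSES = ["fire"]  # hypothetical class list; replace with your own classes


def annotation_line(image_path, xml_text, classes=CLASSES):
    """Build one line: image path, then 'xmin,ymin,xmax,ymax,class_id' per box."""
    root = ET.fromstring(xml_text)
    parts = [image_path]
    for obj in root.iter("object"):
        name = obj.find("name").text
        if name not in classes:
            continue  # skip objects that are not in the class list
        box = obj.find("bndbox")
        coords = [int(float(box.find(tag).text))
                  for tag in ("xmin", "ymin", "xmax", "ymax")]
        parts.append(",".join(str(c) for c in coords) + "," + str(classes.index(name)))
    return " ".join(parts)


# Made-up annotation with a single box, used only to show the output format
SAMPLE_XML = """<annotation>
  <object><name>fire</name>
    <bndbox><xmin>48</xmin><ymin>30</ymin><xmax>195</xmax><ymax>220</ymax></bndbox>
  </object>
</annotation>"""

print(annotation_line("VOCdevkit/VOC2007/JPEGImages/0001.jpg", SAMPLE_XML))
```

With the sample above, the printed line is VOCdevkit/VOC2007/JPEGImages/0001.jpg 48,30,195,220,0.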
If your annotation (XML) files contain Chinese characters or invalid symbols, you may run into an error like this:
[Figure: error message]
It is highly recommended to change the code in voc_annotation.py as follows:
[Figure: modified code in voc_annotation.py]
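One common cause of this kind of problem is that the XML file is opened with the system default encoding instead of UTF-8. The sketch below assumes that is what is happening here; it is not a copy of the change made in voc_annotation.py. The function name read_annotation and the sample file are made up for illustration.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def read_annotation(xml_path):
    """Parse a VOC XML file, decoding it explicitly as UTF-8."""
    with open(xml_path, encoding="utf-8") as f:
        return ET.parse(f).getroot()


# Demo: write a small annotation containing Chinese characters and read it back
chinese_name = "\u706b\u7130"  # "flame" in Chinese
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.xml")
    with open(path, "w", encoding="utf-8") as f:
        f.write("<annotation><object><name>" + chinese_name + "</name></object></annotation>")
    root = read_annotation(path)
    print(root.find("object/name").text == chinese_name)
```

If the files were saved in a different encoding, pass that encoding to open instead.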
Lastly, replace the classes in voc_classes.txt with your own classes, and set num_classes to the number of categories plus one (the extra class is the background).
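As a quick check of the value to use, the sketch below counts the class names and adds one for the background. It assumes voc_classes.txt holds one class name per line; the function name count_classes and the two class names are hypothetical.

```python
def count_classes(classes_text):
    """Return (class_names, num_classes), where num_classes includes background."""
    names = [line.strip() for line in classes_text.splitlines() if line.strip()]
    return names, len(names) + 1


# Example content of a class file with two classes
names, num_classes = count_classes("fire\nsmoke\n")
print(names, num_classes)
```

With two classes in the file, num_classes should be set to 3.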
Then run train.py to train your network. After training, run predict.py to see the result.
[Figure: detection result]
If you want to detect fire in a video, try the following code:

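from ssd import SSD
from PIL import Image
import cv2
import numpy as np

ssd = SSD()
cap = cv2.VideoCapture('test2.mp4')
while True:
    # Read the next frame; ret is False when the video ends or cannot be read
    ret, frame = cap.read()
    if not ret:
        break
    # Convert from BGR (OpenCV) to RGB (PIL)
    frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    frame = Image.fromarray(np.uint8(frame))
    # Run detection on the frame and convert the returned image to an array
    frame = np.array(ssd.detect_image(frame))
    # Convert back to BGR for display
    frame = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)
    cv2.imshow('frame', frame)
    # Wait 40 ms between frames; press Esc to quit
    k = cv2.waitKey(40) & 0xFF
    if k == 27:
        break
# Release the video and close the display window
cap.release()
cv2.destroyAllWindows()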

All materials here are used for study purposes only. If this article infringes any rights, please contact me and I will delete it.
References:
https://blog.csdn.net/weixin_44791964/article/details/104839376
https://github.com/bubbliiiing/ssd-pytorch
nn.Conv2d
