A-Level Course, From Scratch (2): Python: Implementing the SSD Object Detection Algorithm in TensorFlow, 1-1 & tf.pad()

https://www.bilibili.com/video/av43996494

High-resolution original image

Diagram of the modified VGG

Explanation of the diagram

 

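1 Overall structure

The original VGG has 5 convolutional blocks followed by three fully connected layers.
SSD keeps the first 5 VGG blocks and follows them with convolutional layers 6 to 11.
In other words, conv6, conv7, conv8, conv9, conv10 and conv11 are added on top of the VGG base.
The feature maps used for detection come from conv4, conv7, conv8, conv9, conv10 and conv11.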

 

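2 Layers 6 and 7 in detail

The input is 300*300. FC6 and FC7 are replaced by convolutional layers: conv6 (3*3) and conv7 (1*1, as in the author's source in section 5).
pool5 is changed from stride 2 with a 2*2 window to stride 1 with pool_size=3*3.
Layer 6 is a dilated (atrous) convolution with a dilation rate of 6, which enlarges the receptive field (similar to morphological dilation).
Dilated convolution is used so that the receptive field grows without reducing the feature map size.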
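To make the effect of the dilation rate concrete, here is a minimal sketch of how far a dilated kernel reaches along one axis. The helper names are hypothetical, and the 19*19 input size for conv6 is an assumption taken from the feature map sizes listed later in these notes.

```python
def effective_kernel_size(kernel_size, dilation_rate):
    """Span, in input pixels, covered by one dilated kernel along one axis."""
    # There are (kernel_size - 1) gaps between taps; dilation widens each gap
    # from 1 pixel to dilation_rate pixels.
    return kernel_size + (kernel_size - 1) * (dilation_rate - 1)


def same_padding_output_size(input_size, stride):
    """Output size of a 'same'-padded convolution: ceil(input_size / stride)."""
    return -(-input_size // stride)


conv6_span = effective_kernel_size(3, 6)      # 3x3 kernel with dilation 6
conv6_out = same_padding_output_size(19, 1)   # assumed 19x19 input, stride 1
print(conv6_span, conv6_out)
```

The kernel still has only 3*3 weights, but it samples a much wider window, and with 'same' padding and stride 1 the spatial size is unchanged.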
 

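Remove the dropout layers and FC8, add conv7, conv8, conv9, conv10 and conv11, then fine-tune on the detection dataset.

SSD sampling: the feature maps used for detection come from 6 convolutional layers: conv4, conv7, conv8, conv9, conv10 and conv11.

Because conv4 sits early in the network, the norm of its feature map is large, so it is processed with an L2 normalization.
conv4 is passed through an L2 norm to reduce its norm (normalizing over the channel dimension only), and a trainable scale factor (gamma) is then applied.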
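The normalization described above can be sketched for a single pixel in plain Python. This is an illustration only: the function name, the 2-channel pixel and the gamma value of 20 are assumptions, not values taken from the video.

```python
import math


def l2_normalize_channels(pixel, gamma, epsilon=1e-12):
    """Scale one pixel's channel vector to unit L2 norm, then multiply by gamma."""
    squared_sum = sum(v * v for v in pixel)
    # Same guard as tf.nn.l2_normalize: divide by sqrt(max(sum of squares, epsilon)).
    inv_norm = 1.0 / math.sqrt(max(squared_sum, epsilon))
    return [g * v * inv_norm for v, g in zip(pixel, gamma)]


pixel = [3.0, 4.0]      # hypothetical 2-channel activation at one position
gamma = [20.0, 20.0]    # hypothetical per-channel scale (trainable in the network)
scaled = l2_normalize_channels(pixel, gamma)
print(scaled)
```

After this step every pixel of conv4 has the same norm (gamma), regardless of how large the raw activations were.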

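The extracted feature maps then have sizes (38,38), (19,19), (10,10), (5,5), (3,3) and (1,1).

Dilated convolution illustration:
The dilation rates shown are 1, 2 and 3.
?=? The third value is in doubt.

Square prior boxes at each anchor point: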
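These six sizes can be checked by tracking only the layers that change the spatial size. The sketch below assumes 'same' padding for every pooling layer and explicit 1-pixel padding before the stride-2 convolutions in conv8 and conv9, as in the author's source in section 5; conv_out is a hypothetical helper.

```python
import math


def conv_out(size, kernel, stride=1, padding="same", pad=0):
    """Spatial output size of a conv or pooling layer along one axis."""
    size += 2 * pad                                   # explicit zero padding on both sides
    if padding == "same":
        return math.ceil(size / stride)
    return math.ceil((size - kernel + 1) / stride)    # 'valid'


size = 300
for _ in range(3):                            # pool1, pool2, pool3: 2x2 window, stride 2
    size = conv_out(size, 2, 2)
sizes = [size]                                # conv4_3
size = conv_out(size, 2, 2)                   # pool4
size = conv_out(size, 3, 1)                   # pool5: 3x3 window, stride 1
sizes.append(size)                            # conv7 (conv6 and conv7 keep the size)
size = conv_out(size, 3, 2, "valid", pad=1)   # conv8: pad 1, 3x3, stride 2
sizes.append(size)
size = conv_out(size, 3, 2, "valid", pad=1)   # conv9: pad 1, 3x3, stride 2
sizes.append(size)
size = conv_out(size, 3, 1, "valid")          # conv10: 3x3, stride 1, no padding
sizes.append(size)
size = conv_out(size, 3, 1, "valid")          # conv11: 3x3, stride 1, no padding
sizes.append(size)
print(sizes)
```

The 3*3 stride-1 'same' convolutions inside each VGG block are left out because they do not change the size.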
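As a rough sketch of how the prior boxes at one location are shaped, the following mirrors the arithmetic of ssd_anchor_layer in the author's source in section 5, with anchor_sizes and anchor_ratios taken from that listing. The function name prior_box_shapes is hypothetical.

```python
import math


def prior_box_shapes(anchor_size, anchor_ratios, img_size=300):
    """Return (height, width) pairs, relative to the image, for one location."""
    s_min, s_max = anchor_size
    shapes = [
        (s_min / img_size, s_min / img_size),            # small square
        (math.sqrt(s_min * s_max) / img_size,) * 2,      # larger square
    ]
    for ratio in anchor_ratios:                          # ratio = width / height
        shapes.append((s_min / img_size / math.sqrt(ratio),
                       s_min / img_size * math.sqrt(ratio)))
    return shapes


# First detection layer (conv4): sizes [21, 45], ratios [2, 0.5] -> 4 boxes.
shapes = prior_box_shapes((21.0, 45.0), [2, 0.5])
for h, w in shapes:
    print(round(h, 4), round(w, 4))
```

Each location gets two squares plus one rectangle per aspect ratio, which is where the per-layer box counts of 4 or 6 come from.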

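3 Explanation of the tf code

Because layer 4 sits early in the network, it is L2-normalized over the channel dimension only.

After features are extracted from layers 4, 7, 8, 9, 10 and 11, each goes through two convolutions:
one produces the class scores,
one produces the anchor box locations.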
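The two prediction convolutions determine how many output channels each detection layer needs. A small sketch, using the feature map sizes, boxes per location and class count from the author's source in section 5 (the function name is hypothetical):

```python
feature_map_sizes = [38, 19, 10, 5, 3, 1]
boxes_per_location = [4, 6, 6, 6, 4, 4]
num_classes = 21  # 20 VOC classes plus background


def prediction_summary(sizes, box_nums, classes):
    """Per layer: (location channels, class channels, number of prior boxes)."""
    summary = []
    for size, box_num in zip(sizes, box_nums):
        loc_channels = box_num * 4          # 4 offsets per box
        cls_channels = box_num * classes    # one score per class per box
        summary.append((loc_channels, cls_channels, size * size * box_num))
    return summary


summary = prediction_summary(feature_map_sizes, boxes_per_location, num_classes)
total_boxes = sum(n for _, _, n in summary)
print(summary)
print(total_boxes)
```

The per-layer box counts match the n_boxes list in section 5, and their sum is the total number of prior boxes.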

The configuration function is not used yet, so
def __init__(self):
    just contains pass.
PS: this is really the constructor.

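tf.layers.conv2d()
versus tf.nn.conv2d()

tf.nn.conv2d requires you to define the filter tensor, the bias and so on yourself; tf.layers.conv2d creates them for you.

tf.nn.conv2d takes far fewer parameters.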

tf.nn.conv2d

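Reference blog: https://blog.csdn.net/zuolixiangfisher/article/details/80528989

Signature
tf.nn.conv2d(input, filter, strides, padding, use_cudnn_on_gpu=None, data_format=None, name=None)

Parameters:
input: the images to convolve, as a tensor of shape [batch, in_height, in_width, in_channel], where batch is the number of images, in_height the image height, in_width the image width, and in_channel the number of channels: 1 for grayscale, 3 for color. Other values occur when the input is a feature map from an earlier layer, where in_channel is the number of channels that layer produced.
filter: the convolution kernel, also a tensor, of shape [filter_height, filter_width, in_channel, out_channels], where filter_height is the kernel height, filter_width the kernel width, in_channel the number of input channels (it must match in_channel of input), and out_channels the number of kernels.
strides: the step along each dimension of the input, a 1-D vector [1, stride, stride, 1]; the first and last entries must be 1.
padding: a string, either "SAME" or "VALID", which controls how borders are handled. "SAME" zero-pads around the input when the kernel would run past the border; "VALID" applies no padding.
use_cudnn_on_gpu: bool, whether to use cuDNN acceleration; defaults to true.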

tf.layers.conv2d

Reference blog: https://blog.csdn.net/gqixf/article/details/80519912

conv2d(inputs, filters, kernel_size, 
    strides=(1, 1), 
    padding='valid', 
    data_format='channels_last', 
    dilation_rate=(1, 1),
    activation=None, 
    use_bias=True, 
    kernel_initializer=None,
    bias_initializer=tf.zeros_initializer(),
    kernel_regularizer=None,
    bias_regularizer=None, 
    activity_regularizer=None, 
    kernel_constraint=None, 
    bias_constraint=None, 
    trainable=True, 
    name=None,
    reuse=None)

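Purpose

Functional interface for a 2D convolution layer. The layer creates a convolution kernel that is convolved with the input to produce an output tensor. If use_bias is True (and a bias_initializer is provided), a bias vector is added to the output. Finally, if activation is not None, the activation function is applied to the output as well.

Parameters

inputs: the input Tensor.

filters: an integer, the dimensionality of the output space (that is, the number of convolution filters).

kernel_size: an integer, or a tuple/list of two integers, giving the height and width of the convolution window. A single integer means the height and width are equal.

strides: an integer, or a tuple/list of two integers, giving the vertical and horizontal strides of the convolution. A single integer means both strides are equal. Note that strides not equal to 1 and dilation_rate not equal to 1 cannot be used together.

padding: "valid" or "same" (case-insensitive). "valid" drops any block smaller than the kernel; "same" zero-pads such blocks. The output size for "valid" is

ceil((W - F + 1) / S)

where W is the input size (height or width), F is the filter size, S is the stride, and ceil rounds up.

data_format: channels_last or channels_first, the ordering of the input dimensions.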
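The "valid" output size can be checked by brute force: count every kernel position that fits entirely inside the input and compare the count with ceil((W - F + 1) / S). This is a plain-Python sketch with hypothetical helper names; it does not call TensorFlow.

```python
import math


def count_valid_windows(input_size, filter_size, stride):
    """Count kernel start positions whose window lies fully inside the input."""
    return len(range(0, input_size - filter_size + 1, stride))


def valid_output_size(input_size, filter_size, stride):
    """Closed-form 'valid' output size: ceil((W - F + 1) / S)."""
    return math.ceil((input_size - filter_size + 1) / stride)


# Compare the two for a range of input sizes, filter sizes and strides.
mismatches = [
    (w, f, s)
    for w in range(3, 40)
    for f in range(1, 4)
    for s in range(1, 4)
    if count_valid_windows(w, f, s) != valid_output_size(w, f, s)
]
print(mismatches)
print(valid_output_size(21, 3, 2))
```

For "same" padding the corresponding size is ceil(W / S), independent of the filter size.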

tf.pad()

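Reference blog: https://msd.misuland.com/pd/2884250171976192248

Example 1:


Along dimension 1, 1 row of zeros is added both above and below.
Along dimension 2, 1 column of zeros is added on the left and 2 columns on the right.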
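The padding pattern just described corresponds to paddings=[[1, 1], [1, 2]]. Since the original example code is not shown here, the following is a plain-Python sketch of what constant-mode padding does to a 2-D input; pad2d_constant and the matrix t are hypothetical, and tf.pad itself is not called.

```python
def pad2d_constant(matrix, paddings, value=0):
    """Pad a 2-D list as paddings=[[top, bottom], [left, right]] describes."""
    (top, bottom), (left, right) = paddings
    width = len(matrix[0]) + left + right
    padded = [[value] * width for _ in range(top)]          # rows added above
    for row in matrix:
        padded.append([value] * left + list(row) + [value] * right)
    padded.extend([value] * width for _ in range(bottom))   # rows added below
    return padded


t = [[1, 2, 3],
     [4, 5, 6]]
result = pad2d_constant(t, [[1, 1], [1, 2]])
for row in result:
    print(row)
```

Each inner pair of paddings gives the amount added before and after along one dimension, which is why the pad2d helper in section 4 uses [[0, 0], [pad, pad], [pad, pad], [0, 0]] to pad only height and width.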

Example 2:

Example 3:

a = [[[1, 1], [2, 2]],
     [[3, 3], [4, 4]]]

padding

PS: I believe the answer given in the blog is wrong, so it is not reproduced here.

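4 Code implementation, explained as a class

Correction (the author made a typo in the video; fixed here):
tf.Variable_scope should be tf.variable_scope

Open question: the comment says conv7 is 1x1x1024, but the video's code passes k_size=[3, 3]:
# b7 conv7: 1x1x1024 => 2nd detection layer
           net = self.conv2d(net, filter=1024, k_size=[3, 3], scope='conv7')  ???
           # => 1024 filters, but the kernel is not [1, 1]
The author's downloaded source in section 5 uses [1, 1] for conv7, which matches the comment.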

Runtime error:

As first typed from the video, the code failed with:
TypeError: conv2d() got  an unexpected keyword argument 'input'
The cause is the keyword passed in the conv2d wrapper: tf.layers.conv2d takes inputs, not input. The listing below has this corrected.

# All the code from this video
#!/usr/bin/python
# -*- coding: utf-8 -*-
# Creation Date: 2019/7/10
import tensorflow as tf
import numpy as np
import cv2


class ssd(object):

    def __init__(self):
        pass  # skipped for now

    # ===>L2 normalization<===
    def l2norm(self, x, scale, trainable=True, scope='L2Normalization'):
        n_channels = x.get_shape().as_list()[-1]  # number of channels: take the shape, convert to a list, take the last entry
        l2_norm = tf.nn.l2_normalize(x, dim=[3], epsilon=1e-12)  # normalize each pixel over the channel axis only
        with tf.variable_scope(scope):
            gamma = tf.get_variable("gamma", shape=[n_channels, ], dtype=tf.float32,
                                    initializer=tf.constant_initializer(scale),
                                    trainable=trainable)
            return l2_norm * gamma

    # ===>Building blocks<===
    # conv2d, max_pool2d, pad2d, dropout
    # Convolution wrapper: 1 input, 2 number of filters, 3 kernel size, 4 stride, 5 padding, 6 dilation, 7 activation, 8 name
    def conv2d(self, x, filter,  # input x, number of filters
               k_size, stride=[1, 1],  # kernel size k_size, stride
               padding='same', dilation=[1, 1],  # padding; dilation rate, where 1 means a standard convolution
               activation=tf.nn.relu, scope='conv2d'):  # ReLU activation, layer name scope
        # the keyword is inputs; passing input= raises the TypeError quoted above
        return tf.layers.conv2d(inputs=x, filters=filter, kernel_size=k_size,
                                strides=stride, padding=padding, dilation_rate=dilation,
                                name=scope, activation=activation)

    def max_pool2d(self, x, pool_size, stride, scope='max_pool2d'):
        # padding is 'same', as in the author's source in section 5; 'valid' would give 37x37 instead of 38x38 at conv4_3
        return tf.layers.max_pooling2d(inputs=x, pool_size=pool_size, strides=stride, padding='same', name=scope)

    # Explicit padding for the stride-2 layers 8 and 9. From layer 6 onward, pad manually instead of relying on the built-in padding.
    def pad2d(self, x, pad):
        return tf.pad(x, paddings=[[0, 0], [pad, pad], [pad, pad], [0, 0]])

    def dropout(self, x, d_rate=0.5):
        return tf.layers.dropout(inputs=x, rate=d_rate)

    # ===>Network architecture<===
    def set_net(self):

        check_points = {}  # dict holding the detection feature layers, iterated over later

        x = tf.placeholder(dtype=tf.float32, shape=[None, 300, 300, 3])
        with tf.variable_scope('ssd_300_vgg'):
            # ===>First 5 VGG blocks<===
            # b1
            net = self.conv2d(x, filter=64, k_size=[3, 3], scope='conv1_1')  # 64 3*3 kernels, default stride 1, standard convolution
            net = self.conv2d(net, 64, [3, 3], scope='conv1_2')  # 64 3*3 kernels, default stride 1
            net = self.max_pool2d(net, pool_size=[2, 2], stride=[2, 2], scope='pool1')  # 2*2 pooling window, stride 2 (the usual choice for pooling)
            # b2
            net = self.conv2d(net, filter=128, k_size=[3, 3], scope='conv2_1')
            net = self.conv2d(net, 128, [3, 3], scope='conv2_2')
            net = self.max_pool2d(net, pool_size=[2, 2], stride=[2, 2], scope='pool2')
            # b3
            net = self.conv2d(net, filter=256, k_size=[3, 3], scope='conv3_1')
            net = self.conv2d(net, 256, [3, 3], scope='conv3_2')
            net = self.conv2d(net, 256, [3, 3], scope='conv3_3')
            net = self.max_pool2d(net, pool_size=[2, 2], stride=[2, 2], scope='pool3')
            # b4 => 1st detection layer
            net = self.conv2d(net, filter=512, k_size=[3, 3], scope='conv4_1')
            net = self.conv2d(net, 512, [3, 3], scope='conv4_2')
            net = self.conv2d(net, 512, [3, 3], scope='conv4_3')
            check_points['block4'] = net
            net = self.max_pool2d(net, pool_size=[2, 2], stride=[2, 2], scope='pool4')

            # b5: the key part, this is where the network departs from VGG
            net = self.conv2d(net, filter=512, k_size=[3, 3], scope='conv5_1')
            net = self.conv2d(net, 512, [3, 3], scope='conv5_2')
            net = self.conv2d(net, 512, [3, 3], scope='conv5_3')
            net = self.max_pool2d(net, pool_size=[3, 3], stride=[1, 1], scope='pool5')  # => 3*3 pooling window, stride changed to 1*1

            # ===>Convolutional layers replacing the VGG fully connected layers<===
            # b6 conv6: 3x3x1024-d6
            net = self.conv2d(net, filter=1024, k_size=[3, 3], dilation=[6, 6], scope='conv6')
            # => 1024 filters, dilation=[6, 6]

            # b7 conv7: 1x1x1024 => 2nd detection layer
            net = self.conv2d(net, filter=1024, k_size=[1, 1], scope='conv7')
            # => 1024 filters, 1x1 kernel, matching the comment above and the author's source in section 5 (the video typed [3, 3])
            check_points['block7'] = net

            # b8 conv8_1: 1x1x256; conv8_2: 3x3x512-s2-valid => 3rd detection layer
            net = self.conv2d(net, 256, [1, 1], scope='conv8_1x1')  # => 256 filters, 1x1 kernel
            net = self.conv2d(self.pad2d(net, 1), 512, [3, 3], [2, 2], scope='conv8_3x3', padding='valid')
            # => 512 filters, 3x3 kernel, stride 2, 'valid'
            check_points['block8'] = net

            # b9 conv9_1: 1x1x128; conv9_2: 3x3x256-s2-valid => 4th detection layer
            net = self.conv2d(net, 128, [1, 1], scope='conv9_1x1')  # => 128 filters, 1x1 kernel
            net = self.conv2d(self.pad2d(net, 1), 256, [3, 3], [2, 2], scope='conv9_3x3', padding='valid')
            # => 256 filters, 3x3 kernel, stride 2x2, 'valid'
            check_points['block9'] = net

            # b10 conv10_1: 1x1x128; conv10_2: 3x3x256-s1-valid => 5th detection layer
            net = self.conv2d(net, 128, [1, 1], scope='conv10_1x1')  # => 128 filters, 1x1 kernel
            net = self.conv2d(net, 256, [3, 3], scope='conv10_3x3', padding='valid')
            # => 256 filters, 'valid'
            check_points['block10'] = net

            # b11 conv11_1: 1x1x128; conv11_2: 3x3x256-s1-valid => 6th detection layer
            net = self.conv2d(net, 128, [1, 1], scope='conv11_1x1')  # => 128 filters, 1x1 kernel
            net = self.conv2d(net, 256, [3, 3], scope='conv11_3x3', padding='valid')
            # => 256 filters, 'valid'
            check_points['block11'] = net
        print(check_points)


if __name__ == '__main__':
    sd = ssd()
    sd.set_net()

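The last step is two convolutions:
one convolution produces the class scores,
one convolution produces the anchor box locations (x, y, w, h).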
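The location convolution does not output boxes directly; it outputs offsets relative to each prior box. A minimal single-box sketch of the decoding step, following the arithmetic of ssd_decode and the prior_scaling values in the author's source in section 5 (decode_box and the example anchor are hypothetical):

```python
import math


def decode_box(offsets, anchor, prior_scaling=(0.1, 0.1, 0.2, 0.2)):
    """Turn predicted offsets (dx, dy, dw, dh) into (ymin, xmin, ymax, xmax)."""
    dx, dy, dw, dh = offsets
    y_a, x_a, h_a, w_a = anchor                  # prior box centre and size, relative to the image
    cx = dx * w_a * prior_scaling[0] + x_a       # shift the centre by a fraction of the prior size
    cy = dy * h_a * prior_scaling[1] + y_a
    w = w_a * math.exp(dw * prior_scaling[2])    # scale the size exponentially
    h = h_a * math.exp(dh * prior_scaling[3])
    return (cy - h / 2.0, cx - w / 2.0, cy + h / 2.0, cx + w / 2.0)


anchor = (0.5, 0.5, 0.2, 0.4)                    # hypothetical prior: centre (0.5, 0.5), h=0.2, w=0.4
unchanged = decode_box((0.0, 0.0, 0.0, 0.0), anchor)
shifted = decode_box((1.0, 0.0, 0.0, 0.0), anchor)
print(unchanged)
print(shifted)
```

With all offsets at zero the decoded box is the prior box itself; a dx of 1 moves the centre right by 0.1 times the prior width.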

Note: because pretrained weights are loaded, the layer names (conv1_1 and so on) must not be changed; everything else can be.

 

5 The author's source code, downloaded from the web

import tensorflow as tf
import numpy as np
import cv2



class ssd(object):

    def __init__(self):
        self.feature_map_size = [[38, 38], [19, 19], [10, 10], [5, 5], [3, 3], [1, 1]]
        self.classes = ["aeroplane", "bicycle", "bird", "boat", "bottle",
           "bus", "car", "cat", "chair", "cow", "diningtable",
           "dog", "horse", "motorbike", "person", "pottedplant",
           "sheep", "sofa", "train", "tvmonitor"]
        self.feature_layers = ['block4', 'block7', 'block8', 'block9', 'block10', 'block11']
        self.img_size = (300,300)
        self.num_classes = 21
        self.boxes_len = [4,6,6,6,4,4]
        self.isL2norm = [True,False,False,False,False,False]
        self.anchor_sizes = [[21., 45.], [45., 99.], [99., 153.],[153., 207.],[207., 261.], [261., 315.]]
        self.anchor_ratios = [[2, .5], [2, .5, 3, 1. / 3], [2, .5, 3, 1. / 3],
                         [2, .5, 3, 1. / 3], [2, .5], [2, .5]]
        self.anchor_steps = [8, 16, 32, 64, 100, 300]
        self.prior_scaling = [0.1, 0.1, 0.2, 0.2]  # scaling factors applied to the prior-box offsets
        self.n_boxes = [5776,2166,600,150,36,4]  # 8732 in total
        self.threshold = 0.2

###########    SSD network architecture
    def l2norm(self,x, trainable=True, scope='L2Normalization'):
        n_channels = x.get_shape().as_list()[-1]  # number of channels
        l2_norm = tf.nn.l2_normalize(x, dim=[3], epsilon=1e-12)  # normalize each pixel over the channel axis only
        with tf.variable_scope(scope):
            gamma = tf.get_variable("gamma", shape=[n_channels, ], dtype=tf.float32,
                                    trainable=trainable)
        return l2_norm * gamma

    def conv2d(self,x,filter,k_size,stride=[1,1],padding='same',dilation=[1,1],activation=tf.nn.relu,scope='conv2d'):
        return tf.layers.conv2d(inputs=x, filters=filter, kernel_size=k_size,
                            strides=stride, dilation_rate=dilation, padding=padding,
                            name=scope, activation=activation)

    def max_pool2d(self,x, pool_size, stride, scope='max_pool2d'):
        return tf.layers.max_pooling2d(inputs=x, pool_size=pool_size, strides=stride, name=scope, padding='same')

    def pad2d(self,x, pad):
        return tf.pad(x, paddings=[[0, 0], [pad, pad], [pad, pad], [0, 0]])

    def dropout(self,x, d_rate=0.5):
        return tf.layers.dropout(inputs=x, rate=d_rate)

    def ssd_prediction(self, x, num_classes, box_num, isL2norm, scope='multibox'):
        reshape = [-1] + x.get_shape().as_list()[1:-1]  # [-1, H, W]: the shape without its first (batch) and last (channel) entries
        with tf.variable_scope(scope):
            if isL2norm:
                x = self.l2norm(x)
                print(x)
            # predict locations --> box centre and size (regression)
            location_pred = self.conv2d(x, filter=box_num * 4, k_size=[3,3], activation=None,scope='conv_loc')
            location_pred = tf.reshape(location_pred, reshape + [box_num, 4])
            # predict classes --> classification (softmax)
            class_pred = self.conv2d(x, filter=box_num * num_classes, k_size=[3,3], activation=None, scope='conv_cls')
            class_pred = tf.reshape(class_pred, reshape + [box_num, num_classes])
            print(location_pred, class_pred)
            return location_pred, class_pred



    def set_net(self):

        check_points = {}
        predictions = []
        locations = []

        x = tf.placeholder(dtype=tf.float32,shape=[None,300,300,3])
        with tf.variable_scope('ssd_300_vgg'):
            #b1
            net = self.conv2d(x,filter=64,k_size=[3,3],scope='conv1_1')
            net = self.conv2d(net,64,[3,3],scope='conv1_2')
            net = self.max_pool2d(net,pool_size=[2,2],stride=[2,2],scope='pool1')
            #b2
            net = self.conv2d(net, filter=128, k_size=[3, 3], scope='conv2_1')
            net = self.conv2d(net, 128, [3, 3], scope='conv2_2')
            net = self.max_pool2d(net, pool_size=[2, 2], stride=[2, 2], scope='pool2')
            #b3
            net = self.conv2d(net, filter=256, k_size=[3, 3], scope='conv3_1')
            net = self.conv2d(net, 256, [3, 3], scope='conv3_2')
            net = self.conv2d(net, 256, [3, 3], scope='conv3_3')
            net = self.max_pool2d(net, pool_size=[2, 2], stride=[2, 2], scope='pool3')
            #b4
            net = self.conv2d(net, filter=512, k_size=[3, 3], scope='conv4_1')
            net = self.conv2d(net, 512, [3, 3], scope='conv4_2')
            net = self.conv2d(net, 512, [3, 3], scope='conv4_3')
            check_points['block4'] = net
            net = self.max_pool2d(net, pool_size=[2, 2], stride=[2, 2], scope='pool4')
            #b5
            net = self.conv2d(net, filter=512, k_size=[3, 3], scope='conv5_1')
            net = self.conv2d(net, 512, [3, 3], scope='conv5_2')
            net = self.conv2d(net, 512, [3, 3], scope='conv5_3')
            net = self.max_pool2d(net, pool_size=[3, 3], stride=[1, 1], scope='pool5')
            #b6
            net = self.conv2d(net,1024,[3,3],dilation=[6,6],scope='conv6')
            #b7
            net = self.conv2d(net,1024,[1,1],scope='conv7')
            check_points['block7'] = net
            #b8
            net = self.conv2d(net,256,[1,1],scope='conv8_1x1')
            net = self.conv2d(self.pad2d(net,1),512,[3,3],[2,2],scope='conv8_3x3',padding='valid')
            check_points['block8'] = net
            #b9
            net = self.conv2d(net, 128, [1, 1], scope='conv9_1x1')
            net = self.conv2d(self.pad2d(net,1), 256, [3, 3], [2, 2], scope='conv9_3x3', padding='valid')
            check_points['block9'] = net
            #b10
            net = self.conv2d(net, 128, [1, 1], scope='conv10_1x1')
            net = self.conv2d(net, 256, [3, 3], scope='conv10_3x3', padding='valid')
            check_points['block10'] = net
            #b11
            net = self.conv2d(net, 128, [1, 1], scope='conv11_1x1')
            net = self.conv2d(net, 256, [3, 3], scope='conv11_3x3', padding='valid')
            check_points['block11'] = net
            for i,j in enumerate(self.feature_layers):
                loc,cls = self.ssd_prediction(
                                    x = check_points[j],
                                    num_classes = self.num_classes,
                                    box_num = self.boxes_len[i],
                                    isL2norm = self.isL2norm[i],
                                    scope = j + '_box'
                                    )
                predictions.append(tf.nn.softmax(cls))
                locations.append(loc)
            return locations,predictions,x

###########    end of SSD network architecture

##########    prior boxes

    # generate the prior boxes
    def ssd_anchor_layer(self,img_size,feature_map_size,anchor_size,anchor_ratio,anchor_step,box_num,offset=0.5):

        y,x = np.mgrid[0:feature_map_size[0],0:feature_map_size[1]]

        y = (y.astype(np.float32) + offset) * anchor_step /img_size[0]
        x = (x.astype(np.float32) + offset) * anchor_step /img_size[1]

        y = np.expand_dims(y,axis=-1)
        x = np.expand_dims(x,axis=-1)
        # compute h and w for the two boxes with aspect ratio 1

        h = np.zeros((box_num,),np.float32)
        w = np.zeros((box_num,),np.float32)

        h[0] = anchor_size[0] /img_size[0]
        w[0] = anchor_size[0] /img_size[1]
        h[1] = (anchor_size[0] * anchor_size[1]) ** 0.5 / img_size[0]
        w[1] = (anchor_size[0] * anchor_size[1]) ** 0.5 / img_size[1]


        for i,j in enumerate(anchor_ratio):
            h[i + 2] = anchor_size[0] / img_size[0] / (j ** 0.5)
            w[i + 2] = anchor_size[0] / img_size[1] * (j ** 0.5)

        return y,x,h,w

    # decode the network output into boxes
    def ssd_decode(self,location,box,prior_scaling):
        y_a, x_a, h_a, w_a = box

        cx = location[:, :, :, :, 0] * w_a * prior_scaling[0] + x_a  #########################
        cy = location[:, :, :, :, 1] * h_a * prior_scaling[1] + y_a
        w = w_a * tf.exp(location[:, :, :, :, 2] * prior_scaling[2])
        h = h_a * tf.exp(location[:, :, :, :, 3] * prior_scaling[3])
        print(cx, cy, w, h)

        bboxes = tf.stack([cy - h / 2.0, cx - w / 2.0, cy + h / 2.0, cx + w / 2.0], axis=-1)

        return bboxes


    # filter the prior boxes by score
    def choose_anchor_boxes(self, predictions, anchor_box, n_box):
        anchor_box = tf.reshape(anchor_box, [n_box, 4])
        prediction = tf.reshape(predictions, [n_box, 21])
        prediction = prediction[:, 1:]
        classes = tf.argmax(prediction, axis=1) + 1
        scores = tf.reduce_max(prediction, axis=1)


        filter_mask = scores > self.threshold
        classes = tf.boolean_mask(classes, filter_mask)
        scores = tf.boolean_mask(scores, filter_mask)
        anchor_box = tf.boolean_mask(anchor_box, filter_mask)

        return classes, scores, anchor_box

########## end of prior boxes

######### post-processing (sorting, IoU, NMS); labelled "training" in the original

    def bboxes_sort(self,classes, scores, bboxes, top_k=400):
        idxes = np.argsort(-scores)
        classes = classes[idxes][:top_k]
        scores = scores[idxes][:top_k]
        bboxes = bboxes[idxes][:top_k]
        return classes, scores, bboxes

    # compute IoU
    def bboxes_iou(self,bboxes1, bboxes2):
        bboxes1 = np.transpose(bboxes1)
        bboxes2 = np.transpose(bboxes2)

        # intersection of the two boxes: its top-left corner is the max of the two boxes' corners, its bottom-right corner is the min
        int_ymin = np.maximum(bboxes1[0], bboxes2[0])
        int_xmin = np.maximum(bboxes1[1], bboxes2[1])
        int_ymax = np.minimum(bboxes1[2], bboxes2[2])
        int_xmax = np.minimum(bboxes1[3], bboxes2[3])

        # width and height of the intersection: if the boxes do not overlap these come out negative, so clamp them at 0
        int_h = np.maximum(int_ymax - int_ymin, 0.)
        int_w = np.maximum(int_xmax - int_xmin, 0.)

        # compute IoU
        int_vol = int_h * int_w  # intersection area
        vol1 = (bboxes1[2] - bboxes1[0]) * (bboxes1[3] - bboxes1[1])  # area of bboxes1
        vol2 = (bboxes2[2] - bboxes2[0]) * (bboxes2[3] - bboxes2[1])  # area of bboxes2
        iou = int_vol / (vol1 + vol2 - int_vol)  # IoU = intersection / union
        return iou

    # NMS (non-maximum suppression)
    def bboxes_nms(self,classes, scores, bboxes, nms_threshold=0.5):
        keep_bboxes = np.ones(scores.shape, dtype=bool)  # np.bool was removed in recent NumPy releases
        for i in range(scores.size - 1):
            if keep_bboxes[i]:
                overlap = self.bboxes_iou(bboxes[i], bboxes[(i + 1):])
                keep_overlap = np.logical_or(overlap < nms_threshold, classes[(i + 1):] != classes[i])
                keep_bboxes[(i + 1):] = np.logical_and(keep_bboxes[(i + 1):], keep_overlap)
        idxes = np.where(keep_bboxes)
        return classes[idxes], scores[idxes], bboxes[idxes]


######## end of post-processing

    def handle_img(self,img_path):
        means = np.array((123., 117., 104.))
        self.img = cv2.imread(img_path)
        img = np.expand_dims(cv2.resize(cv2.cvtColor(self.img, cv2.COLOR_BGR2RGB) - means,self.img_size),axis=0)
        return img


    def draw_rectangle(self,img, classes, scores, bboxes, colors, thickness=2):
        shape = img.shape
        for i in range(bboxes.shape[0]):
            bbox = bboxes[i]
            # color = colors[classes[i]]
            p1 = (int(bbox[0] * shape[0]), int(bbox[1] * shape[1]))
            p2 = (int(bbox[2] * shape[0]), int(bbox[3] * shape[1]))
            cv2.rectangle(img, p1[::-1], p2[::-1], colors[0], thickness)
            # Draw text...
            s = '%s/%.3f' % (self.classes[classes[i] - 1], scores[i])
            p1 = (p1[0] - 5, p1[1])
            cv2.putText(img, s, p1[::-1], cv2.FONT_HERSHEY_DUPLEX, 0.5, colors[1], 1)
        cv2.namedWindow("img", 0)
        cv2.resizeWindow("img", 640, 480)
        cv2.imshow('img', img)
        cv2.waitKey(0)
        cv2.destroyAllWindows()

    def run_this(self,locations,predictions):

        layers_anchors = []
        classes_list = []
        scores_list = []
        bboxes_list = []
        for i, s in enumerate(self.feature_map_size):
            anchor_bboxes = self.ssd_anchor_layer(self.img_size, s,
                                                  self.anchor_sizes[i],
                                                  self.anchor_ratios[i],
                                                  self.anchor_steps[i],
                                                  self.boxes_len[i])
            layers_anchors.append(anchor_bboxes)
        for i in range(len(predictions)):
            d_box = self.ssd_decode(locations[i], layers_anchors[i], self.prior_scaling)
            cls, sco, box = self.choose_anchor_boxes(predictions[i], d_box, self.n_boxes[i])
            classes_list.append(cls)
            scores_list.append(sco)
            bboxes_list.append(box)
        classes = tf.concat(classes_list, axis=0)
        scores = tf.concat(scores_list, axis=0)
        bboxes = tf.concat(bboxes_list, axis=0)
        return classes,scores,bboxes


'''
Only this line needs to change:
img = sd.handle_img('tetst.jpg')
Put in the image you want to run prediction on.
'''


if __name__ == '__main__':
    sd = ssd()
    locations, predictions, x = sd.set_net()
    classes, scores, bboxes = sd.run_this(locations, predictions)
    sess = tf.Session()
    ckpt_filename = 'ssd_vgg_300_weights.ckpt'
    sess.run(tf.global_variables_initializer())
    saver = tf.train.Saver()
    saver.restore(sess, ckpt_filename)
    img = sd.handle_img('tetst.jpg')

    rclasses, rscores, rbboxes = sess.run([classes, scores, bboxes], feed_dict={x: img})
    rclasses, rscores, rbboxes = sd.bboxes_sort(rclasses, rscores, rbboxes)

    rclasses, rscores, rbboxes = sd.bboxes_nms(rclasses, rscores, rbboxes)

    sd.draw_rectangle(sd.img,rclasses,rscores,rbboxes,[[0,0,255],[255,0,0]])
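
The IoU and NMS steps in the listing above can be followed more easily on a toy example. The sketch below is a plain-Python restatement of the same greedy, per-class idea, not the author's NumPy code; the function names and the three example boxes are hypothetical, and boxes with zero area are not handled.

```python
def iou(box_a, box_b):
    """IoU of two boxes given as (ymin, xmin, ymax, xmax)."""
    inter_h = max(min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]), 0.0)
    inter_w = max(min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]), 0.0)
    inter = inter_h * inter_w
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)


def nms(classes, scores, boxes, threshold=0.5):
    """Greedy per-class NMS; returns the indices of kept boxes, best score first."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    keep = []
    for i in order:
        # Keep box i unless a higher-scoring kept box of the same class overlaps it too much.
        if all(classes[i] != classes[k] or iou(boxes[i], boxes[k]) < threshold
               for k in keep):
            keep.append(i)
    return keep


boxes = [(0.0, 0.0, 1.0, 1.0), (0.0, 0.0, 1.0, 0.9), (2.0, 2.0, 3.0, 3.0)]
scores = [0.9, 0.8, 0.7]
same_class = nms([1, 1, 1], scores, boxes)
mixed_class = nms([1, 2, 1], scores, boxes)
print(same_class, mixed_class)
```

The second box overlaps the first heavily, so it is suppressed only when both carry the same class label; boxes of different classes never suppress each other.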