Faster-RCNN: Introduction and Loss Functions

Latest recommended article published on 2023-04-23 17:17:51

qq_32790593

Article link: https://blog.csdn.net/qq_32790593/article/details/113317084


[Faster RCNN] Understanding the loss functions: https://blog.csdn.net/Mr_health/article/details/84970776

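For details on where some of the code and parameters in this article come from, see the blog post:

tensorflow+faster rcnn code analysis (2): anchor_target_layer, proposal_target_layer, proposal_layer

I recently studied Faster RCNN again and learned quite a lot, so I am writing it down again here.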

1. Why Smooth L1 Loss is used

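Bounding-box prediction is a regression problem. A common choice is the squared loss (L2 loss), f(x) = x^2, but it penalizes large errors very heavily.

A gentler alternative is the absolute loss (L1 loss), f(x) = |x|, which grows linearly with the error rather than quadratically. However, its derivative does not exist at 0, which can hurt convergence.

A common remedy is a piecewise function that uses the squared function near 0 to make the loss smoother. This is called the smooth L1 loss. A parameter σ controls the size of the smooth region: the quadratic branch applies when |x| < 1/σ^2. Typically σ = 1; in the Faster RCNN code, σ = 3 is used when training the RPN.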
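The piecewise definition can be sketched in a few lines of plain Python. This is a minimal scalar version, assuming the same branch condition and constants as the `_smooth_l1_loss` TensorFlow code shown later in this article; the function name `smooth_l1` is only illustrative.

```python
def smooth_l1(x, sigma=1.0):
    """Smooth L1 loss of a scalar error x; sigma sets the width of the quadratic zone."""
    sigma2 = sigma ** 2
    if abs(x) < 1.0 / sigma2:
        return 0.5 * sigma2 * x * x  # quadratic branch near zero
    return abs(x) - 0.5 / sigma2     # linear branch for larger errors


# Compare the two settings mentioned in the text on a few error values.
for sigma in (1.0, 3.0):
    print(sigma, [smooth_l1(x, sigma) for x in (0.05, 0.5, 2.0)])
```

Both branches give 0.5/σ^2 at |x| = 1/σ^2, so the function is continuous, and a larger σ shrinks the quadratic zone so the loss behaves like L1 almost everywhere.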

2. The Faster RCNN loss function

The Faster RCNN loss consists of two main parts, the RPN loss and the Fast RCNN loss. Each part includes a classification loss (cls loss) and a regression loss (bbox regression loss).
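For reference, the combined loss is usually written as below. This follows the notation of the Faster R-CNN paper, not anything specific to this article's code: p_i is the predicted probability that anchor i is foreground, p_i^* its label, t_i and t_i^* the predicted and target box offsets.

```latex
L(\{p_i\}, \{t_i\}) =
  \frac{1}{N_{cls}} \sum_i L_{cls}(p_i, p_i^*)
  + \lambda \, \frac{1}{N_{reg}} \sum_i p_i^* \, L_{reg}(t_i, t_i^*)
```

The first term is the classification loss and the second the box regression loss; the factor p_i^* switches the regression term off for background samples.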

The RPN and Fast RCNN parts of the loss are discussed in turn below.

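2.1 Classification loss

Formula: (1/N_cls) Σ_i L_cls(p_i, p_i*), where L_cls(p_i, p_i*) = -log[p_i* p_i + (1 - p_i*)(1 - p_i)]

(1) RPN classification loss:

The anchors produced by the RPN are divided only into foreground and background: foreground anchors are labeled 1 and background anchors 0. During RPN training, 256 anchors are selected; this 256 is the N_cls in the formula.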

As can be seen, this is a classic binary cross-entropy loss: the log loss is computed for each anchor, then summed and divided by the total number of anchors N_cls. The corresponding TensorFlow code is:

rpn_cls_score = tf.reshape(self._predictions['rpn_cls_score_reshape'], [-1, 2])  # rpn_cls_score = (17100, 2)
rpn_label = tf.reshape(self._anchor_targets['rpn_labels'], [-1])  # rpn_label = (17100,)
rpn_select = tf.where(tf.not_equal(rpn_label, -1))  # indices of labels != -1 (the positive and negative samples)
rpn_cls_score = tf.reshape(tf.gather(rpn_cls_score, rpn_select), [-1, 2])  # select the corresponding scores
rpn_label = tf.reshape(tf.gather(rpn_label, rpn_select), [-1])
rpn_cross_entropy = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(logits=rpn_cls_score, labels=rpn_label))

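Suppose the RPN feature map is 38×50. It then yields 38×50×9 = 17100 anchors. During RPN training, N_cls anchors are picked from these 17100 to train the RPN parameters; those picked as foreground are labeled 1 and those picked as background 0.

Line 1 reshapes the scores to (17100, 2): each row is an anchor, and the two columns hold its foreground and background scores.
Lines 2 and 3 reshape the RPN labels to (17100,) so that they line up with the anchors, then pick the entries that are not -1, i.e. the foreground and background anchors (N_cls of them), and return their indices as rpn_select.
Line 4 uses these indices to select the corresponding scores, and line 5 does the same for the labels.
The last statement computes the cross-entropy loss from rpn_label and rpn_cls_score; reduce_mean divides by the count (N_cls) to take the mean.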
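The same select-then-average logic can be reproduced without TensorFlow. The sketch below is a plain-Python stand-in, assuming list inputs and hypothetical helper names (`softmax_cross_entropy`, `rpn_cls_loss`); it only illustrates what the graph computes and is not the implementation used in the repository.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy of one sample: -log(softmax(logits)[label])."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def rpn_cls_loss(scores, labels):
    """Mean cross-entropy over the anchors whose label is not -1."""
    selected = [i for i, y in enumerate(labels) if y != -1]
    losses = [softmax_cross_entropy(scores[i], labels[i]) for i in selected]
    return sum(losses) / len(losses)


# Four toy anchors with two scores each; the second anchor is ignored (label -1).
scores = [[0.0, 0.0], [2.0, 0.0], [0.0, 3.0], [1.0, 1.0]]
labels = [1, -1, 1, 0]
loss = rpn_cls_loss(scores, labels)
print(loss)
```

Because `softmax_cross_entropy` accepts any number of logits, the same helper also covers the multi-class Fast RCNN case with labels in 0..num_classes-1.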

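(2) Fast RCNN classification loss:

The RPN classification loss is a binary cross-entropy loss, whereas the Fast RCNN one is a multi-class cross-entropy loss (when training with more than 2 classes; 5 classes are assumed here). During Fast RCNN training, 128 RoIs are selected, i.e. N_cls = 128, and the label values range from 0 to 4. The code is:

cross_entropy = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(
        logits=tf.reshape(cls_score, [-1, self._num_classes]), labels=label))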

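2.2 Regression loss

The regression loss is covered for the RPN and Fast RCNN together. Formula: λ (1/N_reg) Σ_i p_i* L_reg(t_i, t_i*), with L_reg(t_i, t_i*) = R(t_i - t_i*)

where:

t_i is a vector of the predicted offsets for an anchor (RPN training stage) or an RoI (Fast RCNN stage).
t_i* is a vector of the same dimension as t_i: the actual offsets of the anchor (RPN training stage) or RoI (Fast RCNN stage) relative to the ground truth (gt).

R is the smooth L1 function described above; here σ = 3 for RPN training and σ = 1 for Fast RCNN training.

For each anchor, after the L_reg(t_i, t_i*) term is computed it is multiplied by p_i*. As described earlier, p_i* is 1 when there is an object (positive) and 0 when there is none (negative), which means only foreground samples contribute to the loss and background samples do not. This is the role of inside_weights.

λ and N_reg are explained here for RPN training (RPN training is assumed because its batch size is 256; for Fast RCNN the batch size is 128). In the paper, N_cls = 256, N_reg ≈ 2400 and λ = 10, so λ/N_reg ≈ 1/N_cls.

So λ/N_reg is outside_weights: it is 0 for anchors that are neither foreground (fg) nor background (bg), and 1/(bg + fg) = 1/N_cls for the others.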

Code:

def _smooth_l1_loss(self, bbox_pred, bbox_targets, bbox_inside_weights, bbox_outside_weights, sigma=1.0, dim=[1]):
    sigma_2 = sigma ** 2
    box_diff = bbox_pred - bbox_targets  # t_i - t_i*
    in_box_diff = bbox_inside_weights * box_diff  # only foreground samples contribute to the loss
    abs_in_box_diff = tf.abs(in_box_diff)  # x = |t_i - t_i*|
    # 1 where x = |t_i - t_i*| is below 1 / sigma^2 (quadratic branch), 0 otherwise
    smoothL1_sign = tf.stop_gradient(tf.to_float(tf.less(abs_in_box_diff, 1. / sigma_2)))
    # compute the smooth L1 loss
    in_loss_box = tf.pow(in_box_diff, 2) * (sigma_2 / 2.) * smoothL1_sign + (abs_in_box_diff - (0.5 / sigma_2)) * (1. - smoothL1_sign)
    out_loss_box = bbox_outside_weights * in_loss_box
    loss_box = tf.reduce_mean(tf.reduce_sum(
        out_loss_box,
        axis=dim
    ))
    return loss_box
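To see how the inside and outside weights act, here is a small plain-Python sketch of the same computation on nested lists. It assumes the default `dim=[1]` behaviour (sum over the coordinates of each box, then average over boxes); the name `box_loss` and the toy numbers are illustrative only.

```python
def smooth_l1(x, sigma=1.0):
    """Scalar smooth L1 with the same branches as the TensorFlow code."""
    sigma2 = sigma ** 2
    if abs(x) < 1.0 / sigma2:
        return 0.5 * sigma2 * x * x
    return abs(x) - 0.5 / sigma2


def box_loss(pred, target, inside_w, outside_w, sigma=1.0):
    """Mean over boxes of the weighted smooth L1 loss summed over coordinates."""
    per_box = []
    for p_row, t_row, in_row, out_row in zip(pred, target, inside_w, outside_w):
        total = 0.0
        for p, t, w_in, w_out in zip(p_row, t_row, in_row, out_row):
            total += w_out * smooth_l1(w_in * (p - t), sigma)
        per_box.append(total)
    return sum(per_box) / len(per_box)


# One foreground box (weights 1) and one background box (weights 0).
pred = [[0.5, 0.0, 0.0, 2.0], [1.0, 1.0, 1.0, 1.0]]
target = [[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
inside_w = [[1.0] * 4, [0.0] * 4]
outside_w = [[1.0] * 4, [0.0] * 4]
loss = box_loss(pred, target, inside_w, outside_w)
print(loss)
```

The background box has a large error but contributes nothing, because its inside weights zero the difference before the loss is evaluated.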

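Some reflections

The paper treats N_cls, N_reg and λ as normalization weights that balance the classification and regression losses. But when reading the TensorFlow implementation of the Faster RCNN loss (taking the Fast RCNN classification and box regression losses as the example, shown below), I noticed that N_cls is never passed in when the classification loss is computed; only outside_weights is passed in when the box regression loss is computed. That is when I realized that the classification loss is a cross-entropy that is averaged after summation, so the division by N_cls is already built into it.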

To balance the weights of the two losses, the value of outside_weights depends on N_cls, and N_cls in turn depends on the batch size. Hence the following:

# RCNN, class loss
cls_score = self._predictions["cls_score"]
label = tf.reshape(self._proposal_targets["labels"], [-1])

cross_entropy = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(
        logits=tf.reshape(cls_score, [-1, self._num_classes]), labels=label))

# RCNN, bbox loss
bbox_pred = self._predictions['bbox_pred']  # (128, 12)
bbox_targets = self._proposal_targets['bbox_targets']  # (128, 12)
bbox_inside_weights = self._proposal_targets['bbox_inside_weights']  # (128, 12)
bbox_outside_weights = self._proposal_targets['bbox_outside_weights']  # (128, 12)

loss_box = self._smooth_l1_loss(bbox_pred, bbox_targets, bbox_inside_weights, bbox_outside_weights)
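The normalization argument can be checked with simple arithmetic. The sketch below assumes hypothetical per-sample losses and uses the values quoted in the Faster R-CNN paper (λ = 10, N_reg ≈ 2400, N_cls = 256); it is an illustration, not part of the repository code.

```python
# Averaging is the same as summing with a weight of 1/N on every sample.
losses = [0.2, 0.7, 1.1, 0.4]  # hypothetical per-sample losses
n = len(losses)
mean_loss = sum(losses) / n                         # what reduce_mean does
weighted_sum = sum(l * (1.0 / n) for l in losses)   # sum with outside weight 1/N

# With the paper's values, the regression weight is close to 1/N_cls.
lam, n_reg, n_cls = 10.0, 2400.0, 256.0
ratio = (lam / n_reg) / (1.0 / n_cls)

print(mean_loss, weighted_sum, ratio)
```

A ratio close to 1 means the two terms are weighted roughly equally, which is why an outside weight of 1/N_cls can stand in for λ/N_reg.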

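Alexnet network architecture explained: https://blog.csdn.net/weixin_39873397/article/details/8456872

First convolutional layer

The input image size is 224*224*3 (or 227*227*3).

The first convolutional layer is 11*11*96, i.e. 96 kernels of size 11*11 with stride 4, followed by ReLU. With a 227*227 input and no padding, the output size is (227-11)/4+1 = 55, so the output feature maps are 55*55*96. An LRN layer follows, which leaves the size unchanged.

Max pooling layer with kernel size 3*3 and stride 2, so the feature map becomes 27*27*96.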

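Second convolutional layer

The input tensor is 27*27*96.

The kernels are 5*5*256 with stride 1, so the spatial size is unchanged; ReLU and an LRN layer follow as before.

Max pooling layer with kernel size 3*3 and stride 2, so the feature map becomes 13*13*256.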

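Third to fifth convolutional layers

The input tensor is 13*13*256.

The third convolutional layer is 3*3*384 with stride 1, followed by ReLU.

The fourth convolutional layer is 3*3*384 with stride 1, followed by ReLU.

The fifth convolutional layer is 3*3*256 with stride 1, followed by ReLU.

The fifth layer is followed by a max pooling layer with kernel size 3*3 and stride 2, so the feature map becomes 6*6*256.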

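Sixth to eighth layers: fully connected

The next three layers are fully connected:
1. FC: 4096 + ReLU
2. FC: 4096 + ReLU
3. FC: 1000; the last layer is a softmax giving the probabilities of the 1000 classes.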
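The feature-map sizes listed above can be checked with the usual output-size formula, floor((size + 2*pad - kernel) / stride) + 1. The sketch below is a plain-Python check; the padding values (2 for the 5x5 layer, 1 for the 3x3 layers) are assumptions chosen so that stride-1 convolutions preserve the size, and the helper names are illustrative.

```python
def out_size(size, kernel, stride, pad=0):
    """Spatial output size of a conv or pooling layer with symmetric padding."""
    return (size + 2 * pad - kernel) // stride + 1


def alexnet_sizes(input_size):
    """Feature-map side length after each layer described in the text."""
    sizes = {}
    s = out_size(input_size, 11, 4)   # conv1: 11x11, stride 4, no padding
    sizes["conv1"] = s
    s = out_size(s, 3, 2)             # pool1: 3x3, stride 2
    sizes["pool1"] = s
    s = out_size(s, 5, 1, pad=2)      # conv2: 5x5, stride 1, size preserved
    sizes["conv2"] = s
    s = out_size(s, 3, 2)             # pool2: 3x3, stride 2
    sizes["pool2"] = s
    s = out_size(s, 3, 1, pad=1)      # conv3 to conv5: 3x3, stride 1, size preserved
    sizes["conv5"] = s
    s = out_size(s, 3, 2)             # pool5: 3x3, stride 2
    sizes["pool5"] = s
    return sizes


print(alexnet_sizes(227))
print(alexnet_sizes(224))
```

With a 227 input the sizes come out as 55, 27, 27, 13, 13, 6, matching the text. With a 224 input and no padding, conv1 is 54 rather than 55, which is why 227 is the more convenient input size for this layout.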

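2. Tricks in AlexNet

AlexNet applied CNNs to a deeper and wider network and achieved higher classification accuracy than the earlier LeNet. Several of its tricks are essential to know.

Use of ReLU

AlexNet uses ReLU instead of sigmoid. It trains faster and also alleviates the vanishing gradient problem that sigmoid causes when training deeper networks.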
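A rough way to see the vanishing gradient effect is to multiply the activation derivatives along a chain of layers. The sketch below is a simplified illustration that ignores the weights entirely and assumes a depth of 10; the function names are illustrative.

```python
import math


def sigmoid_grad(x):
    """Derivative of the sigmoid; its maximum is 0.25 at x = 0."""
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)


def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


depth = 10
sigmoid_chain = sigmoid_grad(0.0) ** depth  # best case for sigmoid: 0.25 per layer
relu_chain = relu_grad(1.0) ** depth        # active ReLU units pass the gradient unchanged

print(sigmoid_chain, relu_chain)
```

Even in the sigmoid's best case the product shrinks by a factor of four per layer, while active ReLU units leave it at 1.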

Dropout

Some neurons are randomly ignored during training to avoid overfitting.
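A minimal sketch of the idea in plain Python follows. It uses the "inverted dropout" variant, which rescales the surviving activations at training time; that choice, the keep probability of 0.5 and the fixed seed are assumptions made for the illustration, since the article does not specify them.

```python
import random


def dropout(values, keep_prob, rng):
    """Inverted dropout: zero each value with probability 1 - keep_prob,
    and scale the survivors by 1 / keep_prob so the expected value is unchanged."""
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


rng = random.Random(0)          # fixed seed so the run is repeatable
activations = [1.0] * 10000
dropped = dropout(activations, 0.5, rng)
print(sum(dropped) / len(dropped))
```

Roughly half of the activations are zeroed on each call, so a neuron cannot rely on any particular neighbour being present.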

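Overlapping max pooling

Earlier CNNs commonly used average pooling. AlexNet uses max pooling throughout, which avoids the blurring effect of average pooling. In addition, the stride is smaller than the pooling kernel size, so the outputs of the pooling layer overlap, which enriches the features.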
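The difference between overlapping and non-overlapping windows is easy to see in one dimension. The sketch below is a toy 1-D max pool in plain Python; the helper name and the input values are made up for the illustration.

```python
def max_pool_1d(xs, kernel, stride):
    """1-D max pooling without padding."""
    return [max(xs[i:i + kernel]) for i in range(0, len(xs) - kernel + 1, stride)]


xs = [9, 1, 2, 1, 3, 1, 8]
overlapping = max_pool_1d(xs, kernel=3, stride=2)      # windows share one element
non_overlapping = max_pool_1d(xs, kernel=2, stride=2)  # windows are disjoint

print(overlapping, non_overlapping)
```

With kernel 3 and stride 2, each element at a window boundary is seen by two adjacent windows, which is the overlap described in the text.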

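LRN layer

Local response normalization creates a competition mechanism among neighboring neurons: relatively large responses become relatively larger, while neurons with smaller responses are suppressed.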
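The normalization can be sketched for a single spatial position as follows. Each channel is divided by (bias + alpha * sum of squares over neighbouring channels) ** beta; the default constants below are the ones reported for AlexNet (k = 2, n = 5, alpha = 1e-4, beta = 0.75), and the function name `lrn` and its parameter names are assumptions of this sketch.

```python
def lrn(channels, depth_radius=2, bias=2.0, alpha=1e-4, beta=0.75):
    """Local response normalization across channels at one spatial position."""
    out = []
    for i, a in enumerate(channels):
        lo = max(0, i - depth_radius)
        hi = min(len(channels), i + depth_radius + 1)
        sq_sum = sum(c * c for c in channels[lo:hi])  # squared activity of the neighbours
        out.append(a / (bias + alpha * sq_sum) ** beta)
    return out


print(lrn([1.0, 0.0]))
print(lrn([1.0, 100.0]))
```

The same activation of 1.0 comes out smaller when its neighbouring channel is strongly active, which is the competition effect the text describes.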

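GPU-accelerated computation

GPUs are used to accelerate the training of the network.

Data augmentation

Data augmentation is used to mitigate overfitting.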

3. AlexNet implementation in TensorFlow

def print_activations(t):
  print(t.op.name, ' ', t.get_shape().as_list())

The function above prints the name and output shape of the current layer. Below is the open-source implementation with some parameters changed by me:

import argparse
import math
import sys
import time
from datetime import datetime

import tensorflow as tf  # TensorFlow 1.x API

FLAGS = None  # filled in by argparse in the __main__ block


def inference(images):
  """Build the AlexNet model.

  Args:
    images: Images Tensor

  Returns:
    pool5: the last Tensor in the convolutional component of AlexNet.
    parameters: a list of Tensors corresponding to the weights and biases of the
        AlexNet model.
  """
  parameters = []
  # conv1
  with tf.name_scope('conv1') as scope:
    kernel = tf.Variable(tf.truncated_normal([11, 11, 3, 96], dtype=tf.float32,
                                             stddev=1e-1), name='weights')
    conv = tf.nn.conv2d(images, kernel, [1, 4, 4, 1], padding='VALID')
    biases = tf.Variable(tf.constant(0.0, shape=[96], dtype=tf.float32),
                         trainable=True, name='biases')
    bias = tf.nn.bias_add(conv, biases)
    conv1 = tf.nn.relu(bias, name=scope)
    print_activations(conv1)
    parameters += [kernel, biases]

  # lrn1
  # TODO(shlens, jiayq): Add a GPU version of local response normalization.

  # pool1
  pool1 = tf.nn.max_pool(conv1,
                         ksize=[1, 3, 3, 1],
                         strides=[1, 2, 2, 1],
                         padding='VALID',
                         name='pool1')
  print_activations(pool1)

  # conv2
  with tf.name_scope('conv2') as scope:
    kernel = tf.Variable(tf.truncated_normal([5, 5, 96, 256], dtype=tf.float32,
                                             stddev=1e-1), name='weights')
    conv = tf.nn.conv2d(pool1, kernel, [1, 1, 1, 1], padding='SAME')
    biases = tf.Variable(tf.constant(0.0, shape=[256], dtype=tf.float32),
                         trainable=True, name='biases')
    bias = tf.nn.bias_add(conv, biases)
    conv2 = tf.nn.relu(bias, name=scope)
    parameters += [kernel, biases]
  print_activations(conv2)

  # pool2
  pool2 = tf.nn.max_pool(conv2,
                         ksize=[1, 3, 3, 1],
                         strides=[1, 2, 2, 1],
                         padding='VALID',
                         name='pool2')
  print_activations(pool2)

  # conv3
  with tf.name_scope('conv3') as scope:
    kernel = tf.Variable(tf.truncated_normal([3, 3, 256, 384],
                                             dtype=tf.float32,
                                             stddev=1e-1), name='weights')
    conv = tf.nn.conv2d(pool2, kernel, [1, 1, 1, 1], padding='SAME')
    biases = tf.Variable(tf.constant(0.0, shape=[384], dtype=tf.float32),
                         trainable=True, name='biases')
    bias = tf.nn.bias_add(conv, biases)
    conv3 = tf.nn.relu(bias, name=scope)
    parameters += [kernel, biases]
    print_activations(conv3)

  # conv4
  with tf.name_scope('conv4') as scope:
    kernel = tf.Variable(tf.truncated_normal([3, 3, 384, 384],
                                             dtype=tf.float32,
                                             stddev=1e-1), name='weights')
    conv = tf.nn.conv2d(conv3, kernel, [1, 1, 1, 1], padding='SAME')
    biases = tf.Variable(tf.constant(0.0, shape=[384], dtype=tf.float32),
                         trainable=True, name='biases')
    bias = tf.nn.bias_add(conv, biases)
    conv4 = tf.nn.relu(bias, name=scope)
    parameters += [kernel, biases]
    print_activations(conv4)

  # conv5
  with tf.name_scope('conv5') as scope:
    kernel = tf.Variable(tf.truncated_normal([3, 3, 384, 256],
                                             dtype=tf.float32,
                                             stddev=1e-1), name='weights')
    conv = tf.nn.conv2d(conv4, kernel, [1, 1, 1, 1], padding='SAME')
    biases = tf.Variable(tf.constant(0.0, shape=[256], dtype=tf.float32),
                         trainable=True, name='biases')
    bias = tf.nn.bias_add(conv, biases)
    conv5 = tf.nn.relu(bias, name=scope)
    parameters += [kernel, biases]
    print_activations(conv5)

  # pool5
  pool5 = tf.nn.max_pool(conv5,
                         ksize=[1, 3, 3, 1],
                         strides=[1, 2, 2, 1],
                         padding='VALID',
                         name='pool5')
  print_activations(pool5)

  return pool5, parameters


def time_tensorflow_run(session, target, info_string):
  """Run the computation to obtain the target tensor and print timing stats.

  Args:
    session: the TensorFlow session to run the computation under.
    target: the target Tensor that is passed to the session's run() function.
    info_string: a string summarizing this run, to be printed with the stats.

  Returns:
    None
  """
  num_steps_burn_in = 10
  total_duration = 0.0
  total_duration_squared = 0.0
  # range replaces the Python 2 only xrange
  for i in range(FLAGS.num_batches + num_steps_burn_in):
    start_time = time.time()
    _ = session.run(target)
    duration = time.time() - start_time
    if i >= num_steps_burn_in:
      if not i % 10:
        print ('%s: step %d, duration = %.3f' %
               (datetime.now(), i - num_steps_burn_in, duration))
      total_duration += duration
      total_duration_squared += duration * duration
  mn = total_duration / FLAGS.num_batches
  vr = total_duration_squared / FLAGS.num_batches - mn * mn
  sd = math.sqrt(vr)
  print ('%s: %s across %d steps, %.3f +/- %.3f sec / batch' %
         (datetime.now(), info_string, FLAGS.num_batches, mn, sd))
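The statistics printed by time_tensorflow_run come from a running sum and a running sum of squares, using var = E[d^2] - (E[d])^2. The sketch below isolates that calculation in plain Python; the helper name `mean_and_std` is illustrative, and the clamp to zero is an added safeguard (an assumption of this sketch) against a tiny negative variance caused by floating-point rounding.

```python
import math


def mean_and_std(durations):
    """Mean and standard deviation from the sum and the sum of squares."""
    n = len(durations)
    total = sum(durations)
    total_sq = sum(d * d for d in durations)
    mean = total / n
    var = total_sq / n - mean * mean
    return mean, math.sqrt(max(var, 0.0))  # clamp guards against rounding below zero


mean, std = mean_and_std([1.0, 3.0])
print('%.3f +/- %.3f sec / batch' % (mean, std))
```

Keeping only two accumulators means the per-step durations never need to be stored, which is why the benchmark loop tracks total_duration and total_duration_squared.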

测试的函数:
image是随机生成的数据,不是真实的数据


 
 
   
   
    
    
   
   
   
   
    
    
     
     def run_benchmark():
    
    
   
   

   
   
    
    
   
   
   
   
    
    
     
       """Run the benchmark on AlexNet."""
    
    
   
   

   
   
    
    
   
   
   
   
    
      
     
     with tf.Graph().as_default():
    
    
   
   

   
   
    
    
   
   
   
   
    
        
     
     # Generate some dummy images.
    
    
   
   

   
   
    
    
   
   
   
   
    
    
     
         image_size = 
     
     224
    
    
   
   

   
   
    
    
   
   
   
   
    
        
     
     # Note that our padding definition is slightly different the cuda-convnet.
    
    
   
   

   
   
    
    
   
   
   
   
    
        
     
     # In order to force the model to start with the same activations sizes,
    
    
   
   

   
   
    
    
   
   
   
   
    
        
     
     # we add 3 to the image_size and employ VALID padding above.
    
    
   
   

   
   
    
    
   
   
   
   
    
    
     
         images = tf.Variable(tf.random_normal([FLAGS.batch_size,
    
    
   
   

   
   
    
    
   
   
   
   
    
    
     
                                                image_size,
    
    
   
   

   
   
    
    
   
   
   
   
    
    
     
                                                image_size, 
     
     3],
    
    
   
   

   
   
    
    
   
   
   
   
    
    
     
                                               dtype=tf.float32,
    
    
   
   

   
   
    
    
   
   
   
   
    
                                              
     
     stddev=
     
     1e-1))
    
    
   
   

   
   
    
    
   
   
   
   
    
     
    
    
   
   

   
   
    
    
   
   
   
   
    
        
     
     # Build a Graph that computes the logits predictions from the
    
    
   
   

   
   
    
    
   
   
   
   
    
        
     
     # inference model.
    
    
   
   

   
   
    
    
   
   
   
   
    
    
     
         pool5, 
     
     parameters = inference(images)
    
    
   
   

   
   
    
    
   
   
   
   
    
     
    
    
   
   

   
   
    
    
   
   
   
   
    
        
     
     # Build an initialization operation.
    
    
   
   

   
   
    
    
   
   
   
   
    
    
     
         init = tf.global_variables_initializer()
    
    
   
   

   
   
    
    
   
   
   
   
    
     
    
    
   
   

   
   
    
    
   
   
   
   
    
        
     
     # Start running operations on the Graph.
    
    
   
   

   
   
    
    
   
   
   
   
    
    
     
         config = tf.ConfigProto()
    
    
   
   

   
   
    
    
   
   
   
   
    
    
     
         config.gpu_options.allocator_type = 
     
     'BFC'
    
    
   
   

   
   
    
    
   
   
   
   
    
    
     
         sess = tf.Session(config=config)
    
    
   
   

   
   
    
    
   
   
   
   
    
    
     
         sess.run(init)
    
    
   
   

   
   
    
    
   
   
   
   
    
     
    
    
   
   

   
   
    
    
   
   
   
   
    
        
     
     # Run the forward benchmark.
    
    
   
   

   
   
    
    
   
   
   
   
    
    
     
         time_tensorflow_run(sess, pool5, 
     
     "Forward")
    
    
   
   

   
   
    
    
   
   
   
   
    
     
    
    
   
   

   
   
    
    
   
   
   
   
    
        
     
     # Add a simple objective so we can calculate the backward pass.
    
    
   
   

   
   
    
    
   
   
   
   
    
    
     
         objective = tf.nn.l2_loss(pool5)
    
    
   
   

   
   
    
    
   
   
   
   
    
        
     
     # Compute the gradient with respect to all the parameters.
    
    
   
   

   
   
    
    
   
   
   
   
    
    
     
         grad = tf.gradients(objective, 
     
     parameters)
    
    
   
   

   
   
    
    
   
   
   
   
    
        
     
     # Run the backward benchmark.
    
    
   
   

   
   
    
    
   
   
   
   
    
    
     
    time_tensorflow_run(sess, grad, "Forward-backward")


def main(_):
  run_benchmark()


if __name__ == '__main__':
  parser = argparse.ArgumentParser()
  parser.add_argument(
      '--batch_size',
      type=int,
      default=128,
      help='Batch size.'
  )
  parser.add_argument(
      '--num_batches',
      type=int,
      default=100,
      help='Number of batches to run.'
  )
  FLAGS, unparsed = parser.parse_known_args()
  tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
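
The flag handling at the end of the script can be tried on its own, without TensorFlow. The following is a minimal sketch using only the standard library: `build_parser` is a hypothetical helper name, and `--logdir` is just an illustrative flag that the parser does not know about. It shows why the script uses `parse_known_args`: unrecognized arguments are returned in a separate list (which the script then forwards through `argv`) instead of causing an error.

```python
import argparse


def build_parser():
    """Recreate the two benchmark flags defined in the script (hypothetical helper)."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--batch_size', type=int, default=128,
                        help='Batch size.')
    parser.add_argument('--num_batches', type=int, default=100,
                        help='Number of batches to run.')
    return parser


# parse_known_args returns (namespace, leftover_args) rather than
# exiting when it meets a flag it does not recognize.
flags, unparsed = build_parser().parse_known_args(
    ['--batch_size', '32', '--logdir', '/tmp/x'])

print(flags.batch_size)   # value given on the command line
print(flags.num_batches)  # falls back to the default
print(unparsed)           # flags the parser did not recognize
```

With plain `parse_args`, the unknown `--logdir` flag would instead make the parser print a usage error and exit.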

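The output is as follows.
The output sizes are listed below; the analysis above already covers them in detail.

Below are the forward and forward-backward timings from the benchmark. The combined forward-backward pass takes about 3 times as long as the forward pass alone (0.271 vs. 0.092 sec / batch), so the backward computation by itself costs roughly twice as much as the forward pass.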


 
 
   
   
    
    
   
   
   
   
    
    
     
2018-11-27 17:49:36.936271: step 0, duration = 0.085
2018-11-27 17:49:37.860652: step 10, duration = 0.085
2018-11-27 17:49:38.794103: step 20, duration = 0.100
2018-11-27 17:49:39.726452: step 30, duration = 0.099
2018-11-27 17:49:40.637597: step 40, duration = 0.088
2018-11-27 17:49:41.546659: step 50, duration = 0.078
2018-11-27 17:49:42.471295: step 60, duration = 0.085
2018-11-27 17:49:43.389295: step 70, duration = 0.095
2018-11-27 17:49:44.306961: step 80, duration = 0.085
2018-11-27 17:49:45.225164: step 90, duration = 0.085
2018-11-27 17:49:46.058470: Forward across 100 steps, 0.092 +/- 0.008 sec / batch
2018-11-27 17:49:50.335397: step 0, duration = 0.281
2018-11-27 17:49:53.041129: step 10, duration = 0.279
2018-11-27 17:49:55.747921: step 20, duration = 0.269
2018-11-27 17:49:58.454006: step 30, duration = 0.269
2018-11-27 17:50:01.176237: step 40, duration = 0.285
2018-11-27 17:50:03.882712: step 50, duration = 0.269
2018-11-27 17:50:06.573259: step 60, duration = 0.269
2018-11-27 17:50:09.286011: step 70, duration = 0.270
2018-11-27 17:50:12.007992: step 80, duration = 0.275
2018-11-27 17:50:14.706777: step 90, duration = 0.262
2018-11-27 17:50:17.138761: Forward-backward across 100 steps, 0.271 +/- 0.006 sec / batch
An exception has occurred, use %tb to see the full traceback.
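
The two summary lines in the log are enough to check how the passes compare. The following is a small sketch of that arithmetic; the variable names are illustrative, and it assumes that the "Forward-backward" measurement includes the forward pass, since evaluating the gradients requires computing the forward activations first.

```python
# Mean seconds per batch, copied from the two summary lines in the log.
forward = 0.092
forward_backward = 0.271

# Assumption: the forward-backward run contains one forward pass,
# so subtracting it gives an estimate of the backward pass alone.
backward_only = forward_backward - forward

total_ratio = forward_backward / forward    # whole training step vs. forward
backward_ratio = backward_only / forward    # backward alone vs. forward

print(round(total_ratio, 2))
print(round(backward_ratio, 2))
```

Under that assumption, a full forward-backward step costs close to 3 times a forward pass, while the backward computation alone costs close to 2 times.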
