Installing, Configuring, and Testing TensorFlow Faster R-CNN on Windows 10

ruolyn

Original blog post: https://blog.csdn.net/kebi199312/article/details/88368904

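I have recently been studying object detection in deep learning. After comparing YOLO v3, Faster R-CNN, SSD, and other detection algorithms, I found that Faster R-CNN performs well and is fairly easy to install, so I set it up myself on Windows 10.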

This article has two parts:

Configuring the TensorFlow version of Faster R-CNN on Windows 10
Running Faster R-CNN and testing it on images and video

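I. Environment Setup:

1. Environment

Windows 10, GeForce GTX 960M GPU;
TensorFlow-gpu 1.13.0-rc2, CUDA 10.0, cuDNN 7.4.2;
Python 3.5.2

For installing the GPU build of TensorFlow, see my other post (https://blog.csdn.net/kebi199312/article/details/86549637). The CUDA version there is different, but the installation steps are much the same.

I installed TensorFlow-gpu with pip in Windows PowerShell, along with a few required libraries: cython, easydict, matplotlib, opencv-python, and so on. These can be installed directly with pip or from downloaded .whl files for offline installation.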
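Before going further, it can help to confirm that the libraries listed above are importable. The following is a minimal sketch rather than part of the original setup: the helper name missing_modules is hypothetical, and the import names (Cython for cython, cv2 for the OpenCV package) are the usual ones but worth checking in your own environment.

```python
import importlib.util

# Import names for the libraries mentioned above (assumed mapping:
# the cython package imports as "Cython", OpenCV imports as "cv2").
REQUIRED_MODULES = ["tensorflow", "Cython", "easydict", "matplotlib", "cv2"]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are importable.")
```

Any name reported as missing can then be installed with pip or from a .whl file as described above.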

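2. Downloading the source code

The Faster R-CNN code is available at:

https://github.com/dBeker/Faster-RCNN-TensorFlow-Python3.5

You can also fetch it with git: open cmd, cd to the directory of your choice, and run:

git clone https://github.com/dBeker/Faster-RCNN-TensorFlow-Python3.5.git

After the download, the project root directory is Faster-RCNN-TensorFlow-Python3.5-master (this is the folder name from the ZIP download; a git clone creates the folder without the -master suffix).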

cd to Faster-RCNN-TensorFlow-Python3.5-master\data\coco\PythonAPI, open cmd there, and run the following commands to compile the bundled code:

python setup.py build_ext --inplace
python setup.py build_ext install
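If you prefer to script the two build steps, the sketch below runs them from Python. It is only an illustration: build_coco_api is a hypothetical helper, and the run parameter exists so the function can be exercised without actually compiling anything.

```python
import subprocess
import sys


def build_coco_api(api_dir, python_exe=sys.executable, run=subprocess.check_call):
    """Run both build steps inside api_dir and return the commands that were run.

    run defaults to subprocess.check_call, which raises CalledProcessError
    if a command exits with a non-zero status.
    """
    commands = [
        [python_exe, "setup.py", "build_ext", "--inplace"],
        [python_exe, "setup.py", "build_ext", "install"],
    ]
    for command in commands:
        run(command, cwd=api_dir)
    return commands
```

Calling build_coco_api with the PythonAPI directory named above performs the same two commands as typing them in cmd.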

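II. Downloading the VOC2007 Dataset:

The dataset used is VOC2007. Download links:

http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCdevkit_08-Jun-2007.tar

If these links are blocked on your network, the dataset is also available on Baidu Cloud: https://pan.baidu.com/s/1Y_RzqLvW4CAzTEq4ICFVUA , extraction code: m9dl

Put the three downloaded archives in the same folder, select all three, and choose "Extract here". This produces a VOCDevkit folder, as shown in Figure 1. Rename VOCDevkit to VOCDevkit2007 and copy it into the data directory. The resulting path is:

Faster-RCNN-TensorFlow-Python3.5-master\data\VOCDevkit2007

Figure 1: The VOC2007 dataset folder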
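A quick way to confirm the dataset landed in the right place is to check for the sub-folders of the standard VOC2007 release. This is a small sketch under that assumption; missing_voc_dirs is a hypothetical helper and is not part of the repository.

```python
import os

# Sub-folders found in the standard VOC2007 release (under VOC2007/).
EXPECTED_SUBDIRS = ["Annotations", "ImageSets", "JPEGImages"]


def missing_voc_dirs(devkit_root):
    """Return the expected VOC2007 sub-folders that are absent under devkit_root."""
    voc_root = os.path.join(devkit_root, "VOC2007")
    return [
        name for name in EXPECTED_SUBDIRS
        if not os.path.isdir(os.path.join(voc_root, name))
    ]
```

Called with the data\VOCDevkit2007 path, an empty list means the three folders are present.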

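III. Downloading the VGG16 Model:

The VGG16 model can be downloaded from http://download.tensorflow.org/models/vgg_16_2016_08_28.tar.gz or from Baidu Cloud:

Link: https://pan.baidu.com/s/11Ty10NJ-rgXkkvM92SVVKw , extraction code: d2jz

After downloading, extract the archive and rename the file to vgg16.ckpt, as shown in Figure 2. Create a new folder named imagenet_weights, put vgg16.ckpt in it, and copy the imagenet_weights folder into the data folder. The resulting path is:

Faster-RCNN-TensorFlow-Python3.5-master\data\imagenet_weights\vgg16.ckpt

Figure 2: vgg16.ckpt after renaming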
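The extract-and-rename step can also be done in a few lines of Python. The sketch below assumes the archive contains a single checkpoint file named vgg_16.ckpt; that member name is an assumption, so check the archive contents if extraction fails. install_vgg16 is a hypothetical helper.

```python
import os
import shutil
import tarfile


def install_vgg16(archive_path, data_dir, member_name="vgg_16.ckpt"):
    """Extract the checkpoint from the archive and store it as vgg16.ckpt.

    member_name is an assumption about the file name inside the archive.
    Returns the path of the installed checkpoint.
    """
    weights_dir = os.path.join(data_dir, "imagenet_weights")
    os.makedirs(weights_dir, exist_ok=True)
    target = os.path.join(weights_dir, "vgg16.ckpt")
    with tarfile.open(archive_path, "r:gz") as archive:
        source = archive.extractfile(member_name)
        if source is None:
            raise ValueError("{} is not a regular file in the archive".format(member_name))
        with source, open(target, "wb") as out:
            shutil.copyfileobj(source, out)
    return target
```

Passing the downloaded .tar.gz and the project's data folder should produce data\imagenet_weights\vgg16.ckpt, the path shown above.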

IV. Training the Model

The training parameters can be changed in config.py in the Faster-RCNN-TensorFlow-Python3.5-master\lib\config folder, including the total number of training steps, weight decay, learning rate, batch_size, and others.

tf.app.flags.DEFINE_float('weight_decay', 0.0005, "Weight decay, for regularization")
tf.app.flags.DEFINE_float('learning_rate', 0.001, "Learning rate")
tf.app.flags.DEFINE_float('momentum', 0.9, "Momentum")
tf.app.flags.DEFINE_float('gamma', 0.1, "Factor for reducing the learning rate")

tf.app.flags.DEFINE_integer('batch_size', 128, "Network batch size during training")
tf.app.flags.DEFINE_integer('max_iters', 40000, "Max iteration")
tf.app.flags.DEFINE_integer('step_size', 30000, "Step size for reducing the learning rate, currently only support one step")
tf.app.flags.DEFINE_integer('display', 20, "Iteration intervals for showing the loss during training, on command line interface")

tf.app.flags.DEFINE_string('initializer', "truncated", "Network initialization parameters")
tf.app.flags.DEFINE_string('pretrained_model', "./data/imagenet_weights/vgg16.ckpt", "Pretrained network weights")

tf.app.flags.DEFINE_boolean('bias_decay', False, "Whether to have weight decay on bias as well")
tf.app.flags.DEFINE_boolean('double_bias', True, "Whether to double the learning rate for bias")
tf.app.flags.DEFINE_boolean('use_all_gt', True, "Whether to use all ground truth bounding boxes for training, "
                                                "For COCO, setting USE_ALL_GT to False will exclude boxes that are flagged as ''iscrowd''")
tf.app.flags.DEFINE_integer('max_size', 1000, "Max pixel size of the longest side of a scaled input image")
tf.app.flags.DEFINE_integer('test_max_size', 1000, "Max pixel size of the longest side of a scaled input image")
tf.app.flags.DEFINE_integer('ims_per_batch', 1, "Images to use per minibatch")
tf.app.flags.DEFINE_integer('snapshot_iterations', 5000, "Iteration to take snapshot")

Once the parameters are set, run python train.py from the Faster-RCNN-TensorFlow-Python3.5-master directory to train the model.
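To make the interaction of learning_rate, gamma, and step_size concrete, here is a minimal sketch of a single-step schedule using the default values from config.py. It illustrates the idea described by the flag help texts; the exact iteration at which the repository switches rates may differ by one, and learning_rate_at is a hypothetical name.

```python
def learning_rate_at(iteration, base_lr=0.001, gamma=0.1, step_size=30000):
    """Single-step schedule: base_lr before step_size, base_lr * gamma afterwards."""
    if iteration < step_size:
        return base_lr
    return base_lr * gamma


for it in (0, 29999, 30000, 40000):
    print(it, learning_rate_at(it))
```

With the defaults, the rate stays at 0.001 for the first 30000 iterations and is multiplied by 0.1 for the remaining 10000.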

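When training finishes, the trained model is in Faster-RCNN-TensorFlow-Python3.5-master\default\voc_2007_trainval\default. It was trained for 40000 iterations; the iteration count can be changed in config.py in the Faster-RCNN-TensorFlow-Python3.5-master\lib\config folder.

In the project directory, create the folder output\vgg16\voc_2007_trainval\default, copy the files generated by training into it, and rename them following the pattern "vgg16.ckpt.meta", as shown in Figure 4:

Figure 4: The renamed vgg16.ckpt.meta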
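The copy-and-rename step can be sketched as follows. The helper copy_checkpoint is hypothetical, and the source prefix you pass in (for example vgg16_faster_rcnn_iter_40000.ckpt) is an assumption about how the training snapshots are named; use whatever prefix your training run actually produced.

```python
import os
import shutil


def copy_checkpoint(src_dir, dst_dir, old_prefix, new_prefix="vgg16.ckpt"):
    """Copy every file whose name starts with old_prefix into dst_dir,
    replacing that prefix with new_prefix. Returns the new names, sorted."""
    os.makedirs(dst_dir, exist_ok=True)
    copied = []
    for name in os.listdir(src_dir):
        if name.startswith(old_prefix):
            new_name = new_prefix + name[len(old_prefix):]
            shutil.copyfile(os.path.join(src_dir, name),
                            os.path.join(dst_dir, new_name))
            copied.append(new_name)
    return sorted(copied)
```

Copying by prefix keeps the files that belong to one snapshot together, so the .meta file and its companions all end up with the same vgg16.ckpt stem.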

V. Testing the Model

Modify demo.py as follows:

1. In NETS, change "vgg16_faster_rcnn_iter_70000.ckpt" to "vgg16.ckpt", as shown below;

NETS = {'vgg16': ('vgg16.ckpt',), 'res101': ('res101_faster_rcnn_iter_110000.ckpt',)}

2. In DATASETS, change "voc_2007_trainval+voc_2012_trainval" to "voc_2007_trainval", as shown below;

DATASETS = {'pascal_voc': ('voc_2007_trainval',), 'pascal_voc_0712': ('voc_2007_trainval',)}

3. In parse_args(), change the two default values to vgg16 and pascal_voc respectively, as shown below;

def parse_args():
    """Parse input arguments."""
    parser = argparse.ArgumentParser(description='Tensorflow Faster R-CNN demo')
    parser.add_argument('--net', dest='demo_net', help='Network to use [vgg16 res101]',
                        choices=NETS.keys(), default='vgg16')
    parser.add_argument('--dataset', dest='dataset', help='Trained dataset [pascal_voc pascal_voc_0712]',
                        choices=DATASETS.keys(), default='pascal_voc')
    args = parser.parse_args()

    return args

4. After making these changes, running demo.py fails with the following error:

E:\Software\Python\python.exe E:/liukang/Faster-RCNN-TensorFlow-Python3.5-master/demo.py --net vgg16
Traceback (most recent call last):
  File "E:/liukang/Faster-RCNN-TensorFlow-Python3.5-master/demo.py", line 142, in <module>
    tag='default', anchor_scales=[8, 16, 32])
  File "E:\liukang\Faster-RCNN-TensorFlow-Python3.5-master\lib\nets\network.py", line 283, in create_architecture
    weights_regularizer = tf.contrib.layers.l2_regularizer(cfg.FLAGS.weight_decay)
  File "E:\Software\Python\lib\site-packages\tensorflow\python\platform\flags.py", line 84, in __getattr__
    wrapped(_sys.argv)
  File "E:\Software\Python\lib\site-packages\absl\flags\_flagvalues.py", line 633, in __call__
    name, value, suggestions=suggestions)
absl.flags._exceptions.UnrecognizedFlagError: Unknown command line flag 'net'. Did you mean: network ?

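Solution:

Create a new .py file and copy the contents of demo.py into it; that was enough to make it work. I then created a new script, temp.py, to test several images. The results are shown in Figure 5:

Figure 5: Results of running demo.py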
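Judging from the traceback, a likely cause is that the flag parser behind cfg.FLAGS reads sys.argv the first time a flag is accessed and rejects --net, which only argparse knows about. If that is what happens, an alternative to copying the script is to parse the demo's own options first and then hide them from later parsers. The sketch below shows the idea with the standard library only; hide_script_args is a hypothetical helper, and the approach is untested against the repository, so treat it as a starting point.

```python
import argparse
import sys


def parse_args(argv=None):
    """Parse the demo's own options; argv=None means sys.argv[1:]."""
    parser = argparse.ArgumentParser(description='Tensorflow Faster R-CNN demo')
    parser.add_argument('--net', dest='demo_net',
                        choices=['vgg16', 'res101'], default='vgg16')
    parser.add_argument('--dataset', dest='dataset',
                        choices=['pascal_voc', 'pascal_voc_0712'],
                        default='pascal_voc')
    return parser.parse_args(argv)


def hide_script_args():
    """Keep only the program name in sys.argv, so a flag parser that
    reads sys.argv later does not see options it does not know."""
    sys.argv = sys.argv[:1]
```

In a copy of demo.py, hide_script_args() would be called right after parse_args() and before the network is created. Running the script without --net on the command line (the defaults already point to vgg16 and pascal_voc) should avoid the error as well.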

To test on video, use the code below. A frame captured from the video is shown in Figure 6:

# -*- coding: utf-8 -*-
# --------------------------------------------------------
# Faster R-CNN
# author lk
# --------------------------------------------------------
"""
Demo script showing detections in videos.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import argparse
import os
import cv2
import tensorflow as tf
from lib.config import config as cfg
from lib.utils.test import im_detect
from lib.utils.nms_wrapper import nms
from lib.utils.timer import Timer
from lib.nets.vgg16 import vgg16
import matplotlib.pyplot as plt
import numpy as np
import sys
import time
     
(major_ver, minor_ver, subminor_ver) = (cv2.__version__).split('.')

CLASSES = ('__background__',
           'aeroplane', 'bicycle', 'bird', 'boat',
           'bottle', 'bus', 'car', 'cat', 'chair',
           'cow', 'diningtable', 'dog', 'horse',
           'motorbike', 'person', 'pottedplant',
           'sheep', 'sofa', 'train', 'tvmonitor')

NETS = {'vgg16': ('vgg16.ckpt',), 'res101': ('res101_faster_rcnn_iter_110000.ckpt',)}
DATASETS = {'pascal_voc': ('voc_2007_trainval',), 'pascal_voc_0712': ('voc_2007_trainval',)}
     
def vis_detections(im, class_name, dets, thresh=0.5):
    """Draw detected bounding boxes."""
    inds = np.where(dets[:, -1] >= thresh)[0]
    if len(inds) == 0:
        return

    im = im[:, :, (2, 1, 0)]
    fig, ax = plt.subplots(figsize=(12, 12))
    ax.imshow(im, aspect='equal')
    for i in inds:
        bbox = dets[i, :4]
        score = dets[i, -1]

        ax.add_patch(
            plt.Rectangle((bbox[0], bbox[1]),
                          bbox[2] - bbox[0],
                          bbox[3] - bbox[1], fill=False,
                          edgecolor='red', linewidth=3.5)
        )
        ax.text(bbox[0], bbox[1] - 2,
                '{:s} {:.3f}'.format(class_name, score),
                bbox=dict(facecolor='blue', alpha=0.5),
                fontsize=14, color='white')

    ax.set_title(('{} detections with '
                  'p({} | box) >= {:.1f}').format(class_name, class_name,
                                                  thresh),
                 fontsize=14)

    plt.axis('off')
    plt.tight_layout()
    plt.draw()
     
def vis_detections_video(im, class_name, dets, thresh=0.5):
    """Draw detected bounding boxes."""
    # np.where returns the indices of detections whose score is at least thresh
    inds = np.where(dets[:, -1] >= thresh)[0]

    if len(inds) == 0:
        return im

    for i in inds:
        bbox = dets[i, :4]
        score = dets[i, -1]
        # OpenCV drawing functions expect integer pixel coordinates
        cv2.rectangle(im, (int(bbox[0]), int(bbox[1])), (int(bbox[2]), int(bbox[3])), (0, 0, 255), 2)
        cv2.rectangle(im, (int(bbox[0]), int(bbox[1] - 20)), (int(bbox[0] + 200), int(bbox[1])), (10, 10, 10), -1)

        cv2.putText(im, '{:s} {:.3f}'.format(class_name, score), (int(bbox[0]), int(bbox[1] - 2)),
                    cv2.FONT_HERSHEY_SIMPLEX, .75, (0, 0, 255))

    return im
     
def demo(net, im):
    """Detect object classes in an image using pre-computed object proposals."""
    global frameRate

    # Detect all object classes and regress object bounds

    timer = Timer()
    timer.tic()
    scores, boxes = im_detect(sess, net, im)
    timer.toc()

    print('Detection took {:.3f}s for '
          '{:d} object proposals'.format(timer.total_time, boxes.shape[0]))
    frameRate = 1.0 / timer.total_time
    print('fps:' + str(float(frameRate)))

    # Visualize detections for each class
    CONF_THRESH = 0.8
    NMS_THRESH = 0.3
    for cls_ind, cls in enumerate(CLASSES[1:]):
        # because we skipped background
        cls_ind += 1
        cls_boxes = boxes[:, 4 * cls_ind:4 * (cls_ind + 1)]
        cls_scores = scores[:, cls_ind]
        dets = np.hstack((cls_boxes,
                          cls_scores[:, np.newaxis])).astype(np.float32)
        keep = nms(dets, NMS_THRESH)
        dets = dets[keep, :]
        vis_detections_video(im, cls, dets, thresh=CONF_THRESH)

    # Draw the FPS overlay and show the frame once, after all classes are drawn
    text = '{:s} {:.2f}'.format("FPS:", frameRate)
    position = (50, 50)
    cv2.putText(im, text, position, cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255))
    cv2.imshow(videoFilePath.split('/')[-1], im)
    cv2.waitKey(50)
     
def parse_args():
    """Parse input arguments."""
    parser = argparse.ArgumentParser(description='Tensorflow Faster R-CNN demo')
    parser.add_argument('--net', dest='demo_net', help='Network to use [vgg16 res101]',
                        choices=NETS.keys(), default='vgg16')
    parser.add_argument('--dataset', dest='dataset', help='Trained dataset [pascal_voc pascal_voc_0712]',
                        choices=DATASETS.keys(), default='pascal_voc')
    args = parser.parse_args()

    return args
     
if __name__ == '__main__':
    args = parse_args()

    # model path
    demonet = args.demo_net
    dataset = args.dataset
    tfmodel = os.path.join('output', demonet, DATASETS[dataset][0], 'default', NETS[demonet][0])

    if not os.path.isfile(tfmodel + '.meta'):
        print(tfmodel)
        raise IOError(('{:s} not found.\nDid you download the proper networks from '
                       'our server and place them properly?').format(tfmodel + '.meta'))

    # set config
    tfconfig = tf.ConfigProto(allow_soft_placement=True)
    tfconfig.gpu_options.allow_growth = True

    # load network
    if demonet == 'vgg16':
        net = vgg16(batch_size=1)

    else:
        raise NotImplementedError

    # init session
    sess = tf.Session(config=tfconfig)

    net.create_architecture(sess, "TEST", 21,
                            tag='default', anchor_scales=[8, 16, 32])
    saver = tf.train.Saver()
    saver.restore(sess, tfmodel)

    print('\n\nLoaded network {:s}'.format(tfmodel))

    # Warmup on a dummy image
    im = 128 * np.ones((300, 500, 3), dtype=np.uint8)
    for i in range(2):
        _, _ = im_detect(sess, net, im)

    videoFilePath = 'Camera Road 01.avi'
    videoCapture = cv2.VideoCapture(videoFilePath)

    while True:
        success, im = videoCapture.read()
        # Stop when the video ends or a frame cannot be read
        if not success:
            break
        demo(net, im)

        if cv2.waitKey(10) & 0xFF == ord('q'):
            break
    videoCapture.release()
    cv2.destroyAllWindows()
    sess.close()
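The script relies on nms from lib.utils.nms_wrapper to discard overlapping boxes of the same class before drawing. To illustrate what greedy non-maximum suppression does, here is a pure-Python sketch. It is not the repository's implementation (which may, for example, use a +1 pixel convention when computing box sizes), and the names iou and greedy_nms are made up for this example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2, ...)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(dets, thresh):
    """Return indices of boxes to keep.

    dets is a list of (x1, y1, x2, y2, score). Boxes are visited from the
    highest score down; a box is kept only if its IoU with every box kept
    so far does not exceed thresh.
    """
    order = sorted(range(len(dets)), key=lambda i: dets[i][4], reverse=True)
    keep = []
    for i in order:
        if all(iou(dets[i], dets[j]) <= thresh for j in keep):
            keep.append(i)
    return keep


dets = [(0, 0, 10, 10, 0.9), (1, 1, 11, 11, 0.8), (20, 20, 30, 30, 0.7)]
print(greedy_nms(dets, 0.3))  # prints [0, 2]
```

In the example, the second box overlaps the first with an IoU of about 0.68, above the 0.3 threshold used in the script (NMS_THRESH), so only the higher-scoring of the two survives. The third box does not overlap anything and is kept.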