<1> Background:
First, why I wrote this post. I'm mostly a GitHub code scavenger, and when I searched for code doing what the title says, it turned out to be very hard to find. After a long search through a lot of other people's code, plus my own shoddy additions (I can barely look at them myself), I finally got the feature in the title running end to end. My code is rough, but it's fine if you just want something that runs. This is my first contact with TensorFlow (I'm a PyTorch/Caffe fan, and with Keras I only know how to call things), and I haven't read many papers, so some of my understanding may be off -- please bear with me. Since I only got here by reusing a lot of other people's work, I hope this write-up helps someone else in turn; stepping on all these pitfalls gets exhausting...
<2> First, an overview of the steps involved:
1. Keras model -> TensorFlow. The model was trained with Keras (GitHub - matterport/Mask_RCNN: Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow). Some people do call Keras models from C++ directly, but when I tried, that route looked full of pitfalls, so I preferred converting the Keras model to TensorFlow and calling that from C++.
2. Calling the Mask R-CNN TensorFlow model from C++. This mainly involves building the input tensors, reading the output tensors, and working with the Eigen3 matrix library (getting comfortable with this library pays off later).
3. Caveats: whether your TensorFlow C++ library is the GPU or CPU build, and the model's batch size, must match what was used when the model was trained and saved with Python TensorFlow. For example, if the saved model's batch size is 32, use batch size 32 for C++ inference too.
<3> Environment setup:
1. OS: 64-bit Windows (I rarely program on Windows). Use 64-bit: TensorFlow 1.8, the version used here, apparently only supports 64-bit systems; I don't know about other versions. The compiler is MSVC 2015 x64 (which I find painful -- I used to write mostly on Linux and haven't adjusted), and the IDE is Qt 4.8.
2. Install the protobuf library. This is needed to configure the GPU, e.g. selecting the GPU index or controlling how VRAM is used; without it you get an error about a missing tensorflow::**protobuf** symbol (I forget the exact name). If you don't use those features you can skip it. (I installed protobuf 3.6.1; several other versions failed to build for me, which was quite a pain. If my prebuilt library doesn't work for you, search for how to build it yourself.) A sketch of the GPU configuration this enables appears after this list.
3. Install the GPU driver and CUDA (mine is 9.1). As I just said, I'm a car mechanic, so of course I keep a hammer handy... wrong movie set. The point is that I use the GPU build of the TensorFlow library and need to configure the GPU; if you use the CPU build of TensorFlow, skip steps 2 and 3.
4. The GPU build of the TensorFlow library itself, prebuilt by someone generous (GitHub - fo40225/tensorflow-windows-wheel: Tensorflow prebuilt binary for Windows); mine is 1.8.0 avx2 gpu.
5. OpenCV, version 3.3.0 (installing this yourself should be fine). It's used for reading test images and for the Mat-to-tensor conversion. If you need to read very large images, e.g. digital pathology slides, I recommend libvips (a very powerful library) or OpenSlide (nice to use from Python; I haven't tried the C++ side, since building it looks pitfall-heavy).
6. A Linux machine with TensorFlow and the Keras frontend installed, since training usually happens on a Linux server. This is used for the Keras-to-TensorFlow model conversion.
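For reference, here's the kind of GPU setup from step 2 that needs protobuf. This is only a minimal sketch based on the TF 1.8 C++ API (SessionOptions.config is a tensorflow::ConfigProto protobuf message), not my exact code:

#include <tensorflow/core/public/session.h>

tensorflow::SessionOptions options;
options.config.mutable_gpu_options()->set_visible_device_list("0");             // pick GPU #0
options.config.mutable_gpu_options()->set_allow_growth(true);                   // grab VRAM on demand
options.config.mutable_gpu_options()->set_per_process_gpu_memory_fraction(0.5); // cap VRAM at 50%
tensorflow::Session *session = nullptr;
tensorflow::Status status = tensorflow::NewSession(options, &session);

Without protobuf installed, the ConfigProto-related symbols above are exactly the ones that fail to resolve.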
<4> Into the pitfalls
- Convert the Keras model to a TensorFlow model on the machine used for training:
- Download and install, following their instructions, the Keras-frontend Mask R-CNN (GitHub - matterport/Mask_RCNN: Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow) and, for converting the Keras model to TensorFlow, GitHub - parai/Mask_RCNN: Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow
- Save the trained model: first modify matterport's Mask_RCNN/samples/coco/coco.py, in a few steps:
- 1
- 2 Modify the inference config. This step matters: GPU_COUNT and IMAGES_PER_GPU must match the CocoConfig in parai's Mask_RCNN-master/samples/demo.py, or the conversion breaks. IMAGES_PER_GPU is the number of images predicted at once; mine is 32 (batch prediction) -- on a 1080 Ti I measured that 32 images of 512*512 fit.
- 3 Modify the number of classes
- 4 Run coco.py; this saves the Keras model (architecture + weights). My file is mask_rcnn_whole_batch32_new20.h5
- Below is my CocoConfig. It's mostly unchanged; the main edits are IMAGE_MIN_DIM = 512 and IMAGE_MAX_DIM = 512
class CocoConfig(Config):
    """Configuration for training on MS COCO.
    Derives from the base Config class and overrides values specific
    to the COCO dataset.
    """
    # Give the configuration a recognizable name
    NAME = "coco"

    # # We use a GPU with 12GB memory, which can fit two images.
    # # Adjust down if you use a smaller GPU.
    # IMAGES_PER_GPU = 2

    # Uncomment to train on 8 GPUs (default is 1)
    # GPU_COUNT = 8

    # Number of classes (including background)
    # NUM_CLASSES = 1 + 6  # COCO has 80 classes
    NUM_CLASSES = 1 + 20  # COCO has 80 classes

    # NUMBER OF GPUs to use. For CPU training, use 1
    GPU_COUNT = 1

    # Number of images to train with on each GPU. A 12GB GPU can typically
    # handle 2 images of 1024x1024px.
    # Adjust based on your GPU memory and image sizes. Use the highest
    # number that your GPU can handle for best performance.
    IMAGES_PER_GPU = 2

    # Number of training steps per epoch
    # This doesn't need to match the size of the training set. Tensorboard
    # updates are saved at the end of each epoch, so setting this to a
    # smaller number means getting more frequent TensorBoard updates.
    # Validation stats are also calculated at each epoch end and they
    # might take a while, so don't set this too small to avoid spending
    # a lot of time on validation stats.
    STEPS_PER_EPOCH = 2000  # 16962

    # Number of validation steps to run at the end of every training epoch.
    # A bigger number improves accuracy of validation stats, but slows
    # down the training.
    VALIDATION_STEPS = 4241

    # Backbone network architecture
    # Supported values are: resnet50, resnet101.
    # You can also provide a callable that should have the signature
    # of model.resnet_graph. If you do so, you need to supply a callable
    # to COMPUTE_BACKBONE_SHAPE as well
    BACKBONE = "resnet101"

    # Only useful if you supply a callable to BACKBONE. Should compute
    # the shape of each layer of the FPN Pyramid.
    # See model.compute_backbone_shapes
    COMPUTE_BACKBONE_SHAPE = None

    # The strides of each layer of the FPN Pyramid. These values
    # are based on a Resnet101 backbone.
    BACKBONE_STRIDES = [4, 8, 16, 32, 64]

    # Size of the fully-connected layers in the classification graph
    FPN_CLASSIF_FC_LAYERS_SIZE = 1024

    # Size of the top-down layers used to build the feature pyramid
    TOP_DOWN_PYRAMID_SIZE = 256

    # Length of square anchor side in pixels
    RPN_ANCHOR_SCALES = (32, 64, 128, 256, 512)

    # Ratios of anchors at each cell (width/height)
    # A value of 1 represents a square anchor, and 0.5 is a wide anchor
    RPN_ANCHOR_RATIOS = [0.5, 1, 2]

    # Anchor stride
    # If 1 then anchors are created for each cell in the backbone feature map.
    # If 2, then anchors are created for every other cell, and so on.
    RPN_ANCHOR_STRIDE = 1

    # Non-max suppression threshold to filter RPN proposals.
    # You can increase this during training to generate more proposals.
    RPN_NMS_THRESHOLD = 0.8

    # How many anchors per image to use for RPN training
    RPN_TRAIN_ANCHORS_PER_IMAGE = 256

    # ROIs kept after tf.nn.top_k and before non-maximum suppression
    PRE_NMS_LIMIT = 2000

    # ROIs kept after non-maximum suppression (training and inference)
    POST_NMS_ROIS_TRAINING = 2000
    POST_NMS_ROIS_INFERENCE = 1000

    # If enabled, resizes instance masks to a smaller size to reduce
    # memory load. Recommended when using high-resolution images.
    USE_MINI_MASK = True
    MINI_MASK_SHAPE = (56, 56)  # (height, width) of the mini-mask

    # Input image resizing
    # Generally, use the "square" resizing mode for training and predicting
    # and it should work well in most cases. In this mode, images are scaled
    # up such that the small side is = IMAGE_MIN_DIM, but ensuring that the
    # scaling doesn't make the long side > IMAGE_MAX_DIM. Then the image is
    # padded with zeros to make it a square so multiple images can be put
    # in one batch.
    # Available resizing modes:
    # none:   No resizing or padding. Return the image unchanged.
    # square: Resize and pad with zeros to get a square image
    #         of size [max_dim, max_dim].
    # pad64:  Pads width and height with zeros to make them multiples of 64.
    #         If IMAGE_MIN_DIM or IMAGE_MIN_SCALE are not None, then it scales
    #         up before padding. IMAGE_MAX_DIM is ignored in this mode.
    #         The multiple of 64 is needed to ensure smooth scaling of feature
    #         maps up and down the 6 levels of the FPN pyramid (2**6=64).
    # crop:   Picks random crops from the image. First, scales the image based
    #         on IMAGE_MIN_DIM and IMAGE_MIN_SCALE, then picks a random crop of
    #         size IMAGE_MIN_DIM x IMAGE_MIN_DIM. Can be used in training only.
    #         IMAGE_MAX_DIM is not used in this mode.
    IMAGE_RESIZE_MODE = "square"
    IMAGE_MIN_DIM = 512  # my images are 512*512
    IMAGE_MAX_DIM = 512

    # Minimum scaling ratio. Checked after MIN_IMAGE_DIM and can force further
    # up scaling. For example, if set to 2 then images are scaled up to double
    # the width and height, or more, even if MIN_IMAGE_DIM doesn't require it.
    # However, in 'square' mode, it can be overruled by IMAGE_MAX_DIM.
    IMAGE_MIN_SCALE = 0

    # Image mean (RGB)
    MEAN_PIXEL = np.array([123.7, 116.8, 103.9])

    # Number of ROIs per image to feed to classifier/mask heads
    # The Mask RCNN paper uses 512 but often the RPN doesn't generate
    # enough positive proposals to fill this and keep a positive:negative
    # ratio of 1:3. You can increase the number of proposals by adjusting
    # the RPN NMS threshold.
    TRAIN_ROIS_PER_IMAGE = 200

    # Percent of positive ROIs used to train classifier/mask heads
    ROI_POSITIVE_RATIO = 0.33

    # Pooled ROIs
    POOL_SIZE = 7
    MASK_POOL_SIZE = 14

    # Shape of output mask
    # To change this you also need to change the neural network mask branch
    MASK_SHAPE = [28, 28]

    # Maximum number of ground truth instances to use in one image
    MAX_GT_INSTANCES = 100

    # Bounding box refinement standard deviation for RPN and final detections.
    RPN_BBOX_STD_DEV = np.array([0.1, 0.1, 0.2, 0.2])
    BBOX_STD_DEV = np.array([0.1, 0.1, 0.2, 0.2])

    # Max number of final detections
    DETECTION_MAX_INSTANCES = 100

    # Minimum probability value to accept a detected instance
    # ROIs below this threshold are skipped
    DETECTION_MIN_CONFIDENCE = 0.8

    # Non-maximum suppression threshold for detection
    DETECTION_NMS_THRESHOLD = 0.3

    # Learning rate and momentum
    # The Mask RCNN paper uses lr=0.02, but on TensorFlow it causes
    # weights to explode. Likely due to differences in optimizer
    # implementation.
    LEARNING_RATE = 0.001
    LEARNING_MOMENTUM = 0.9

    # Weight decay regularization
    WEIGHT_DECAY = 0.0001

    # Loss weights for more precise optimization.
    # Can be used for R-CNN training setup.
    LOSS_WEIGHTS = {
        "rpn_class_loss": 1.,
        "rpn_bbox_loss": 1.,
        "mrcnn_class_loss": 1.,
        "mrcnn_bbox_loss": 1.,
        "mrcnn_mask_loss": 1.
    }

    # Use RPN ROIs or externally generated ROIs for training
    # Keep this True for most situations. Set to False if you want to train
    # the head branches on ROI generated by code rather than the ROIs from
    # the RPN. For example, to debug the classifier head without having to
    # train the RPN.
    USE_RPN_ROIS = True

    # Train or freeze batch normalization layers
    #     None: Train BN layers. This is the normal mode
    #     False: Freeze BN layers. Good when using a small batch size
    #     True: (don't use). Set layer in training mode even when predicting
    TRAIN_BN = False  # Defaulting to False since batch size is often small

    # Gradient norm clipping
    GRADIENT_CLIP_NORM = 5.0
- Convert the Keras model to a TensorFlow model. The main task is making a few parameters in parai's Mask_RCNN-master/samples/demo.py match those in matterport's Mask_RCNN/samples/coco/coco.py:
- Modify the class count and image size in the CocoConfig of parai's Mask_RCNN-master/samples/demo.py
- Modify the InferenceConfig and some paths in parai's Mask_RCNN-master/samples/demo.py
- Modify parai's Mask_RCNN-master/scripts/export_model.py
- Finally, run parai's Mask_RCNN-master/samples/demo.py to get the converted TensorFlow model; mine is mask_rcnn_batch32_new20.pb. You can test the pb model with parai's Mask_RCNN-master/infere_from_pb.py. That completes the first stage.
- Call the Mask R-CNN TensorFlow model from C++. This step is mostly a series of operations on tensorflow::Tensor, Eigen::Tensor (Eigen3), and friends.
- First, look at the input tensors Mask R-CNN needs. In parai's Mask_RCNN-master/infere_from_pb.py you can see three inputs, the keys img_ph, img_meta_ph and img_anchors_ph, whose values are molded_images, image_metas and image_anchors respectively; the corresponding tensorflow::Tensor names are input_image_1, input_image_meta_1 and input_anchors_1. So we need to build these three tensorflow::Tensor objects in C++ (a minimal load-and-run sketch follows below). See def mold_inputs(images) in infere_from_pb.py for where molded_images and image_metas come from, and def get_anchors(image_shape, config) for image_anchors. Note: from here on, qualify tensor operations with the 'tensorflow::' scope operator, because Eigen3 also has a Tensor type and they are easy to confuse.
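Before building those tensors, here's a minimal sketch (my own assumptions, not the repo's exact code) of loading the converted .pb and feeding the three inputs. The output node names here are placeholders -- verify them in your exported graph with a graph viewer:

#include <tensorflow/core/public/session.h>
#include <tensorflow/core/platform/env.h>
#include <tensorflow/core/framework/tensor.h>

tensorflow::GraphDef graph_def;
// read the frozen graph from disk
tensorflow::Status s = tensorflow::ReadBinaryProto(tensorflow::Env::Default(),
                                                   "mask_rcnn_batch32_new20.pb", &graph_def);
tensorflow::Session *session = nullptr;
s = tensorflow::NewSession(tensorflow::SessionOptions(), &session);
s = session->Create(graph_def);

std::vector<tensorflow::Tensor> output_tensors;
s = session->Run({{"input_image_1",      resized_tensor},       // molded_images
                  {"input_image_meta_1", inputMetadataTensor},  // image_metas
                  {"input_anchors_1",    inputAnchorsTensor}},  // image_anchors
                 {"output_detections", "output_mrcnn_mask"},    // assumed output names -- check your .pb
                 {}, &output_tensors);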
- Build the tensorflow::Tensor for input_image_1. The input_image_1 key corresponds to the value molded_images, which is simply the images converted into a tensorflow::Tensor; this step is cv::Mat -> tensor. Since I run batch prediction, the images live in a std::vector of cv::Mat, but not every batch is full, so there is an extra imgNum_actual parameter controlling how many cv::Mat entries are actually converted:
void detectBatch::CVMats_to_Tensor(std::vector<cv::Mat> &imgs, tensorflow::Tensor *input_tensor, size_t &imgNum_actual)
{
    /*
    *Function: CVMats_to_Tensor
    *Description: copy a vector of cv::Mat images into a tensorflow::Tensor
    *Calls:     1. ****
    *Called By: 1. ****
    *InputList:
    *   1. imgs           vector holding the cv::Mat images     std::vector<cv::Mat>&
    *   2. input_tensor   tensor that receives the data         tensorflow::Tensor*
    *   3. imgNum_actual  number of cv::Mat images to convert   size_t&
    *OutPut: 1. NULL
    */
    auto outputMap = input_tensor->tensor<float,4>(); // get the tensor map; note outputMap is of Eigen::Tensor type
    for(size_t b = 0; b < imgNum_actual; b++)           // iterate over the images
    {
        for(int r = 0; r < outputMap.dimension(1); r++) // iterate over rows
        {
            for(int c = 0; c < outputMap.dimension(2); c++) // iterate over columns
            {
                // note that an OpenCV Mat image's channel order is B G R
                // subtract the mean pixel value
                outputMap(b,r,c,0) = imgs[b].at<cv::Vec3b>(r,c)[2] - MEAN_PIXEL[0]; // R
                outputMap(b,r,c,1) = imgs[b].at<cv::Vec3b>(r,c)[1] - MEAN_PIXEL[1]; // G
                outputMap(b,r,c,2) = imgs[b].at<cv::Vec3b>(r,c)[0] - MEAN_PIXEL[2]; // B
            }
        }
    }
}
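A usage sketch for the function above (member names assumed to match the rest of this post): allocate the NHWC float tensor once with the model's fixed batch size, then fill only the slots that are actually occupied; when the last batch is short, imgNum_actual is smaller than batch_size and the remaining slots simply keep stale/zero data:

// assumed members: batch_size, input_height, input_width, input_channels
tensorflow::Tensor resized_tensor(tensorflow::DT_FLOAT,
    tensorflow::TensorShape({batch_size, input_height, input_width, input_channels}));
size_t imgNum_actual = batchImgs.size(); // batchImgs: std::vector<cv::Mat>, each 512*512 BGR
CVMats_to_Tensor(batchImgs, &resized_tensor, imgNum_actual);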
- Build the tensorflow::Tensor for input_image_meta_1. First, what is input_image_meta_1? It corresponds to the img_meta_ph key in infere_from_pb.py, whose value is image_metas. From def mold_inputs(images) you can see that image_metas is just a wrapper around per-image info such as height, width and channel count. The image_metas data is a 2-D list of shape N x (length of meta data): the first dimension N is the prediction batch_size, and the second is one image's meta data. Since every image fed into a batch has the same size, we only need to build one meta row and copy it to form the N x (length of meta data) list:
def mold_inputs(images):
    """Takes a list of images and modifies them to the format expected
    as an input to the neural network.
    images: List of image matrices [height,width,depth]. Images can have
        different sizes.

    Returns 3 Numpy matrices:
    molded_images: [N, h, w, 3]. Images resized and normalized.
    image_metas: [N, length of meta data]. Details about each image.
        # so image_metas is an N x (length of meta data) list, where N is the prediction batch_size
    windows: [N, (y1, x1, y2, x2)]. The portion of the image that has the
        original image (padding excluded).
    """
    molded_images = []
    image_metas = []
    windows = []
    for image in images:
        # Resize image to fit the model expected size
        # TODO: move resizing to mold_image()
        molded_image, window, scale, padding, corp = utils.resize_image(
            image,
            min_dim=inference_config.IMAGE_MIN_DIM,
            min_scale=inference_config.IMAGE_MIN_SCALE,
            max_dim=inference_config.IMAGE_MAX_DIM,
            mode=inference_config.IMAGE_RESIZE_MODE)
        print(image.shape)
        print('Image resized at: ', molded_image.shape)
        print(window)
        print(scale)
        # Takes RGB images with 0-255 values and subtracts the mean pixel
        # and converts it to float. Expects image colors in RGB order.
        molded_image = mold_image(molded_image, inference_config)
        print('Image molded')
        #print(a)
        # Takes attributes of an image and puts them in one 1D array.
        inference_config.NUM_CLASSES = 81
        # the function below builds image_meta -- we just need to imitate it
        image_meta = compose_image_meta(
            0, image.shape, molded_image.shape, window, scale,
            np.zeros([inference_config.NUM_CLASSES], dtype=np.int32))
        print('Meta of image prepared')
        image_anchor = []
        # TODO
        # Append
        molded_images.append(molded_image)
        windows.append(window)
        image_metas.append(image_meta)
    # Pack into arrays
    molded_images = np.stack(molded_images)
    image_metas = np.stack(image_metas)
    windows = np.stack(windows)
    return molded_images, image_metas, windows
The real work in the function above is done by compose_image_meta; here is its implementation:
def compose_image_meta(image_id, original_image_shape, image_shape,
                       window, scale, active_class_ids):
    """Takes attributes of an image and puts them in one 1D array.

    image_id: An int ID of the image. Useful for debugging.
    original_image_shape: [H, W, C] before resizing or padding.
    image_shape: [H, W, C] after resizing and padding
    window: (y1, x1, y2, x2) in pixels. The area of the image where the real
            image is (excluding the padding)
    scale: The scaling factor applied to the original image (float32)
    active_class_ids: List of class_ids available in the dataset from which
        the image came. Useful if training on images from multiple datasets
        where not all classes are present in all datasets.
    """
    meta = np.array(
        [image_id] +                  # size=1
        list(original_image_shape) +  # size=3
        list(image_shape) +           # size=3
        list(window) +                # size=4 (y1, x1, y2, x2) in image coordinates
        [scale] +                     # size=1
        list(active_class_ids)        # size=num_classes
    )
    return meta
So the meta data layout is:
[image_id] +                  # size=1; length 1, can simply be set to 0
list(original_image_shape) +  # size=3; the original image's h, w, c
list(image_shape) +           # size=3; the resized image's h, w, c. Since I crop my images beforehand, original_image_shape and image_shape are actually identical here
list(window) +                # size=4; (y1, x1, y2, x2) in image coordinates, the display window. x1, y1 are 0 and x2, y2 are the window size; since the images are pre-cropped, x2, y2 equal the image's h, w. I took a shortcut here -- I didn't want to map predicted coordinates back onto a window or the original image afterwards
[scale] +                     # size=1; the scaling ratio: long side after resize / long side before resize. As stated above, my images are unchanged by resizing, so this is in fact 1
list(active_class_ids)        # list over classes (background included); per the logic of def mold_inputs(images), all elements are set to 0
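Adding this up (my own arithmetic, as a sanity check): the meta length is 1 + 3 + 3 + 4 + 1 + num_classes = 12 + num_classes. With NUM_CLASSES = 1 + 20 = 21 as in my config, the TF_MASKRCNN_IMAGE_METADATA_LENGTH used in the C++ code below should be 33.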
The corresponding C++ code:
void detectBatch::compose_image_meta()
{
    /*
    *Function: compose_image_meta
    *Description: compute the image meta data
    *Calls:     1. ****
    *Called By: 1. ****
    *InputList: 1. NULL
    *OutPut:    1. NULL
    */
    int imglongSide, inputlongSide;
    image_meta[0] = 0;
    // original_image_shape: [H, W, C] before resizing or padding.
    image_meta[1] = inputImg_h;
    image_meta[2] = inputImg_w;
    image_meta[3] = inputImg_c;
    imglongSide = image_meta[1] >= image_meta[2] ? image_meta[1] : image_meta[2];
    // image_shape: [H, W, C] after resizing and padding
    image_meta[4] = input_height;
    image_meta[5] = input_width;
    image_meta[6] = input_channels;
    inputlongSide = image_meta[4] >= image_meta[5] ? image_meta[4] : image_meta[5];
    // window: (y1, x1, y2, x2) in pixels. The area of the image where the real image is (excluding the padding)
    image_meta[7] = 0;
    image_meta[8] = 0;
    image_meta[9] = input_height; // my images are cropped before being fed in, so the window size equals the actual image size
    image_meta[10] = input_width;
    // scale: The scaling factor applied to the original image (float32)
    image_meta[11] = float(inputlongSide) / imglongSide; // cast to float to avoid integer division (it's 1 in my case anyway)
    // active_class_ids: List of class_ids available in the dataset from which the image came.
    for(int i = TF_MASKRCNN_IMAGE_METADATA_LENGTH - num_classes; i < TF_MASKRCNN_IMAGE_METADATA_LENGTH; i++)
    {
        image_meta[i] = 0;
    }
    inputMetadataTensor = tensorflow::Tensor(tensorflow::DT_FLOAT, {batch_size, TF_MASKRCNN_IMAGE_METADATA_LENGTH});
    auto inputMetadataTensorMap = inputMetadataTensor.tensor<float,2>();
    // all images in the batch have the same size, so one meta row is copied batch_size times
    for(int j = 0; j < batch_size; j++)
    {
        for(int i = 0; i < TF_MASKRCNN_IMAGE_METADATA_LENGTH; i++)
        {
            //std::cout << "image_meta[" << i << "] is " << image_meta[i] << std::endl;
            inputMetadataTensorMap(j, i) = image_meta[i];
        }
    }
}
- Build the tensorflow::Tensor for input_anchors_1. First, what is input_anchors_1? It corresponds to the img_anchors_ph key in infere_from_pb.py, whose value is image_anchors; def get_anchors(image_shape, config) shows where image_anchors comes from. This step is the most involved and also the most important: understanding this code helps a lot with understanding Mask R-CNN itself.
From infere_from_pb.py, look at the def get_anchors(image_shape, config) function:

def get_anchors(image_shape, config):
    """Returns anchor pyramid for the given image size."""
    # compute the shape (height, width) of the feature map produced by each
    # backbone stage (the map shrinks through pooling/conv down-sampling;
    # I did not look into the details of that part)
    backbone_shapes = compute_backbone_shapes(config, image_shape)
    # Cache anchors and reuse if image shape is the same
    _anchor_cache = {}
    if not tuple(image_shape) in _anchor_cache:
        # If the anchors for an image of the same size were computed before,
        # skip the recomputation and take the cached ones from _anchor_cache.
        # A nice touch by the author that I'd never thought of. Since my
        # inputs are pre-cropped 512*512 images, I compute the anchors once
        # and never need the check or a recomputation afterwards; this should
        # save time on very large images, though I haven't benchmarked it.
        # Generate Anchors
        a = utils.generate_pyramid_anchors(
            config.RPN_ANCHOR_SCALES,
            config.RPN_ANCHOR_RATIOS,
            backbone_shapes,
            config.BACKBONE_STRIDES,
            config.RPN_ANCHOR_STRIDE)
        # Keep a copy of the latest anchors in pixel coordinates because
        # it's used in inspect_model notebooks.
        # TODO: Remove this after the notebook are refactored to not use it
        anchors = a
        # Normalize coordinates
        _anchor_cache[tuple(image_shape)] = utils.norm_boxes(a, image_shape[:2])
    return _anchor_cache[tuple(image_shape)]
- From the code above, the anchors are produced mainly by utils.generate_pyramid_anchors and its five parameters. Focus on the backbone_shapes parameter and on utils.generate_pyramid_anchors itself; the other four parameters are just plain arrays/lists, as you can see in class CocoConfig(Config).
- First, where backbone_shapes comes from: it is returned by compute_backbone_shapes(config, image_shape), whose code appears below.
Looking at the code below: the image_shape parameter is the size of the (resized) input image, mainly its row and column counts (height and width). The return value is an [N, (height, width)] array, where N is the number of backbone stages (a stage here means a feature-map level from which anchors are extracted, one level per stage; as the Mask R-CNN paper shows, anchors are taken from several feature-map levels of the backbone -- five in the paper's setup, which matches the five entries of BACKBONE_STRIDES = [4, 8, 16, 32, 64] in class CocoConfig(Config)). The divisions image_shape[0] / stride etc. simply compute the feature map's height and width at that stage, so BACKBONE_STRIDES = [4, 8, 16, 32, 64] is the factor by which the image has shrunk by the time it reaches each stage. For example, a 512*512 image becomes 128*128 after the first stage, then 64*64, 32*32, 16*16, 8*8 after the later ones, so the returned [N, (height, width)] array is [[128,128], [64,64], [32,32], [16,16], [8,8]].

def compute_backbone_shapes(config, image_shape):
    """Computes the width and height of each stage of the backbone network.

    Returns:
        [N, (height, width)]. Where N is the number of stages
    """
    # Currently supports ResNet only
    assert config.BACKBONE in ["resnet50", "resnet101"]
    return np.array(
        [[int(math.ceil(image_shape[0] / stride)),
          int(math.ceil(image_shape[1] / stride))]
         for stride in config.BACKBONE_STRIDES])
The C++ code that produces the [N, (height, width)] array:

float BACKBONE_STRIDES[5] = {4, 8, 16, 32, 64}; // used to compute the feature map size after each backbone stage (the map shrinks via pooling/conv down-sampling; I did not look into the details)
int backbone_strides_num = 5;
int backbone_shape[5][2]; // for backbone_shape
for(int i = 0; i < backbone_strides_num; i++)
{
    backbone_shape[i][0] = ceil(inputImg_h / BACKBONE_STRIDES[i]);
    backbone_shape[i][1] = ceil(inputImg_w / BACKBONE_STRIDES[i]);
}
- Second, what utils.generate_pyramid_anchors does. The function exists in both matterport's and parai's Mask_RCNN/mrcnn/utils.py, and the two are identical; since we installed matterport's version first, read that one:
def generate_pyramid_anchors(scales, ratios, feature_shapes, feature_strides,
                             anchor_stride):
    """Generate anchors at different levels of a feature pyramid. Each scale
    is associated with a level of the pyramid, but each ratio is used in
    all levels of the pyramid.

    Returns:
    anchors: [N, (y1, x1, y2, x2)]. All generated anchors in one array. Sorted
        with the same order of the given scales. So, anchors of scale[0] come
        first, then anchors of scale[1], and so on.
    """
    # Anchors
    # [anchor_count, (y1, x1, y2, x2)]
    anchors = []
    for i in range(len(scales)):
        anchors.append(generate_anchors(scales[i], ratios, feature_shapes[i],
                                        feature_strides[i], anchor_stride))
    return np.concatenate(anchors, axis=0)
The real work inside it is done by generate_anchors, also in the utils file:

def generate_anchors(scales, ratios, shape, feature_stride, anchor_stride):
    """
    scales: 1D array of anchor sizes in pixels. Example: [32, 64, 128]
    ratios: 1D array of anchor ratios of width/height. Example: [0.5, 1, 2]
    shape: [height, width] spatial shape of the feature map over which
            to generate anchors.
    feature_stride: Stride of the feature map relative to the image in pixels.
    anchor_stride: Stride of anchors on the feature map. For example, if the
        value is 2 then generate anchors for every other feature map pixel.
    """
    # Get all combinations of scales and ratios
    # In the config they are:
    #   RPN_ANCHOR_SCALES = (32, 64, 128, 256, 512)
    #   RPN_ANCHOR_RATIOS = [0.5, 1, 2]
    # Look up np.meshgrid(x1, x2): it pairs the elements of x1 and x2
    # (position-wise) and returns a pair of matrices. Suppose the outer loop
    # passed in scales = RPN_ANCHOR_SCALES[0]:
    scales, ratios = np.meshgrid(np.array(scales), np.array(ratios))
    # After np.meshgrid:
    #   scales = array([[32], [32], [32]])
    #   ratios = array([[0.5], [1], [2]])

    # Flatten scales and ratios; e.g. afterwards scales = [32, 32, 32]
    scales = scales.flatten()
    ratios = ratios.flatten()

    # Scale the anchors by the ratios to get the widths and heights of every
    # scale/ratio combination.
    # Enumerate heights and widths from scales and ratios
    heights = scales / np.sqrt(ratios)
    widths = scales * np.sqrt(ratios)

    # Compute the offsets (y, x), i.e. the anchor center positions.
    # shape[0], shape[1] are the feature map height and width (after the
    # down-scaling by config.BACKBONE_STRIDES = {4, 8, 16, 32, 64} above).
    # anchor_stride is 1 in the config, i.e. the anchors are spaced 1 apart.
    # feature_stride iterates over config.BACKBONE_STRIDES = {4, 8, 16, 32, 64};
    # multiplying by feature_stride maps np.arange(0, shape[0], anchor_stride)
    # back to coordinates in the original image (512*512 in my case).
    # Enumerate shifts in feature space
    shifts_y = np.arange(0, shape[0], anchor_stride) * feature_stride
    shifts_x = np.arange(0, shape[1], anchor_stride) * feature_stride
    # Pair up the center positions to get every combination of bbox centers
    # (positions only; the real pairing happens below).
    shifts_x, shifts_y = np.meshgrid(shifts_x, shifts_y)

    # Enumerate combinations of shifts, widths, and heights
    # Pairing widths/heights with the center x/y coordinates gives every
    # combination of bbox center and size. A sketch of the principle,
    # following the meshgrid behaviour above: widths is a 1-D array, say
    # widths = [e, f, g], and shifts_x = [[0,1], [2,3], [4,5]]. Then after
    # np.meshgrid(widths, shifts_x):
    #   box_widths = [[e,f,g],
    #                 [e,f,g],
    #                 [e,f,g],
    #                 [e,f,g],
    #                 [e,f,g],
    #                 [e,f,g]]           # shape 6x3
    #   box_centers_x = [[0,0,0],
    #                    [1,1,1],
    #                    [2,2,2],
    #                    [3,3,3],
    #                    [4,4,4],
    #                    [5,5,5]]
    # i.e. every x is paired with every width, and likewise every y with
    # every height (positions only).
    box_widths, box_centers_x = np.meshgrid(widths, shifts_x)
    box_heights, box_centers_y = np.meshgrid(heights, shifts_y)

    # Next comes np.stack (see the NumPy docs, or
    # https://blog.csdn.net/wgx571859177/article/details/80987459 -- trying
    # it yourself makes its behaviour much clearer).
    # Suppose box_centers_x has the 6x3 shape above and box_centers_y has the
    # same shape (say with elements 6,7,8,9,10,11). After the np.stack below,
    # box_centers has shape 6x3x2 -- this is the real pairing; read it as
    # 6x(3x2):
    # box_centers = [ [[0,6],[0,6],[0,6]],
    #                 [[1,7],[1,7],[1,7]],
    #                 [[2,8],[2,8],[2,8]],
    #                 [[3,9],[3,9],[3,9]],
    #                 [[4,10],[4,10],[4,10]],
    #                 [[5,11],[5,11],[5,11]] ]
    # After reshape([-1, 2]) it becomes [[0,6],[0,6],[0,6],[1,7],...] with
    # shape 18x2. box_sizes is built the same way, holding (height, width).
    # Reshape to get a list of (y, x) and a list of (h, w)
    box_centers = np.stack(
        [box_centers_y, box_centers_x], axis=2).reshape([-1, 2])
    box_sizes = np.stack([box_heights, box_widths], axis=2).reshape([-1, 2])

    # box_centers - 0.5 * box_sizes and box_centers + 0.5 * box_sizes give the
    # top-left and bottom-right corner coordinates; after
    # np.concatenate(axis=1) boxes has shape 18x4, in the form
    # 18x(y1, x1, y2, x2), where 18 is the anchor count (a made-up number for
    # this walkthrough; the real count follows from the config and the math
    # above) and (y1, x1, y2, x2) describes each anchor.
    # Convert to corner coordinates (y1, x1, y2, x2)
    boxes = np.concatenate([box_centers - 0.5 * box_sizes,
                            box_centers + 0.5 * box_sizes], axis=1)
    return boxes
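A sanity check I find useful (my own arithmetic, not from the repo): with RPN_ANCHOR_STRIDE = 1 each pyramid level contributes shape[0] * shape[1] * 3 anchors (3 ratios per cell), so for a 512*512 input the total is 3 * (128² + 64² + 32² + 16² + 8²) = 3 * 21824 = 65472 anchors. The finalBox matrix built in the C++ code below should therefore end up with 65472 rows.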
The corresponding C++ code for this part:
int finalBoxesRows = 0; // total row count of the boxes over all five RPN_ANCHOR_SCALES levels; you can ignore this for now

// generate_pyramid_anchors
// generate the anchors for each scale (5 scales in the config)
for(int j = 0; j < rpn_anchor_scales_num; j++)
{
    // generate_anchors
    // Get all combinations of scales and ratios
    Eigen::RowVectorXf scalesVec(1); // holds the current element of RPN_ANCHOR_SCALES[5]={32, 64, 128, 256, 512}, used to fill scalesMat
    Eigen::VectorXf ratiosVec(rpn_anchor_ratios_num);
    Eigen::MatrixXf scalesMat = Eigen::MatrixXf(rpn_anchor_ratios_num, 1);
    Eigen::MatrixXf ratiosMat = Eigen::MatrixXf(rpn_anchor_ratios_num, 1);
    Eigen::MatrixXf heightsMat;
    Eigen::MatrixXf widthsMat;

    // the following implements the Python line:
    /*
    scales, ratios = np.meshgrid(np.array(scales), np.array(ratios))
    */
    scalesVec(0) = (RPN_ANCHOR_SCALES[j]);
    // build np.array(ratios)
    for(int i = 0; i < rpn_anchor_ratios_num; i++)
    {
        ratiosVec(i) = RPN_ANCHOR_RATIOS[i];
    }
    for(int i = 0; i < ratiosMat.cols(); i++)
    {
        ratiosMat.col(i) << ratiosVec;
    }
    // build np.array(scales)
    for(int i = 0; i < scalesMat.rows(); i++)
    {
        scalesMat.row(i) << scalesVec;
    }

    // build heights and widths. In Python these are vectors of length 3, but
    // here they become 3x1 matrices for the element-wise operations below.
    // Python:
    /*
    heights = scales / np.sqrt(ratios)
    widths = scales * np.sqrt(ratios)
    */
    // Enumerate heights and widths from scales and ratios
    heightsMat = scalesMat.cwiseQuotient(ratiosMat.cwiseSqrt());
    widthsMat = scalesMat.cwiseProduct(ratiosMat.cwiseSqrt());

    // build shifts_x, shifts_y. Python:
    /*
    shifts_y = np.arange(0, shape[0], anchor_stride) * feature_stride
    shifts_x = np.arange(0, shape[1], anchor_stride) * feature_stride
    shifts_x, shifts_y = np.meshgrid(shifts_x, shifts_y)
    */
    // Enumerate shifts in feature space
    // first: shifts_y = np.arange(0, shape[0], anchor_stride) * feature_stride
    //        shifts_x = np.arange(0, shape[1], anchor_stride) * feature_stride
    int step = RPN_ANCHOR_STRIDE, low = 0, hight_y = backbone_shape[j][0], hight_x = backbone_shape[j][1]; // shape[0], shape[1], anchor_stride
    Eigen::RowVectorXf shifts_y; // row vector
    Eigen::RowVectorXf shifts_x;
    int realsize_y = ((hight_y - low) / step);
    int realsize_x = ((hight_x - low) / step);
    shifts_y.setLinSpaced(realsize_y, low, low + step * (realsize_y - 1));
    shifts_x.setLinSpaced(realsize_x, low, low + step * (realsize_x - 1));
    shifts_y *= BACKBONE_STRIDES[j]; // feature_stride, i.e. the BACKBONE_STRIDES[j] passed in by the outer loop of the Python code
    shifts_x *= BACKBONE_STRIDES[j]; // same

    /* then shifts_x, shifts_y = np.meshgrid(shifts_x, shifts_y):
       build the final shifts_x, shifts_y matrices; note that after
       np.meshgrid they are 2-D matrices */
    Eigen::MatrixXf shifts_xMat(shifts_y.cols(), shifts_x.cols()), shifts_yMat(shifts_y.cols(), shifts_x.cols());
    for(int i = 0; i < shifts_xMat.rows(); i++)
    {
        shifts_xMat.row(i) = shifts_x;
    }
    for(int i = 0; i < shifts_yMat.cols(); i++)
    {
        shifts_yMat.col(i) = shifts_y;
    }

    // now the Python lines:
    /*
    box_widths, box_centers_x = np.meshgrid(widths, shifts_x)
    box_heights, box_centers_y = np.meshgrid(heights, shifts_y)
    # Reshape to get a list of (y, x) and a list of (h, w)
    box_centers = np.stack([box_centers_y, box_centers_x], axis=2).reshape([-1, 2])
    box_sizes = np.stack([box_heights, box_widths], axis=2).reshape([-1, 2])
    # Convert to corner coordinates (y1, x1, y2, x2)
    boxes = np.concatenate([box_centers - 0.5 * box_sizes,
                            box_centers + 0.5 * box_sizes], axis=1)
    return boxes
    */
    // Enumerate combinations of shifts, widths, and heights
    // first: box_widths, box_centers_x = np.meshgrid(widths, shifts_x)
    //        box_heights, box_centers_y = np.meshgrid(heights, shifts_y)

    // flatten heightsMat, widthsMat into row vectors for easier assignment
    Eigen::RowVectorXf heightsMatFlat(Eigen::Map<Eigen::VectorXf>(heightsMat.data(), heightsMat.rows() * heightsMat.cols()));
    Eigen::RowVectorXf widthsMatFlat(Eigen::Map<Eigen::VectorXf>(widthsMat.data(), widthsMat.rows() * widthsMat.cols()));
    /* In np.meshgrid(widths, shifts_x) above, widths is a length-3 vector and
       shifts_x is a 2-D matrix, so the resulting matrix has as many columns
       as widths has elements, and as many rows as shifts_x has elements when
       flattened row-wise (6 for a 2*3 shifts_x). In
       box_widths, box_centers_x = np.meshgrid(widths, shifts_x),
       box_centers_x has rows(shifts_x)*cols(shifts_x) rows, and each of its
       columns is the row-wise flattening of shifts_x. But Eigen matrices are
       column-major, so the C++ code transposes shifts_xMat first; mapping it
       through Eigen::Map then yields exactly the row-wise flattening of
       shifts_x. shifts_yMat is treated the same way to get shifts_yMatFlat.
       box_widths and box_heights can be filled directly from widthsMatFlat
       and heightsMatFlat, which are already 1-D vectors. */
    shifts_xMat.transposeInPlace();
    shifts_yMat.transposeInPlace();
    Eigen::RowVectorXf shifts_yMatFlat(Eigen::Map<Eigen::VectorXf>(shifts_yMat.data(), shifts_yMat.rows() * shifts_yMat.cols()));
    Eigen::RowVectorXf shifts_xMatFlat(Eigen::Map<Eigen::VectorXf>(shifts_xMat.data(), shifts_xMat.rows() * shifts_xMat.cols()));
    Eigen::MatrixXf box_widthsMat = Eigen::MatrixXf(shifts_xMatFlat.cols(), widthsMatFlat.cols());
    Eigen::MatrixXf box_center_xMat = Eigen::MatrixXf(shifts_xMatFlat.cols(), widthsMatFlat.cols());
    Eigen::MatrixXf box_heightsMat = Eigen::MatrixXf(shifts_yMatFlat.cols(), heightsMatFlat.cols());
    Eigen::MatrixXf box_center_yMat = Eigen::MatrixXf(shifts_yMatFlat.cols(), heightsMatFlat.cols());
    for(int i = 0; i < box_widthsMat.rows(); i++)
    {
        box_widthsMat.row(i) = widthsMatFlat;
        box_heightsMat.row(i) = heightsMatFlat;
    }
    for(int i = 0; i < box_heightsMat.cols(); i++)
    {
        box_center_xMat.col(i) = shifts_xMatFlat;
        box_center_yMat.col(i) = shifts_yMatFlat;
    }

    // Convert to corner coordinates (y1, x1, y2, x2)
    // ('e is an abbreviation for 's element)
    // note that, below, the matrix elements being added or subtracted are in corresponding positions.
    // Python method: stack box_centers_y and box_centers_x into mat A whose unit format is (box_center_y'e, box_center_x'e),
    // then reshape to [-1,2], so the result is a mat whose col format is (box_center_y'e, box_center_x'e);
    // box_sizes mat B is the same, col format (box_height'e, box_width'e).
    // Then A-B, A+B give mats C, D whose col formats are respectively
    // (box_center_y'e-box_height'e, box_center_x'e-box_width'e) and
    // (box_center_y'e+box_height'e, box_center_x'e+box_width'e);
    // concatenating C and D gives mat E whose col format is
    // (box_center_y'e-box_height'e, box_center_x'e-box_width'e, box_center_y'e+box_height'e, box_center_x'e+box_width'e),
    // and that is (y1, x1, y2, x2).
    // In Eigen3 it is done differently: we already have the matrices
    // box_center_yMat, box_center_xMat, box_heightsMat, box_widthsMat
    // (abbreviated center_yMat, center_xMat, heightMat, widthMat):
    //   center_yMat - 0.5*heightMat = y1Mat
    //   center_yMat + 0.5*heightMat = y2Mat
    //   center_xMat - 0.5*widthMat  = x1Mat
    //   center_xMat + 0.5*widthMat  = x2Mat
    // then generate the matrix boxes whose col format is (y1Mat's e, x1Mat's e, y2Mat's e, x2Mat's e).
    // This performs:
    //   boxes = np.concatenate([box_centers - 0.5 * box_sizes,
    //                           box_centers + 0.5 * box_sizes], axis=1)
    // boxes has the form [(y1, x1, y2, x2), ..., ...]
    Eigen::MatrixXf y1Mat = box_center_yMat - box_heightsMat * 0.5;
    Eigen::MatrixXf x1Mat = box_center_xMat - box_widthsMat * 0.5;
    Eigen::MatrixXf y2Mat = box_center_yMat + box_heightsMat * 0.5;
    Eigen::MatrixXf x2Mat = box_center_xMat + box_widthsMat * 0.5;
    y1Mat.transposeInPlace();
    x1Mat.transposeInPlace();
    y2Mat.transposeInPlace();
    x2Mat.transposeInPlace();
    Eigen::RowVectorXf y1MatFlat(Eigen::Map<Eigen::VectorXf>(y1Mat.data(), y1Mat.rows() * y1Mat.cols()));
    Eigen::RowVectorXf x1MatFlat(Eigen::Map<Eigen::VectorXf>(x1Mat.data(), x1Mat.rows() * x1Mat.cols()));
    Eigen::RowVectorXf y2MatFlat(Eigen::Map<Eigen::VectorXf>(y2Mat.data(), y2Mat.rows() * y2Mat.cols()));
    Eigen::RowVectorXf x2MatFlat(Eigen::Map<Eigen::VectorXf>(x2Mat.data(), x2Mat.rows() * x2Mat.cols()));
    Eigen::MatrixXf boxes(y1Mat.rows() * y1Mat.cols(), 4); // note: this boxes is not yet the boxes of the Python code
    boxes.col(0) = y1MatFlat;
    boxes.col(1) = x1MatFlat;
    boxes.col(2) = y2MatFlat;
    boxes.col(3) = x2MatFlat;
    // at this point the boxes for a single RPN_ANCHOR_SCALES[j] scale are done;
    // put them into the container
    boxesVec.push_back(boxes);
    finalBoxesRows += boxes.rows(); // accumulate the total row count over all five RPN_ANCHOR_SCALES scales
}

// create the 2-D matrix finalBox (the Python code's boxes) with finalBoxesRows
// rows and 4 columns: it stacks all the per-scale boxes into the form
// [(y1, x1, y2, x2), ..., ...]
finalBox = Eigen::MatrixXf(finalBoxesRows, 4);
// copy each boxes matrix out of boxesVec to build the final finalBox matrix
// (corresponding to boxes in the Python code)
int beginX = 0;
for(int i = 0; i < boxesVec.size(); i++)
{
    // block assignment: mat.block(i, j, rows, cols)
    finalBox.block(beginX, 0, boxesVec[i].rows(), boxesVec[i].cols()) = boxesVec[i];
    beginX += boxesVec[i].rows();
}
- Third, normalize the coordinates:
# Normalize coordinates
_anchor_cache[tuple(image_shape)] = utils.norm_boxes(a, image_shape[:2])
return _anchor_cache[tuple(image_shape)]
# Python code
# Roughly: take the boxes from the previous step (shaped [(y1,x1,y2,x2), ...]),
# subtract the shift [0,0,1,1] element-wise, then divide by [h-1, w-1, h-1, w-1]
def norm_boxes(boxes, shape):
    """Converts boxes from pixel coordinates to normalized coordinates.
    boxes: [N, (y1, x1, y2, x2)] in pixel coordinates
    shape: [..., (height, width)] in pixels

    Note: In pixel coordinates (y2, x2) is outside the box. But in normalized
    coordinates it's inside the box.

    Returns:
        [N, (y1, x1, y2, x2)] in normalized coordinates
    """
    h, w = shape
    scale = np.array([h - 1, w - 1, h - 1, w - 1])
    shift = np.array([0, 0, 1, 1])
    return np.divide((boxes - shift), scale).astype(np.float32)
The C++ code:
/* get the normalized finalBox. Python:
    scale = np.array([h - 1, w - 1, h - 1, w - 1])
    shift = np.array([0, 0, 1, 1])
    return np.divide((boxes - shift), scale).astype(np.float32)
*/
// first create the scale and shift vectors
Eigen::MatrixXf scaleMat_1r(1, finalBox.cols());
Eigen::MatrixXf shiftMat_1r(1, finalBox.cols());
scaleMat_1r << float(inputImg_h - 1), float(inputImg_w - 1), float(inputImg_h - 1), float(inputImg_w - 1);
shiftMat_1r << 0.f, 0.f, 1.f, 1.f;
// scaleMat_1r and shiftMat_1r are just row vectors; now build matching
// matrices with the same shape as finalBox
Eigen::MatrixXf scaleMat = scaleMat_1r.colwise().replicate(finalBox.rows()); // replicate to finalBox's row count
Eigen::MatrixXf shiftMat = shiftMat_1r.colwise().replicate(finalBox.rows()); // same
Eigen::MatrixXf tmpMat = finalBox - shiftMat;   // subtract the shift element-wise
finalBox_norm = tmpMat.cwiseQuotient(scaleMat); // divide by the scale element-wise
// finalBox_norm now corresponds to boxes in the Python code. Next, copy the
// finalBox_norm matrix into the Eigen::Tensor inputAnchorsTensor_temp, then
// fill the tensorflow::Tensor inputAnchorsTensor that is finally fed to the
// model as the anchor boxes.
inputAnchorsTensor = tensorflow::Tensor(tensorflow::DT_FLOAT, {batch_size, finalBox_norm.rows(), finalBox_norm.cols()}); // init inputAnchorsTensor
// build the Eigen::Tensor inputAnchorsTensor_temp from the finalBox_norm matrix
Eigen::Tensor<float,3> inputAnchorsTensor_temp(1, finalBox_norm.rows(), finalBox_norm.cols());
for(int i = 0; i < finalBox_norm.rows(); i++){
    Eigen::Tensor<float,1> eachrow(finalBox_norm.cols()); // temporary storage for one row of finalBox_norm
    // put one row of finalBox_norm into eachrow
    eachrow.setValues({finalBox_norm.row(i)[0], finalBox_norm.row(i)[1], finalBox_norm.row(i)[2], finalBox_norm.row(i)[3]});
    // put eachrow into the corresponding row of inputAnchorsTensor_temp
    inputAnchorsTensor_temp.chip(i, 1) = eachrow;
}
// copy inputAnchorsTensor_temp into inputAnchorsTensor; note the two are of
// different types
auto showMap = inputAnchorsTensor.tensor<float,3>();
for(int b = 0; b < showMap.dimension(0); b++)
{
    for(int r = 0; r < showMap.dimension(1); r++)
    {
        for(int c = 0; c < showMap.dimension(2); c++)
        {
            showMap(b, r, c) = inputAnchorsTensor_temp(0, r, c); // index 0 because
            // all images in my batch have the same size, so their final anchor
            // boxes are identical and one copy is enough. I recommend keeping all
            // images in a batch the same size -- it makes things much easier.
        }
    }
}
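Consequently, for a 512*512 input, inputAnchorsTensor should come out with shape {batch_size, 65472, 4}, matching the anchor count computed earlier; printing inputAnchorsTensor.shape() is a quick way to confirm the construction before feeding the model.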
- At this point all the tensors fed into the model are ready. After the network's forward pass, the results also come back as tensors -- so how do we extract what we want? Look first at parai/Mask_RCNN's Python infere_from_pb.py:
detections, mrcnn_class, mrcnn_bbox, mrcnn_mask, rois = \
    sess.run([detectionsT, mrcnn_classT, mrcnn_bboxT, mrcnn_maskT, roisT],
             feed_dict={img_ph: molded_images,
                        img_meta_ph: image_metas,
                        img_anchors_ph: image_anchors})
# the above is the inference call
# the box coordinates, class ids and scores are stored in detections;
# the segmentation results are stored in mrcnn_mask
# we only need to see how the unmold_detections function works
results = []
for i, image in enumerate(images):
    final_rois, final_class_ids, final_scores, final_masks =\
        unmold_detections(detections[i], mrcnn_mask[i],
                          image.shape, molded_images[i].shape,
                          windows[i])
    results.append({
        "rois": final_rois,
        "class_ids": final_class_ids,
        "scores": final_scores,
        "masks": final_masks,
    })
The main work is done by unmold_detections(...); the function appears, identically, in both infere_from_pb.py and mrcnn/model.py.
Here is its Python code:
def unmold_detections(detections, mrcnn_mask, original_image_shape,
                      image_shape, window):
    """Reformats the detections of one image from the format of the neural
    network output to a format suitable for use in the rest of the
    application.

    detections: [N, (y1, x1, y2, x2, class_id, score)] in normalized coordinates
        # the network's box output: N is the number of detected objects, and
        # each entry is four coordinates + class id + score
    mrcnn_mask: [N, height, width, num_classes]
        # the network's segmentation output
    original_image_shape: [H, W, C] Original image shape before resizing
    image_shape: [H, W, C] Shape of the image after resizing and padding
        # the shape of the image fed to the network
    window: [y1, x1, y2, x2] Pixel coordinates of box in the image where the real
        image is excluding the padding.
        # the display window

    Returns:  # boxes + per-box class + per-box score + per-object mask
    boxes: [N, (y1, x1, y2, x2)] Bounding boxes in pixels
    class_ids: [N] Integer class IDs for each bounding box
    scores: [N] Float probability scores of the class_id
    masks: [height, width, num_instances] Instance masks
    """
    # How many detections do we have?
    # Detections array is padded with zeros. Find the first class_id == 0.
    # class 0 is the background; the trailing [0] is needed because np.where
    # returns its result as the first element of a tuple
    zero_ix = np.where(detections[:, 4] == 0)[0]
    # index of the first zero-class entry
    N = zero_ix[0] if zero_ix.shape[0] > 0 else detections.shape[0]
    # Take all entries before the N-th, i.e. the boxes whose class is not 0.
    # Presumably this works because the network pads the detections array with
    # zero entries at the end, so everything before the first zero-class entry
    # is a real detection -- I did not dig into it. In the C++ code you can
    # simply walk the detections array and keep/drop entries by class id.

    # Extract boxes, class_ids, scores, and class-specific masks
    boxes = detections[:N, :4]
    class_ids = detections[:N, 4].astype(np.int32)
    scores = detections[:N, 5]
    masks = mrcnn_mask[np.arange(N), :, :, class_ids]

    # normalization; since my images are cropped to size before being fed to
    # the network, this step can be skipped in my case
    # Translate normalized coordinates in the resized image to pixel
    # coordinates in the original image before resizing
    window = utils.norm_boxes(window, image_shape[:2])
    wy1, wx1, wy2, wx2 = window
    shift = np.array([wy1, wx1, wy1, wx1])
    wh = wy2 - wy1  # window height
    ww = wx2 - wx1  # window width
    scale = np.array([wh, ww, wh, ww])
    # Convert boxes to normalized coordinates on the window
    boxes = np.divide(boxes - shift, scale)
    # Convert boxes to pixel coordinates on the original image
    boxes = utils.denorm_boxes(boxes, original_image_shape[:2])

    # Filter out detections with zero area. Happens in early training when
    # network weights are still random
    # find the indices of boxes whose area is <= 0
    exclude_ix = np.where(
        (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1]) <= 0)[0]
    # if there are such indices, delete those entries
    if exclude_ix.shape[0] > 0:
        boxes = np.delete(boxes, exclude_ix, axis=0)
        class_ids = np.delete(class_ids, exclude_ix, axis=0)
        scores = np.delete(scores, exclude_ix, axis=0)
        masks = np.delete(masks, exclude_ix, axis=0)
        N = class_ids.shape[0]

    # At this point we have the entries whose class is non-zero (not
    # background) and whose size is positive. The next step computes their
    # masks. I only need the boxes, so my C++ code does not compute masks;
    # if you need them, reproduce the Python below in C++.
    # Resize masks to original image size and set boundary threshold.
    full_masks = []
    for i in range(N):
        # Convert neural network mask to full size mask
        full_mask = utils.unmold_mask(masks[i], boxes[i], original_image_shape)
        full_masks.append(full_mask)
    full_masks = np.stack(full_masks, axis=-1)\
        if full_masks else np.empty(masks.shape[1:3] + (0,))

    return boxes, class_ids, scores, full_masks
The two main helpers inside are utils.norm_boxes and utils.denorm_boxes:
def norm_boxes(boxes, shape):
    """Converts boxes from pixel coordinates to normalized coordinates.
    boxes: [N, (y1, x1, y2, x2)] in pixel coordinates  # box data format
    shape: [..., (height, width)] in pixels  # image size, fixed in my case

    Note: In pixel coordinates (y2, x2) is outside the box. But in normalized
    coordinates it's inside the box.

    Returns:
        [N, (y1, x1, y2, x2)] in normalized coordinates
    """
    h, w = shape
    scale = np.array([h - 1, w - 1, h - 1, w - 1])
    shift = np.array([0, 0, 1, 1])
    return np.divide((boxes - shift), scale).astype(np.float32)


def denorm_boxes(boxes, shape):
    """Converts boxes from normalized coordinates to pixel coordinates.
    boxes: [N, (y1, x1, y2, x2)] in normalized coordinates  # box data format
    shape: [..., (height, width)] in pixels  # image size, fixed in my case

    Note: In pixel coordinates (y2, x2) is outside the box. But in normalized
    coordinates it's inside the box.

    Returns:
        [N, (y1, x1, y2, x2)] in pixel coordinates
    """
    h, w = shape
    scale = np.array([h - 1, w - 1, h - 1, w - 1])
    shift = np.array([0, 0, 1, 1])
    return np.around(np.multiply(boxes, scale) + shift).astype(np.int32)
The C++ code for this part:
struct boxInfo{
    int y1, x1, y2, x2;
    int classId = 0;
    float scores = 0.f;
    int boxNum = -1;
};

struct imageDetectInfo{
    int imageWidth = 0;  // not yet
    int imageHeight = 0; // not yet
    int imageNum = -1;
    std::vector<boxInfo> detectInfo;
};

// std::vector<tensorflow::Tensor> &output_tensors: the network's final output
// std::vector<imageDetectInfo> &output_vec: holds the final results; this is
// my own format -- change it to whatever format you need
void detectBatch::unmold_detections(std::vector<tensorflow::Tensor> &output_tensors,
                                    std::vector<imageDetectInfo> &output_vec)
{
    // the network's detection boxes are the first element of output_tensors
    tensorflow::Tensor &detections_tensor = output_tensors[0];
    // get the Eigen-style tensor of detections_tensor; boxes_tensor and
    // detections_tensor point at the same memory
    auto boxes_tensor = detections_tensor.tensor<float,3>();
    // Extract boxes, class_ids, scores, and class-specific masks
    // whose classId is not 0, because 0 is the background
    //std::cout << "resized_tensor is " << resized_tensor.shape() << std::endl;
    //std::cout << "inputAnchorsTensor is " << inputAnchorsTensor.shape() << std::endl;
    //std::cout << "inputMetadataTensor is " << inputMetadataTensor.shape() << std::endl;
    //std::cout << "detections_tensor is " << detections_tensor.shape() << std::endl;
    // boxes_tensor / detections_tensor format: [N, (y1, x1, y2, x2, class_id, score)]
    // iterate over the detections; the first dimension of boxes_tensor is the
    // image count, so this iterates over one batch
    for(int imgNum = 0; imgNum < boxes_tensor.dimension(0); imgNum++)
    {
        std::vector<Eigen::RowVectorXf> noZeroRow; // row-vector container for the non-background boxes
        // iterate over the second dimension, i.e. the number of boxes per image
        for(int boxNum = 0; boxNum < boxes_tensor.dimension(1); boxNum++)
        {
            // class id > 0 means the box is not background
            if (boxes_tensor(imgNum, boxNum, 4) > 0)
            {
                // put the box data into a row vector
                Eigen::RowVectorXf eachrow(boxes_tensor.dimension(2));
                eachrow << boxes_tensor(imgNum, boxNum, 0),
                           boxes_tensor(imgNum, boxNum, 1),
                           boxes_tensor(imgNum, boxNum, 2),
                           boxes_tensor(imgNum, boxNum, 3),
                           boxes_tensor(imgNum, boxNum, 4),
                           boxes_tensor(imgNum, boxNum, 5);
                noZeroRow.push_back(eachrow);
            }
        }
        // create a matrix noZeroMat holding the noZeroRow vectors above
        Eigen::MatrixXf noZeroMat(noZeroRow.size(), 6);
        for(int r = 0; r < noZeroRow.size(); r++)
        {
            noZeroMat.row(r) = noZeroRow[r];
        }
        // noZeroMat is not the final box data yet; it still needs
        // de-normalization, clipping, etc.
        Eigen::MatrixXf boxMat(noZeroMat.rows(), 4);
        Eigen::MatrixXf classSoresMat(noZeroMat.rows(), 2);
        // noZeroMat format: [(y1, x1, y2, x2, class_id, score), ..., ...]
        // take the (y1, x1, y2, x2) part of noZeroMat
        boxMat.block(0, 0, boxMat.rows(), 4) = noZeroMat.block(0, 0, noZeroMat.rows(), 4);
        // take the (class_id, score) part of noZeroMat
        classSoresMat.block(0, 0, classSoresMat.rows(), 2) = noZeroMat.block(0, 4, classSoresMat.rows(), 2);

        // get the window from the image meta
        // fetch the image meta data computed earlier
        auto metaTensor = inputMetadataTensor.tensor<float,2>();

        /* this part mimics Python's norm_boxes(boxes, shape):
           derive windowMat and the scale matrix scale_rMat from the display
           window. This step is actually unnecessary here, since the images
           are pre-cropped and I don't display them in a window. */
        Eigen::MatrixXf windowMat(1, 4);
        Eigen::MatrixXf scale_rMat(1, 4);
        windowMat << metaTensor(0,7), metaTensor(0,8),
                     metaTensor(0,7), metaTensor(0,8);
        scale_rMat << metaTensor(0,9)  - metaTensor(0,7),
                      metaTensor(0,10) - metaTensor(0,8),
                      metaTensor(0,9)  - metaTensor(0,7),
                      metaTensor(0,10) - metaTensor(0,8);
        // get shiftmat
        // boxMat = tmpMat.cwiseQuotient(scaleMat); // unnecessary, because in
        // my case shiftmat is [0,0,0,0] and scale is [1,1,1,1]

        // de-normalize
        // denorm_boxes: mimics Python's denorm_boxes(boxes, shape)
        Eigen::MatrixXf shiftNorm_rMat(1, 4); // empty shift matrix
        Eigen::MatrixXf scaleNorm_rMat(1, 4); // empty scale matrix
        shiftNorm_rMat << 0, 0, 1, 1;         // fill it; shiftNorm_rMat is really a row vector
        scaleNorm_rMat << metaTensor(0,1) - 1, // likewise a row vector
                          metaTensor(0,2) - 1,
                          metaTensor(0,1) - 1,
                          metaTensor(0,2) - 1;
        // replicate the shiftNorm_rMat and scaleNorm_rMat vectors along the
        // column direction to match boxMat's row count
        Eigen::MatrixXf shiftNormMat = shiftNorm_rMat.colwise().replicate(boxMat.rows());
        Eigen::MatrixXf scaleNormMat = scaleNorm_rMat.colwise().replicate(boxMat.rows());
        boxMat = boxMat.cwiseProduct(scaleNormMat); // element-wise multiply
        boxMat = boxMat + shiftNormMat;
        finalboxMat = boxMat; // the final boxMat
        //std::cout << "final box mat is " << finalboxMat << std::endl;

        // put the final data into my own structured format; adapt it to your needs
        struct imageDetectInfo imageDetectInfoTmp;
        for(int i = 0; i < finalboxMat.rows(); i++)
        {
            struct boxInfo boxInfoTmp;
            boxInfoTmp.y1 = (int)(finalboxMat(i,0));
            boxInfoTmp.x1 = (int)(finalboxMat(i,1));
            boxInfoTmp.y2 = (int)(finalboxMat(i,2));
            boxInfoTmp.x2 = (int)(finalboxMat(i,3));
            boxInfoTmp.classId = (int)(classSoresMat(i,0));
            boxInfoTmp.scores = classSoresMat(i,1);
            boxInfoTmp.boxNum = i;
            imageDetectInfoTmp.detectInfo.push_back(boxInfoTmp);
        }
        imageDetectInfoTmp.imageNum = imgNum;
        output_vec[imgNum] = imageDetectInfoTmp;
        //outputsInfo.push_back(imageDetectInfoTmp);
    }
}
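To visualize the results, here's a small usage sketch (my own hypothetical helper, not part of the project's code) that draws the output_vec boxes back onto the batch images with OpenCV:

#include <opencv2/imgproc.hpp>
#include <algorithm>
#include <string>
#include <vector>

// draw every box above a score threshold onto its source image
void drawDetections(std::vector<cv::Mat> &imgs,
                    const std::vector<imageDetectInfo> &output_vec,
                    float scoreThresh = 0.8f)
{
    for (const imageDetectInfo &info : output_vec) {
        if (info.imageNum < 0) continue; // slot of a short batch that was never filled
        cv::Mat &img = imgs[info.imageNum];
        for (const boxInfo &box : info.detectInfo) {
            if (box.scores < scoreThresh) continue;
            cv::rectangle(img, cv::Point(box.x1, box.y1), cv::Point(box.x2, box.y2),
                          cv::Scalar(0, 255, 0), 2);
            cv::putText(img, std::to_string(box.classId) + " " + std::to_string(box.scores),
                        cv::Point(box.x1, std::max(box.y1 - 5, 0)),
                        cv::FONT_HERSHEY_SIMPLEX, 0.5, cv::Scalar(0, 0, 255), 1);
        }
    }
}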
That's all the pieces. Here are some detection results (the UI was built with Qt). The model isn't fully tuned yet -- mainly the data augmentation and network hyperparameters still need work.
That completes the whole pipeline.
If you want the source code, it's at this link:
If you don't have download credits, you can contact me on WeChat: anuntilforever1314. When adding me, please note your industry + name so we can talk -.- happy to learn together.
I also have datasets for vehicles, license plates, reflective vests, safety helmets, etc. -- link below for anyone interested:
链接:https://pan.baidu.com/s/1mG7X71rngtWqP2tsfFm26A 提取码:5555
My code is messy and I haven't read the papers very carefully; if any experts spot problems, please don't hesitate to point them out.
Summary: Mask R-CNN is a few years old now, and plenty of new detection and segmentation papers keep appearing, but in practice it still performs well across several datasets. I think one reason is its dense anchors. Still, the future looks anchor-free, so anchor-free methods are my next research step...
Next up: converting SSD from PyTorch to TorchScript and calling it from C++ with libtorch.
Appendix: the Qt .pro file
#-------------------------------------------------
#
# Project created by QtCreator 2018-12-18T13:01:09
#
#-------------------------------------------------
QT += core gui
greaterThan(QT_MAJOR_VERSION, 4): QT += widgets
TARGET = codeShow
TEMPLATE = app
qtHaveModule(opengl): QT += opengl
# The following define makes your compiler emit warnings if you use
# any feature of Qt which has been marked as deprecated (the exact warnings
# depend on your compiler). Please consult the documentation of the
# deprecated API in order to know how to port your code away from it.
#DEFINES += QT_DEPRECATED_WARNINGS
# You can also make your code fail to compile if you use deprecated APIs.
# In order to do so, uncomment the following line.
# You can also select to disable deprecated APIs only up to a certain version of Qt.
#DEFINES += QT_DISABLE_DEPRECATED_BEFORE=0x060000 # disables all the APIs deprecated before Qt 6.0.0
DEFINES += COMPILER_MSVC NOMINMAX QT_DEPRECATED_WARNINGS
CONFIG += c++11 thread
SOURCES += \
main.cpp \
mainwindow.cpp \
detectbatch.cpp
HEADERS += \
mainwindow.h \
data_format.h \
detectbatch.h
FORMS += \
mainwindow.ui
# Default rules for deployment.
#qnx: target.path = /tmp/$${TARGET}/bin
#else: unix:!android: target.path = /opt/$${TARGET}/bin
#!isEmpty(target.path): INSTALLS += target
##cuda
#CUDA_DIR = "E:\thirdParty_lib\cuda\install" # Path to cuda toolkit install
#SYSTEM_NAME = x64 # Depending on your system either 'Win32', 'x64', or 'Win64'
#SYSTEM_TYPE = 64 # '32' or '64', depending on your system
#CUDA_ARCH = compute_61 # Type of CUDA architecture
#CUDA_CODE = sm_61
#NVCC_OPTIONS = --use_fast_math
## include paths
#INCLUDEPATH += "$$CUDA_DIR/include" \
#"D:\software\cuda_install\common\inc"
## library directories
#QMAKE_LIBDIR += "$$CUDA_DIR/lib/x64"
## The following makes sure all path names (which often include spaces) are put between quotation marks
#CUDA_INC = $$join(INCLUDEPATH,'" -I"','-I"','"')
## Add the necessary libraries
#CUDA_LIB_NAMES += \
#cuda \
#cudart \
#MSVCRT
##CUDA_LIB_NAMES += \
##cublas \
##cublas_device \
##cuda \
##cudadevrt \
##cudart \
##cudart_static \
##cufft \
##cufftw \
##curand \
##cusolver \
##cusparse \
##nppc \
##nppial \
##nppicc \
##nppicom \
##nppidei \
##nppif \
##nppig \
##nppim \
##nppist \
##nppisu \
##nppitc \
##npps \
##nvblas \
##nvcuvid \
##nvgraph \
##nvml \
##nvrtc \
##OpenCL \
##kernel32 \
##user32 \
##gdi32 \
##winspool \
##comdlg32 \
##advapi32 \
##shell32 \
##ole32 \
##oleaut32 \
##uuid \
##odbc32 \
##odbccp32 \
##ucrt \
##MSVCRT
#for(lib, CUDA_LIB_NAMES) {
# CUDA_LIBS += $$lib.lib
#}
#for(lib, CUDA_LIB_NAMES) {
# NVCC_LIBS += -l$$lib
#}
#LIBS += $$NVCC_LIBS
## The following library conflicts with something in Cuda
#QMAKE_LFLAGS_RELEASE = /NODEFAULTLIB:msvcrt.lib
#QMAKE_LFLAGS_DEBUG = /NODEFAULTLIB:msvcrtd.lib
## MSVCRT link option (static or dynamic, it must be the same with your Qt SDK link option)
#MSVCRT_LINK_FLAG_DEBUG = "/MDd"
#MSVCRT_LINK_FLAG_RELEASE = "/MD"
##MSVCRT_LINK_FLAG_DEBUG = "/MTd"
##MSVCRT_LINK_FLAG_RELEASE = "/MT"
## Configuration of the Cuda compiler
#CONFIG(debug, debug|release) {
# # Debug mode
# DESTDIR = debug
# OBJECTS_DIR = debug/obj
# CUDA_OBJECTS_DIR = debug/cuda
# cuda_d.input = CUDA_SOURCES
# cuda_d.output = $$CUDA_OBJECTS_DIR/${QMAKE_FILE_BASE}_cuda.o
# cuda_d.commands = $$CUDA_DIR/bin/nvcc.exe -D_DEBUG $$NVCC_OPTIONS $$CUDA_INC $$LIBS \
# --machine $$SYSTEM_TYPE -arch=$$CUDA_ARCH -code=$$CUDA_CODE \
# --compile -cudart static -g -DWIN32 -D_MBCS \
# -Xcompiler "/wd4819,/EHsc,/W3,/nologo,/Od,/Zi,/RTC1" \
# -Xcompiler $$MSVCRT_LINK_FLAG_DEBUG \
# -c -o ${QMAKE_FILE_OUT} ${QMAKE_FILE_NAME}
# cuda_d.dependency_type = TYPE_C
# QMAKE_EXTRA_COMPILERS += cuda_d
#}
#else {
# # Release mode
# DESTDIR = release
# OBJECTS_DIR = release/obj
# CUDA_OBJECTS_DIR = release/cuda
# cuda.input = CUDA_SOURCES
# cuda.output = $$CUDA_OBJECTS_DIR/${QMAKE_FILE_BASE}_cuda.o
# cuda.commands = $$CUDA_DIR/bin/nvcc.exe $$NVCC_OPTIONS $$CUDA_INC $$LIBS \
# --machine $$SYSTEM_TYPE -arch=$$CUDA_ARCH -code=$$CUDA_CODE \
# --compile -cudart static -D_MBCS \
# -Xcompiler "/wd4819,/EHsc,/W3,/nologo,/O2,/Zi" \
# -Xcompiler $$MSVCRT_LINK_FLAG_RELEASE \
# -c -o ${QMAKE_FILE_OUT} ${QMAKE_FILE_NAME}
# cuda.dependency_type = TYPE_C
# QMAKE_EXTRA_COMPILERS += cuda
#}
LIBS += -LE:\Maidipu\code\tensorflow_1_8_gpu\lib -ltensorflow
INCLUDEPATH +=D:\Code-software\opencv_3_3_0\build\install\include \
D:\Code-software\opencv_3_3_0\build\install\include\opencv2 \
D:\Code-software\opencv_3_3_0\build\install\include\opencv \
E:\Maidipu\code\tensorflow_1_8_gpu\include
LIBS += -LD:\Code-software\opencv_3_3_0\build\install\x64\vc12\lib -lopencv_core330 -lopencv_imgproc330 \
-lopencv_imgcodecs330 -lopencv_highgui330