Mask R-CNN TensorFlow (Keras front end) model: C++ inference on Windows

<1> Background:

First, why this post exists: I am basically a GitHub scavenger, and when I went looking for code that does what the title says, I found almost nothing. After a long search through a lot of other people's work, plus my own rather rough code (I can barely look at it myself), I finally got the whole pipeline in the title running. The code is not pretty, but it is fine if you just want something that runs. This is my first contact with TensorFlow (I am a PyTorch/Caffe person), I only know how to call Keras, and I have not read that many papers, so some of my understanding may be off; please bear with me. Since I could only finish this by building on a lot of other people's work, I hope this writeup helps someone else in turn, because stepping into the same pits over and over is exhausting.

<2> An overview of the steps involved:

1. Keras model -> TensorFlow. The model is trained with Keras (GitHub - matterport/Mask_RCNN: Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow). Calling a Keras model directly from C++ has been done by others, but when I tried it there were too many pitfalls, so I prefer converting the Keras model to a TensorFlow model and calling that from C++.

2. Calling the Mask R-CNN TensorFlow model from C++. This mainly involves building the input tensors, parsing the output tensors, and using the Eigen3 matrix library (well worth learning; it keeps paying off).

3. Caveats: whether the TensorFlow C++ library is a GPU or CPU build, and the model batch size, must match what was used when the model was trained and saved with Python TensorFlow. For example, if the saved model was exported with batch size 32, then C++ inference must also use batch size 32.

<3> Environment setup:

1. OS: 64-bit Windows (I rarely program on Windows). Use 64-bit, because TensorFlow 1.8 apparently only supports 64-bit systems; I do not know about other versions. The compiler is MSVC 2015 x64 (which I find hard to use, since I normally write code on Linux and have not adjusted yet), and the IDE is Qt 4.8.

2. Install the protobuf library. This is mainly needed for GPU configuration, e.g. picking the GPU index and controlling how GPU memory is used; without it you get an error about a missing tensorflow::**protobuf** something (I forget the exact name). If you do not need those options you can skip this step. (I installed protobuf 3.6.1; several other versions failed to build for me, which was a pit of its own. If my prebuilt library does not work for you, please search for how to build it yourself.) A minimal sketch of what this enables, creating a GPU-configured session, is given right after this list.

3. Install the GPU driver and CUDA (mine is 9.1). This is only needed because I use the GPU build of the TensorFlow library and configure the GPU in code; if you use the CPU build you can skip steps 2 and 3.

4. The TensorFlow library itself. I use a GPU build prebuilt by someone else (GitHub - fo40225/tensorflow-windows-wheel: Tensorflow prebuilt binary for Windows); mine is 1.8.0 AVX2 GPU.

5. OpenCV, version 3.3.0 (installing this yourself should be no problem). It is used for reading test images and for the Mat-to-tensor conversion. If you need to read very large images, e.g. digital pathology slides, I recommend libvips (a very powerful library) or OpenSlide (nice from Python; I have not tried the C++ side because building it looks like a minefield).

6. A Linux machine with TensorFlow and the Keras front end installed, since training usually happens on a Linux server. This is what the Keras-to-TensorFlow conversion runs on.
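
Since steps 2-4 only matter at the point where the C++ program actually loads the frozen graph and creates a session, here is a minimal sketch of that part, assuming TensorFlow 1.8's C++ API; the .pb path handling and the specific GPU options (visible_device_list, allow_growth) are my own choices, not something the original project prescribes:

    #include <iostream>
    #include <memory>
    #include <string>
    #include "tensorflow/core/platform/env.h"
    #include "tensorflow/core/public/session.h"

    // Load the frozen Mask R-CNN graph and create a GPU-configured session.
    // Returns nullptr on failure.
    std::unique_ptr<tensorflow::Session> LoadMaskRcnn(const std::string &pb_path)
    {
        tensorflow::GraphDef graph_def;
        tensorflow::Status s = tensorflow::ReadBinaryProto(
            tensorflow::Env::Default(), pb_path, &graph_def);
        if (!s.ok()) { std::cerr << s.ToString() << std::endl; return nullptr; }

        tensorflow::SessionOptions options;
        // These two calls are the reason protobuf is needed: they edit the ConfigProto.
        options.config.mutable_gpu_options()->set_visible_device_list("0"); // use GPU 0
        options.config.mutable_gpu_options()->set_allow_growth(true);       // grow GPU memory on demand

        std::unique_ptr<tensorflow::Session> session(tensorflow::NewSession(options));
        s = session->Create(graph_def);
        if (!s.ok()) { std::cerr << s.ToString() << std::endl; return nullptr; }
        return session;
    }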

<4> Into the pits

  • Convert the Keras model to a TensorFlow model on the machine used for training:

  • Download and install, following their instructions, the Keras-based Mask R-CNN, GitHub - matterport/Mask_RCNN: Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow, and the repo used for converting the Keras model to a TensorFlow model, GitHub - parai/Mask_RCNN: Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow
  • Save the trained model: first modify matterport's Mask_RCNN/samples/coco/coco.py, in essentially three steps
    • 1 Modify the inference config. This step matters: GPU_COUNT and IMAGES_PER_GPU must match the CocoConfig in parai's Mask_RCNN-master/samples/demo.py, otherwise it will fail. IMAGES_PER_GPU is the number of images per prediction batch; mine is 32 (batched prediction), and in my tests a 1080 Ti can run 32 images of 512*512
    • 2 Change the number of classes
    • 3 Run coco.py, which saves the Keras model (architecture + weights); my file is mask_rcnn_whole_batch32_new20.h5
    • Below is my CocoConfig. It is mostly unchanged; the main edits are IMAGE_MIN_DIM = 512 and IMAGE_MAX_DIM = 512
      class CocoConfig(Config):
          """Configuration for training on MS COCO.
          Derives from the base Config class and overrides values specific
          to the COCO dataset.
          """
          # Give the configuration a recognizable name
          NAME = "coco"
      #
      #    # We use a GPU with 12GB memory, which can fit two images.
      #    # Adjust down if you use a smaller GPU.
      #    IMAGES_PER_GPU = 2
      
          # Uncomment to train on 8 GPUs (default is 1)
          # GPU_COUNT = 8
      
          # Number of classes (including background)
          #NUM_CLASSES = 1 + 6  # background + 6 classes
          NUM_CLASSES = 1 + 20  # background + 20 classes (COCO itself has 80)
      
          # NUMBER OF GPUs to use. For CPU training, use 1
          GPU_COUNT = 1
      
          # Number of images to train with on each GPU. A 12GB GPU can typically
          # handle 2 images of 1024x1024px.
          # Adjust based on your GPU memory and image sizes. Use the highest
          # number that your GPU can handle for best performance.
          IMAGES_PER_GPU = 2
      
          # Number of training steps per epoch
          # This doesn't need to match the size of the training set. Tensorboard
          # updates are saved at the end of each epoch, so setting this to a
          # smaller number means getting more frequent TensorBoard updates.
          # Validation stats are also calculated at each epoch end and they
          # might take a while, so don't set this too small to avoid spending
          # a lot of time on validation stats.
          STEPS_PER_EPOCH = 2000# 16962
      
          # Number of validation steps to run at the end of every training epoch.
          # A bigger number improves accuracy of validation stats, but slows
          # down the training.
          VALIDATION_STEPS = 4241
      
          # Backbone network architecture
          # Supported values are: resnet50, resnet101.
          # You can also provide a callable that should have the signature
          # of model.resnet_graph. If you do so, you need to supply a callable
          # to COMPUTE_BACKBONE_SHAPE as well
          BACKBONE = "resnet101"
      
          # Only useful if you supply a callable to BACKBONE. Should compute
          # the shape of each layer of the FPN Pyramid.
          # See model.compute_backbone_shapes
          COMPUTE_BACKBONE_SHAPE = None
      
          # The strides of each layer of the FPN Pyramid. These values
          # are based on a Resnet101 backbone.
          BACKBONE_STRIDES = [4, 8, 16, 32, 64]
      
          # Size of the fully-connected layers in the classification graph
          FPN_CLASSIF_FC_LAYERS_SIZE = 1024
      
          # Size of the top-down layers used to build the feature pyramid
          TOP_DOWN_PYRAMID_SIZE = 256
      
          # Length of square anchor side in pixels
          RPN_ANCHOR_SCALES = (32, 64, 128, 256, 512)
      
          # Ratios of anchors at each cell (width/height)
          # A value of 1 represents a square anchor, and 0.5 is a wide anchor
          RPN_ANCHOR_RATIOS = [0.5, 1, 2]
      
          # Anchor stride
          # If 1 then anchors are created for each cell in the backbone feature map.
          # If 2, then anchors are created for every other cell, and so on.
          RPN_ANCHOR_STRIDE = 1
      
          # Non-max suppression threshold to filter RPN proposals.
          # You can increase this during training to generate more propsals.
          RPN_NMS_THRESHOLD = 0.8
      
          # How many anchors per image to use for RPN training
          RPN_TRAIN_ANCHORS_PER_IMAGE = 256
          
          # ROIs kept after tf.nn.top_k and before non-maximum suppression
          PRE_NMS_LIMIT = 2000
          
          # ROIs kept after non-maximum suppression (training and inference)
          POST_NMS_ROIS_TRAINING = 2000
          POST_NMS_ROIS_INFERENCE = 1000
      
          # If enabled, resizes instance masks to a smaller size to reduce
          # memory load. Recommended when using high-resolution images.
          USE_MINI_MASK = True
          MINI_MASK_SHAPE = (56, 56)  # (height, width) of the mini-mask
      
          # Input image resizing
          # Generally, use the "square" resizing mode for training and predicting
          # and it should work well in most cases. In this mode, images are scaled
          # up such that the small side is = IMAGE_MIN_DIM, but ensuring that the
          # scaling doesn't make the long side > IMAGE_MAX_DIM. Then the image is
          # padded with zeros to make it a square so multiple images can be put
          # in one batch.
          # Available resizing modes:
          # none:   No resizing or padding. Return the image unchanged.
          # square: Resize and pad with zeros to get a square image
          #         of size [max_dim, max_dim].
          # pad64:  Pads width and height with zeros to make them multiples of 64.
          #         If IMAGE_MIN_DIM or IMAGE_MIN_SCALE are not None, then it scales
          #         up before padding. IMAGE_MAX_DIM is ignored in this mode.
          #         The multiple of 64 is needed to ensure smooth scaling of feature
          #         maps up and down the 6 levels of the FPN pyramid (2**6=64).
          # crop:   Picks random crops from the image. First, scales the image based
          #         on IMAGE_MIN_DIM and IMAGE_MIN_SCALE, then picks a random crop of
          #         size IMAGE_MIN_DIM x IMAGE_MIN_DIM. Can be used in training only.
          #         IMAGE_MAX_DIM is not used in this mode.
          IMAGE_RESIZE_MODE = "square"
          IMAGE_MIN_DIM = 512  # my images are 512*512
          IMAGE_MAX_DIM = 512
          # Minimum scaling ratio. Checked after MIN_IMAGE_DIM and can force further
          # up scaling. For example, if set to 2 then images are scaled up to double
          # the width and height, or more, even if MIN_IMAGE_DIM doesn't require it.
          # Howver, in 'square' mode, it can be overruled by IMAGE_MAX_DIM.
          IMAGE_MIN_SCALE = 0
      
          # Image mean (RGB)
          MEAN_PIXEL = np.array([123.7, 116.8, 103.9])
      
          # Number of ROIs per image to feed to classifier/mask heads
          # The Mask RCNN paper uses 512 but often the RPN doesn't generate
          # enough positive proposals to fill this and keep a positive:negative
          # ratio of 1:3. You can increase the number of proposals by adjusting
          # the RPN NMS threshold.
          TRAIN_ROIS_PER_IMAGE = 200
      
          # Percent of positive ROIs used to train classifier/mask heads
          ROI_POSITIVE_RATIO = 0.33
      
          # Pooled ROIs
          POOL_SIZE = 7
          MASK_POOL_SIZE = 14
      
          # Shape of output mask
          # To change this you also need to change the neural network mask branch
          MASK_SHAPE = [28, 28]
      
          # Maximum number of ground truth instances to use in one image
          MAX_GT_INSTANCES = 100
      
          # Bounding box refinement standard deviation for RPN and final detections.
          RPN_BBOX_STD_DEV = np.array([0.1, 0.1, 0.2, 0.2])
          BBOX_STD_DEV = np.array([0.1, 0.1, 0.2, 0.2])
      
          # Max number of final detections
          DETECTION_MAX_INSTANCES = 100
      
          # Minimum probability value to accept a detected instance
          # ROIs below this threshold are skipped
          DETECTION_MIN_CONFIDENCE = 0.8
      
          # Non-maximum suppression threshold for detection
          DETECTION_NMS_THRESHOLD = 0.3
      
          # Learning rate and momentum
          # The Mask RCNN paper uses lr=0.02, but on TensorFlow it causes
          # weights to explode. Likely due to differences in optimizer
          # implementation.
          LEARNING_RATE = 0.001
          LEARNING_MOMENTUM = 0.9
      
          # Weight decay regularization
          WEIGHT_DECAY = 0.0001
      
          # Loss weights for more precise optimization.
          # Can be used for R-CNN training setup.
          LOSS_WEIGHTS = {
              "rpn_class_loss": 1.,
              "rpn_bbox_loss": 1.,
              "mrcnn_class_loss": 1.,
              "mrcnn_bbox_loss": 1.,
              "mrcnn_mask_loss": 1.
          }
      
          # Use RPN ROIs or externally generated ROIs for training
          # Keep this True for most situations. Set to False if you want to train
          # the head branches on ROI generated by code rather than the ROIs from
          # the RPN. For example, to debug the classifier head without having to
          # train the RPN.
          USE_RPN_ROIS = True
      
          # Train or freeze batch normalization layers
          #     None: Train BN layers. This is the normal mode
          #     False: Freeze BN layers. Good when using a small batch size
          #     True: (don't use). Set layer in training mode even when predicting
          TRAIN_BN = False  # Defaulting to False since batch size is often small
      
          # Gradient norm clipping
          GRADIENT_CLIP_NORM = 5.0

  • Convert the Keras model to a TensorFlow model. The main point is to make the relevant parameters in parai's Mask_RCNN-master/samples/demo.py match those in matterport's Mask_RCNN/samples/coco/coco.py

  • Call the Mask R-CNN TensorFlow model from C++. This step is mostly a series of operations on tensorflow::Tensor and Eigen::Tensor (Eigen3)

    • First look at the input tensors Mask R-CNN needs. From parai's Mask_RCNN-master/infere_from_pb.py you can see there are three inputs, the keys img_ph, img_meta_ph and img_anchors_ph, whose values are molded_images, image_metas and image_anchors respectively; the corresponding tensorflow::Tensor names are input_image_1, input_image_meta_1 and input_anchors_1. So we need to build these three tensorflow::Tensor objects in C++. Read def mold_inputs(images) in infere_from_pb.py to see where molded_images and image_metas come from, and def get_anchors(image_shape, config) to see where image_anchors comes from. Note: from here on, qualify tensor operations with 'tensorflow::', because Eigen3 also has a Tensor type and the two are easy to confuse
        • Build the tensorflow::Tensor for input_image_1. This key's value is molded_images, which is simply the images converted into a tensorflow::Tensor, i.e. cv::Mat -> tensor. Since I run batched prediction the images sit in a std::vector<cv::Mat>, but not every batch is full, so there is an extra imgNum_actual parameter controlling how many cv::Mat entries are actually converted (a usage sketch follows the function)
          • void detectBatch::CVMats_to_Tensor(std::vector<cv::Mat> &imgs, tensorflow::Tensor *input_tensor, size_t &imgNum_actual)
            {
                /*
                    *Function:  CVMats_to_Tensor
                    *Description:  convert a container of cv::Mat images into a tensorflow::Tensor
                    *Calls:
                        1. ****
                    *Called By:
                      1. ****
            
                    *InputList:
                      1. imgs  container holding the cv::Mat images, std::vector<cv::Mat> &
                      2. input_tensor  the tensor to be filled, tensorflow::Tensor *
                      3. imgNum_actual  number of cv::Mat images actually converted, size_t &
            
                    *OutPut:
                      1. NULL
                */
            
                auto outputMap =input_tensor->tensor<float,4>();//get a view into the tensor; note that outputMap is an Eigen::Tensor map
                for(size_t b=0;b<imgNum_actual;b++)//loop over the images
                {
            
                    for(int r=0;r<outputMap.dimension(1);r++)//loop over rows
                    {
                        for(int c=0;c<outputMap.dimension(2);c++)//loop over columns
                        {
                            //note that the OpenCV Mat channel order is B G R
                            //subtract the per-channel mean
                            outputMap(b,r,c,0)=imgs[b].at<cv::Vec3b>(r,c)[2]-MEAN_PIXEL[0];//R
                            outputMap(b,r,c,1)=imgs[b].at<cv::Vec3b>(r,c)[1]-MEAN_PIXEL[1];//G
                            outputMap(b,r,c,2)=imgs[b].at<cv::Vec3b>(r,c)[0]-MEAN_PIXEL[2];//B
                        }
            
                    }
                }
            
            }
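
            A minimal usage sketch for the function above; the tensor shape has to match the frozen graph (batch_size = IMAGES_PER_GPU and 512*512*3 as in my config), and the variable names detector and batchImgs are only illustrative, not taken from the original project:

            const int batch_size = 32;                       // must equal IMAGES_PER_GPU used at export time
            tensorflow::Tensor inputImageTensor(
                tensorflow::DT_FLOAT,
                tensorflow::TensorShape({batch_size, 512, 512, 3}));

            std::vector<cv::Mat> batchImgs;                  // 512*512 BGR crops, at most batch_size of them
            batchImgs.push_back(cv::imread("tile_000.png")); // illustrative path
            size_t imgNum_actual = batchImgs.size();
            detector.CVMats_to_Tensor(batchImgs, &inputImageTensor, imgNum_actual);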

        • Build the tensorflow::Tensor for input_image_meta_1. What is input_image_meta_1? It corresponds to the key img_meta_ph in infere_from_pb.py, whose value is image_metas. From def mold_inputs(images) you can see that image_metas is just a package of information about the images, such as height, width and channel count. image_metas is an N x (length of meta data) 2D array: the first dimension N is the prediction batch_size, and the second dimension holds the meta data of a single image. Because all images in a batch have the same size, we only need to build one meta vector and then replicate it to form the N x (length of meta data) array
          • def mold_inputs(images):
                    """Takes a list of images and modifies them to the format expected
                    as an input to the neural network.
                    images: List of image matricies [height,width,depth]. Images can have
                        different sizes.
                    Returns 3 Numpy matricies:
                    molded_images: [N, h, w, 3]. Images resized and normalized.
                    image_metas: [N, length of meta data]. Details about each image.  # i.e. image_metas is an N x (length of meta data) array, where N is the prediction batch_size
                    windows: [N, (y1, x1, y2, x2)]. The portion of the image that has the
                        original image (padding excluded).
                    """
                    molded_images = []
                    image_metas = []
                    windows = []
                    for image in images:
                        # Resize image to fit the model expected size
                        # TODO: move resizing to mold_image()
                        molded_image, window, scale, padding, corp = utils.resize_image(
                            image,
                            min_dim=inference_config.IMAGE_MIN_DIM,
                            min_scale=inference_config.IMAGE_MIN_SCALE,
                            max_dim=inference_config.IMAGE_MAX_DIM,
                            mode=inference_config.IMAGE_RESIZE_MODE)
            
                        print(image.shape)
                        print('Image resized at: ', molded_image.shape)
                        print(window)
                        print(scale)
                        """Takes RGB images with 0-255 values and subtraces
                               the mean pixel and converts it to float. Expects image
                               colors in RGB order."""
                        molded_image = mold_image(molded_image, inference_config)
                        print('Image molded')
                        #print(a)
                        """Takes attributes of an image and puts them in one 1D array."""
                        inference_config.NUM_CLASSES = 81
                        # the function below builds image_meta; we just mimic it in C++
                        image_meta = compose_image_meta( 
                            0, image.shape, molded_image.shape, window, scale,
                            np.zeros([inference_config.NUM_CLASSES], dtype=np.int32))
                        print('Meta of image prepared')
                        image_anchor = [] # TODO
                        # Append
                        molded_images.append(molded_image)
                        windows.append(window)
                        image_metas.append(image_meta)
                    # Pack into arrays
                    molded_images = np.stack(molded_images)
                    image_metas = np.stack(image_metas)
                    windows = np.stack(windows)
                    return molded_images, image_metas, windows

            The function that does the real work above is compose_image_meta; here is its implementation:

            def compose_image_meta(image_id, original_image_shape, image_shape,
                                   window, scale, active_class_ids):
                """Takes attributes of an image and puts them in one 1D array.
            
                image_id: An int ID of the image. Useful for debugging.
                original_image_shape: [H, W, C] before resizing or padding.
                image_shape: [H, W, C] after resizing and padding
                window: (y1, x1, y2, x2) in pixels. The area of the image where the real
                        image is (excluding the padding)
                scale: The scaling factor applied to the original image (float32)
                active_class_ids: List of class_ids available in the dataset from which
                    the image came. Useful if training on images from multiple datasets
                    where not all classes are present in all datasets.
                """
                meta = np.array(
                    [image_id] +                  # size=1
                    list(original_image_shape) +  # size=3
                    list(image_shape) +           # size=3
                    list(window) +                # size=4 (y1, x1, y2, x2) in image cooredinates
                    [scale] +                     # size=1
                    list(active_class_ids)        # size=num_classes
                )
                return meta

            So the meta layout is:
                    [image_id] +                  # size=1; can simply be set to 0
                    list(original_image_shape) +  # size=3; h, w, c of the original image
                    list(image_shape) +           # size=3; h, w, c of the resized image. Because I feed pre-cropped tiles, original_image_shape and image_shape are identical in my case
                    list(window) +                # size=4; (y1, x1, y2, x2) in image coordinates, used for display. x1, y1 are 0 and x2, y2 are the window size; since the tiles are pre-cropped, x2, y2 equal the image h, w. I am being lazy here so that predicted coordinates do not need to be mapped back to a window or to the original image afterwards
                    [scale] +                     # size=1; scaling ratio = long side after resize / long side before resize. As above, my images are unchanged by resizing, so this is 1
                    list(active_class_ids)        # size=num_classes; one entry per class (including background), all set to 0 following def mold_inputs(images)
            The corresponding C++ code is as follows (a quick check on the meta length follows it):
             

            void detectBatch::compose_image_meta()
            {
                /*
                    *Function:  compose_image_meta
                    *Description:  build the image meta data
                    *Calls:
                        1. ****
                    *Called By:
                      1. ****
            
                    *InputList:
                      1. NULL
            
                    *OutPut:
                      1. NULL
                */
                int imglongSide,inputlongSide;
                image_meta[0]=0;
                //original_image_shape: [H, W, C] before resizing or padding.
                image_meta[1]=inputImg_h;
                image_meta[2]=inputImg_w;
                image_meta[3]=inputImg_c;
                imglongSide=image_meta[1]>=image_meta[2]?image_meta[1]:image_meta[2];
            
            
                //image_shape: [H, W, C] after resizing and padding
                image_meta[4]=input_height;
                image_meta[5]=input_width;
                image_meta[6]=input_channels;
                inputlongSide=image_meta[4]>=image_meta[5]?image_meta[4]:image_meta[5];
            
                //window: (y1, x1, y2, x2) in pixels. The area of the image where the real image is (excluding the padding)
                image_meta[7]=0;
                image_meta[8]=0;
                image_meta[9]=input_height;//my images are pre-cropped before being fed in, so the window size equals the actual image size
                image_meta[10]=input_width;
            
                //scale: The scaling factor applied to the original image (float32)
                image_meta[11]=inputlongSide/imglongSide;
            
                //active_class_ids: List of class_ids available in the dataset from which the image came.
                for(int i=TF_MASKRCNN_IMAGE_METADATA_LENGTH-num_classes;i<TF_MASKRCNN_IMAGE_METADATA_LENGTH;i++)
                {
                    image_meta[i]=0;
                }
            
                inputMetadataTensor=tensorflow::Tensor(tensorflow::DT_FLOAT, {batch_size, TF_MASKRCNN_IMAGE_METADATA_LENGTH});
            
                auto inputMetadataTensorMap=inputMetadataTensor.tensor<float,2>();
                for(int j=0;j<batch_size;j++)
                {
                    for(int i=0;i<TF_MASKRCNN_IMAGE_METADATA_LENGTH;i++)
                    {
                        //std::cout<<"image_meta["<<i<<"] is "<<image_meta[i]<<std::endl;
                        inputMetadataTensorMap(j,i)=image_meta[i];
            
                    }
                }
            
            }
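
            As a sanity check on the buffer size used above: the meta length follows directly from compose_image_meta, so TF_MASKRCNN_IMAGE_METADATA_LENGTH (the constant the code above relies on) must satisfy the relation below; with NUM_CLASSES = 1 + 20 = 21 as in my config that is 12 + 21 = 33.

            // 1 (image_id) + 3 (original_image_shape) + 3 (image_shape)
            // + 4 (window) + 1 (scale) + num_classes (active_class_ids)
            assert(TF_MASKRCNN_IMAGE_METADATA_LENGTH == 12 + num_classes);  // needs <cassert>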

        • Build the tensorflow::Tensor for input_anchors_1. What is input_anchors_1? It corresponds to the key img_anchors_ph in infere_from_pb.py, whose value is image_anchors; def get_anchors(image_shape, config) shows how image_anchors is produced. This part is the most involved and also the most important; understanding this code helps a lot with understanding Mask R-CNN.

          Here is def get_anchors(image_shape, config) from infere_from_pb.py:
          def get_anchors(image_shape, config):
              """Returns anchor pyramid for the given image size."""
              # Compute the (height, width) of the feature map at each backbone stage,
              # i.e. the size after each down-sampling step (pooling, strided conv, etc.;
              # I did not dig into the details of this part).
              backbone_shapes = compute_backbone_shapes(config, image_shape)
              # Cache anchors and reuse if image shape is the same
              _anchor_cache = {}
              if not tuple(image_shape) in _anchor_cache:
                  # If anchors were already computed for an image of this size, reuse the
                  # cached result instead of recomputing. A nice touch I had never thought
                  # of. My inputs are pre-cropped 512*512 tiles, so once the anchors are
                  # computed there is no need to check or recompute anything; this should
                  # save time on very large images, though I have not measured it.
                  # Generate Anchors
                  a = utils.generate_pyramid_anchors(
                      config.RPN_ANCHOR_SCALES,
                      config.RPN_ANCHOR_RATIOS,
                      backbone_shapes,
                      config.BACKBONE_STRIDES,
                      config.RPN_ANCHOR_STRIDE)
                  # Keep a copy of the latest anchors in pixel coordinates because
                  # it's used in inspect_model notebooks.
                  # TODO: Remove this after the notebook are refactored to not use it
                  anchors = a
                  # Normalize coordinates
                  _anchor_cache[tuple(image_shape)] = utils.norm_boxes(a, image_shape[:2])
              return _anchor_cache[tuple(image_shape)]
          • From the code above, the anchors are mainly produced by utils.generate_pyramid_anchors and its five arguments:
            the ones worth studying are backbone_shapes and the generate_pyramid_anchors function itself; the other four arguments are just plain lists, as seen in class CocoConfig(Config)
            • First, where does backbone_shapes come from? It is returned by compute_backbone_shapes(config, image_shape), whose code is:
              def compute_backbone_shapes(config, image_shape):
                  """Computes the width and height of each stage of the backbone network.
                  
                  Returns:
                      [N, (height, width)]. Where N is the number of stages
                  """
                  # Currently supports ResNet only
                  assert config.BACKBONE in ["resnet50", "resnet101"]
                  return np.array(
                      [[int(math.ceil(image_shape[0] / stride)),
                          int(math.ceil(image_shape[1] / stride))]
                          for stride in config.BACKBONE_STRIDES])
              Looking at this code: image_shape is the size of the (resized) input image, mainly its height and width, and the function returns an [N, (height, width)] array, where N is the number of backbone stages (here meaning the feature-map levels that anchors are taken from, one level per stage; as described in the Mask R-CNN paper, anchors are extracted from several feature maps of the backbone, five of them in this setup, which matches BACKBONE_STRIDES = [4, 8, 16, 32, 64] in class CocoConfig(Config) having five entries). The divisions image_shape[0] / stride etc. compute the height and width of the feature map at each stage, so BACKBONE_STRIDES = [4, 8, 16, 32, 64] is simply the factor by which the image has shrunk by the time it reaches that stage. For a 512*512 image the first stage gives 128*128, then 64*64, 32*32, 16*16 and 8*8, so the returned [N, (height, width)] array is [ [128,128],[64,64],[32,32],[16,16],[8,8] ]
              The C++ code that builds the [N, (height, width)] array:
              float BACKBONE_STRIDES[5]={4, 8, 16, 32, 64};//used to compute the (height, width) of the feature map at each backbone stage (i.e. after each pooling / strided-conv down-sampling step; I did not dig into the details)
              
              int backbone_strides_num =5;
              int backbone_shape[5][2];//for backbone_shape
              for(int i=0;i<backbone_strides_num;i++)
                      {
                          backbone_shape[i][0]=ceil(inputImg_h/BACKBONE_STRIDES[i]);
                          backbone_shape[i][1]=ceil(inputImg_w/BACKBONE_STRIDES[i]);
                      }

            • Step two: what does utils.generate_pyramid_anchors actually do? The function exists in the Mask_RCNN/mrcnn utils file of both matterport's and parai's repos; since we installed matterport's Mask R-CNN at the start, just read that one. The code is:
              def generate_pyramid_anchors(scales, ratios, feature_shapes, feature_strides,
                                           anchor_stride):
                  """Generate anchors at different levels of a feature pyramid. Each scale
                  is associated with a level of the pyramid, but each ratio is used in
                  all levels of the pyramid.
              
                  Returns:
                  anchors: [N, (y1, x1, y2, x2)]. All generated anchors in one array. Sorted
                      with the same order of the given scales. So, anchors of scale[0] come
                      first, then anchors of scale[1], and so on.
                  """
                  # Anchors
                  # [anchor_count, (y1, x1, y2, x2)]
                  anchors = []
                  for i in range(len(scales)):
                      anchors.append(generate_anchors(scales[i], ratios, feature_shapes[i],
                                                      feature_strides[i], anchor_stride))
                  return np.concatenate(anchors, axis=0)
              
              

              The real work is done by generate_anchors, which also lives in the utils file:
              
              def generate_anchors(scales, ratios, shape, feature_stride, anchor_stride):
                  """
                  scales: 1D array of anchor sizes in pixels. Example: [32, 64, 128]
                  ratios: 1D array of anchor ratios of width/height. Example: [0.5, 1, 2]
                  shape: [height, width] spatial shape of the feature map over which
                          to generate anchors.
                  feature_stride: Stride of the feature map relative to the image in pixels.
                  anchor_stride: Stride of anchors on the feature map. For example, if the
                      value is 2 then generate anchors for every other feature map pixel.
                  """
                  # Get all combinations of scales and ratios
                  #combine every scale with every ratio
                  #in the config these are:
                  #RPN_ANCHOR_SCALES = (32, 64, 128, 256, 512)
                  #RPN_ANCHOR_RATIOS = [0.5, 1, 2]
                  
                  #see how np.meshgrid(x1, x2) works: it pairs every element of x1 with every element of x2 (by position) and returns the resulting matrices; assume the outer loop passed in scales = RPN_ANCHOR_SCALES[0]
                  
                  scales, ratios = np.meshgrid(np.array(scales), np.array(ratios))
                  '''
                  After np.meshgrid:  scales = array([[32],
                                                      [32],
                                                      [32]])
                                      ratios = array([[0.5],
                                                      [1. ],
                                                      [2. ]])
                  '''
              
                  '''
                  scales and ratios are then flattened, e.g. afterwards scales = [32, 32, 32]
                  '''
                  scales = scales.flatten()
                  ratios = ratios.flatten()
              
                  '''
                  Scale each anchor size by each ratio to get the widths and heights of all the different size/ratio anchor combinations
                  '''
                  # Enumerate heights and widths from scales and ratios
                  heights = scales / np.sqrt(ratios)
                  widths = scales * np.sqrt(ratios)
              
                  '''
                  Compute the y, x offsets, i.e. the anchor center positions. shape[0], shape[1] are the
                  feature map height and width (already scaled down by config.BACKBONE_STRIDES = {4, 8, 16, 32, 64}
                  as above). anchor_stride is 1 in the config, i.e. anchors are spaced one cell apart.
                  feature_stride iterates over config.BACKBONE_STRIDES = {4, 8, 16, 32, 64}; multiplying by
                  feature_stride maps the np.arange(0, shape[0], anchor_stride) values back to coordinates
                  in the original image (512*512 in my case).
                  '''
                  # Enumerate shifts in feature space
                  shifts_y = np.arange(0, shape[0], anchor_stride) * feature_stride
                  shifts_x = np.arange(0, shape[1], anchor_stride) * feature_stride
              
                  '''
                  Pairing up the center positions gives all combinations of box centers (positions only); the real pairing happens below
                  '''
                  shifts_x, shifts_y = np.meshgrid(shifts_x, shifts_y)
              
                  # Enumerate combinations of shifts, widths, and heights
              
              
                  '''
                  Pairing the widths/heights with the x/y center coordinates gives every combination of
                  box center and box size. Here is the idea, together with how meshgrid behaves:
                  say widths is the 1D array [e, f, g] and shifts_x = [[0,1],[2,3],[4,5]].
                  After np.meshgrid(widths, shifts_x):  box_widths = [[e,f,g],          
                                                                      [e,f,g],
                                                                      [e,f,g],
                                                                      [e,f,g],
                                                                      [e,f,g],
                                                                      [e,f,g] ]
                  so box_widths has shape 6x3;   likewise box_centers_x = [[0,0,0],          
                                                                           [1,1,1],
                                                                           [2,2,2],
                                                                           [3,3,3],
                                                                           [4,4,4],
                                                                           [5,5,5] ]
                  In other words every x is paired with every width, and every y with every height (positions only)
                  '''
                  box_widths, box_centers_x = np.meshgrid(widths, shifts_x)
                  box_heights, box_centers_y = np.meshgrid(heights, shifts_y)
                  
                  '''
                  Next comes np.stack; see the official docs or
                  https://blog.csdn.net/wgx571859177/article/details/80987459 (trying it yourself makes it
                  much clearer).
                  Assume box_centers_x is the 6x3 matrix above and box_centers_y has the same shape
                  (say its elements are 6,7,8,9,10,11 row by row). After the np.stack below, box_centers
                  has shape 6x3x2, and this is where the real pairing happens; read it as 6x(3x2):
                                                          box_centers=[
                                                                      [[6,0],[6,0],[6,0]],          
                                                                      [[7,1],[7,1],[7,1]],
                                                                      [[8,2],[8,2],[8,2]],
                                                                      [[9,3],[9,3],[9,3]],
                                                                      [[10,4],[10,4],[10,4]],
                                                                      [[11,5],[11,5],[11,5]] ]
                  After reshape([-1,2]), box_centers = [[6,0],[6,0],[6,0],[7,1],...], with shape 18x2,
                  i.e. a list of (y, x) pairs. box_sizes is built the same way, except its elements are
                  heights and widths.
                  '''
                  # Reshape to get a list of (y, x) and a list of (h, w)
                  box_centers = np.stack(
                      [box_centers_y, box_centers_x], axis=2).reshape([-1, 2])
                  box_sizes = np.stack([box_heights, box_widths], axis=2).reshape([-1, 2])
                  '''
                  box_centers - 0.5 * box_sizes and box_centers + 0.5 * box_sizes give the top-left and
                  bottom-right corner coordinates. After np.concatenate(axis=1), boxes has shape 18x4,
                  i.e. 18 x (y1, x1, y2, x2): 18 is the anchor count (just for this illustration; the real
                  number depends on the config and the computation above), and each (y1, x1, y2, x2) row is
                  one anchor.
                  '''
                  # Convert to corner coordinates (y1, x1, y2, x2)
                  boxes = np.concatenate([box_centers - 0.5 * box_sizes,
                                          box_centers + 0.5 * box_sizes], axis=1)
                  return boxes

              The corresponding C++ code for this part (a small sanity check on the resulting anchor count follows it):
               

              int finalBoxesRows=0;//accumulates the total number of box rows over the five RPN_ANCHOR_SCALES levels; you can ignore it for now
              
                      //generate_pyramid_anchors: generate anchors for each scale (5 of them in the config)
                      for(int j=0;j<rpn_anchor_scales_num;j++)
                      {
                          //generate_anchors
              
                          //Get all combinations of scales and ratios
                          Eigen::RowVectorXf scalesVec(1);//temporarily holds the current element of RPN_ANCHOR_SCALES[5]={32, 64, 128, 256, 512}, used to fill scalesMat
                          Eigen::VectorXf ratiosVec(rpn_anchor_ratios_num);
                          Eigen::MatrixXf scalesMat=Eigen::MatrixXf(rpn_anchor_ratios_num, 1);//();
                          Eigen::MatrixXf ratiosMat=Eigen::MatrixXf(rpn_anchor_ratios_num, 1);//();
                          Eigen::MatrixXf heightsMat;//=Eigen::MatrixXf(rpn_anchor_ratios_num, 1);//();
                          Eigen::MatrixXf widthsMat;//=Eigen::MatrixXf(rpn_anchor_ratios_num, 1);//();
              
              
                          //the following implements the Python line:
                          /*
                           scales, ratios = np.meshgrid(np.array(scales), np.array(ratios))
                           */
              
                          scalesVec(0)=(RPN_ANCHOR_SCALES[j]);
              
                          //build np.array(ratios)
                          for(int i=0;i<rpn_anchor_ratios_num;i++)
                          {
                              ratiosVec(i)=RPN_ANCHOR_RATIOS[i];
                          }
                          for(int i=0;i<ratiosMat.cols();i++)
                          {
                              ratiosMat.col(i)<<ratiosVec;
                          }
              
              
                          //build np.array(scales)
                          //std::cout<<"scalesMat is <<"<<scalesMat.cols()<<std::endl;
                          for(int i=0;i<scalesMat.rows();i++)
                          {
                              scalesMat.row(i)<<scalesVec;
              
                          }
              
              
                          //build heights and widths; in Python these are length-3 vectors, but here they are 3x1 matrices so that the element-wise operations below work
                          //the Python code is:
                          /*
                              heights = scales / np.sqrt(ratios)
                              widths = scales * np.sqrt(ratios)
                           */
              
                          //Enumerate heights and widths from scales and ratios
                          heightsMat=scalesMat.cwiseQuotient(ratiosMat.cwiseSqrt());
                          widthsMat=scalesMat.cwiseProduct(ratiosMat.cwiseSqrt());
              
                          //build shifts_x, shifts_y
                          //the Python code is:
                          /*
                          shifts_y = np.arange(0, shape[0], anchor_stride) * feature_stride
                          shifts_x = np.arange(0, shape[1], anchor_stride) * feature_stride
                          shifts_x, shifts_y = np.meshgrid(shifts_x, shifts_y)
                           */
                          //Enumerate shifts in feature space
                          //first do:  shifts_y = np.arange(0, shape[0], anchor_stride) * feature_stride
                          //            shifts_x = np.arange(0, shape[1], anchor_stride) * feature_stride
              
                          int step=RPN_ANCHOR_STRIDE,low=0,hight_y=backbone_shape[j][0],hight_x=backbone_shape[j][1];//shape[0], shape[1] and anchor_stride
                          Eigen::RowVectorXf shifts_y;//row vector
                          Eigen::RowVectorXf shifts_x;
                          int realsize_y=((hight_y-low)/step);
                          int realsize_x=((hight_x-low)/step);
                          shifts_y.setLinSpaced(realsize_y,low,low+step*(realsize_y-1));
                          shifts_x.setLinSpaced(realsize_x,low,low+step*(realsize_x-1));
                          shifts_y*=BACKBONE_STRIDES[j];//multiply by feature_stride; in the Python code feature_stride is the BACKBONE_STRIDES[j] passed in by the outer loop
                          shifts_x*=BACKBONE_STRIDES[j];//multiply by feature_stride; in the Python code feature_stride is the BACKBONE_STRIDES[j] passed in by the outer loop
              
                          /*then do   shifts_x, shifts_y = np.meshgrid(shifts_x, shifts_y)
                          to build the final shifts_x, shifts_y; note that after np.meshgrid they are 2D matrices
                          */
                          //build the shifts_x, shifts_y matrices
                          Eigen::MatrixXf shifts_xMat(shifts_y.cols(),shifts_x.cols())
                                  ,shifts_yMat(shifts_y.cols(),shifts_x.cols());
                          for(int i=0;i<shifts_xMat.rows();i++)
                          {
                              shifts_xMat.row(i)=shifts_x;
              
                          }
                          for(int i=0;i<shifts_yMat.cols();i++)
                          {
                              shifts_yMat.col(i)=shifts_y;
                          }
              
              
              
              
              
                          //implement the Python code:
                          /*
                              box_widths, box_centers_x = np.meshgrid(widths, shifts_x)
                              box_heights, box_centers_y = np.meshgrid(heights, shifts_y)
              
                              # Reshape to get a list of (y, x) and a list of (h, w)
                              box_centers = np.stack([box_centers_y, box_centers_x], axis=2).reshape([-1, 2])
                              box_sizes = np.stack([box_heights, box_widths], axis=2).reshape([-1, 2])
              
                              # Convert to corner coordinates (y1, x1, y2, x2)
                              boxes = np.concatenate([box_centers - 0.5 * box_sizes,
                                          box_centers + 0.5 * box_sizes], axis=1)
                              return boxes
              
                           */
                          //Enumerate combinations of shifts, widths, and heights
                          //first do: box_widths, box_centers_x = np.meshgrid(widths, shifts_x)
                          //          box_heights, box_centers_y = np.meshgrid(heights, shifts_y)
                          //first flatten heightsMat and widthsMat into row vectors for easier assignment
                          Eigen::RowVectorXf heightsMatFlat(Eigen::Map<Eigen::VectorXf>(heightsMat.data(),heightsMat.rows()*heightsMat.cols()));
                          Eigen::RowVectorXf widthsMatFlat(Eigen::Map<Eigen::VectorXf>(widthsMat.data(),widthsMat.rows()*widthsMat.cols()));
              
                          /*In np.meshgrid(widths, shifts_x) above, widths is a length-3 vector and shifts_x is a
                          2D matrix, so the resulting matrices have as many columns as widths has elements and as
                          many rows as shifts_x has elements when flattened row-wise (e.g. 6 for a 2*3 shifts_x).
                          box_centers_x therefore has rows(shifts_x)*cols(shifts_x) rows, and each of its columns is
                          shifts_x flattened row by row. Because Eigen stores matrices column-major, shifts_xMat has
                          to be transposed in the C++ code so that mapping it with Eigen::Map gives exactly that
                          row-wise flattening; shifts_yMat gets the same treatment to produce shifts_yMatFlat.
                          box_widths and box_heights can simply be filled from widthsMatFlat and heightsMatFlat,
                          which are already 1D vectors.
                          */
                          shifts_xMat.transposeInPlace();
                          shifts_yMat.transposeInPlace();
                          Eigen::RowVectorXf shifts_yMatFlat(Eigen::Map<Eigen::VectorXf>(shifts_yMat.data(),shifts_yMat.rows()*shifts_yMat.cols()));
                          //Eigen::RowVectorXf shifts_xMatFlat(Eigen::Map<Eigen::VectorXf>(shifts_xMat.data(),shifts_xMat.rows()*shifts_xMat.cols(),Eigen::ColMajor));
                          Eigen::RowVectorXf shifts_xMatFlat(Eigen::Map<Eigen::VectorXf>(shifts_xMat.data(),shifts_xMat.rows()*shifts_xMat.cols()));
                          Eigen::MatrixXf box_widthsMat=Eigen::MatrixXf(shifts_xMatFlat.cols(),widthsMatFlat.cols());//();
                          Eigen::MatrixXf box_center_xMat=Eigen::MatrixXf(shifts_xMatFlat.cols(),widthsMatFlat.cols());//();
                          Eigen::MatrixXf box_heightsMat=Eigen::MatrixXf(shifts_yMatFlat.cols(),heightsMatFlat.cols());//();
                          Eigen::MatrixXf box_center_yMat=Eigen::MatrixXf(shifts_yMatFlat.cols(),heightsMatFlat.cols());//();
                          for(int i=0;i<box_widthsMat.rows();i++)
                          {
                              box_widthsMat.row(i)=widthsMatFlat;
                              box_heightsMat.row(i)=heightsMatFlat;
                          }
                          for(int i=0;i<box_heightsMat.cols();i++)
                          {
                              box_center_xMat.col(i)=shifts_xMatFlat;
                              box_center_yMat.col(i)=shifts_yMatFlat;
                          }
              
              
                          //Convert to corner coordinates (y1, x1, y2, x2)
                          //('e below abbreviates "element"; all additions/subtractions are element-wise,
                          // i.e. between elements at corresponding positions)
                          //Python approach: box_centers_y and box_centers_x are stacked into a matrix A whose rows,
                          //after reshape to [-1,2], have the format (box_center_y'e, box_center_x'e); box_sizes B is
                          //built the same way, its rows being (box_height'e, box_width'e).
                          //Then A-0.5*B and A+0.5*B give matrices C and D whose rows are respectively
                          //(box_center_y'e-0.5*box_height'e, box_center_x'e-0.5*box_width'e) and
                          //(box_center_y'e+0.5*box_height'e, box_center_x'e+0.5*box_width'e);
                          //concatenating C and D gives a matrix E whose rows are (y1, x1, y2, x2).
                          //In Eigen3 it is done differently:
                          //we already have box_center_yMat, box_center_xMat, box_heightsMat, box_widthsMat
                          //(abbreviated center_yMat, center_xMat, heightMat, widthMat):
                          //center_yMat-0.5*heightMat=y1Mat
                          //center_yMat+0.5*heightMat=y2Mat
                          //center_xMat-0.5*widthMat=x1Mat
                          //center_xMat+0.5*widthMat=x2Mat
                          //then assemble the boxes matrix whose rows are (y1Mat's e, x1Mat's e, y2Mat's e, x2Mat's e)
              
                          //perform:
                          //boxes = np.concatenate([box_centers - 0.5 * box_sizes,
                          //box_centers + 0.5 * box_sizes], axis=1)
                          //boxes has the form [(y1, x1, y2, x2), ..., ...]
                          Eigen::MatrixXf y1Mat=box_center_yMat-box_heightsMat*0.5;
                          Eigen::MatrixXf x1Mat=box_center_xMat-box_widthsMat*0.5;
                          Eigen::MatrixXf y2Mat=box_center_yMat+box_heightsMat*0.5;
                          Eigen::MatrixXf x2Mat=box_center_xMat+box_widthsMat*0.5;
                          y1Mat.transposeInPlace();
                          x1Mat.transposeInPlace();
                          y2Mat.transposeInPlace();
                          x2Mat.transposeInPlace();
                          Eigen::RowVectorXf y1MatFlat(Eigen::Map<Eigen::VectorXf>(y1Mat.data(),y1Mat.rows()*y1Mat.cols()));
                          Eigen::RowVectorXf x1MatFlat(Eigen::Map<Eigen::VectorXf>(x1Mat.data(),x1Mat.rows()*x1Mat.cols()));
                          Eigen::RowVectorXf y2MatFlat(Eigen::Map<Eigen::VectorXf>(y2Mat.data(),y2Mat.rows()*y2Mat.cols()));
                          Eigen::RowVectorXf x2MatFlat(Eigen::Map<Eigen::VectorXf>(x2Mat.data(),x2Mat.rows()*x2Mat.cols()));
                          Eigen::MatrixXf boxes(y1Mat.rows()*y1Mat.cols(),4);//note: this boxes is not the boxes of the Python code (that one corresponds to finalBox below)
                          boxes.col(0)=y1MatFlat;
                          boxes.col(1)=x1MatFlat;
                          boxes.col(2)=y2MatFlat;
                          boxes.col(3)=x2MatFlat;
                          //at this point the boxes for a single RPN_ANCHOR_SCALES[j] level are done
                          //push them into the container
                          
                          boxesVec.push_back(boxes);
                          finalBoxesRows+=boxes.rows();//accumulate the row count over the five RPN_ANCHOR_SCALES levels
                          //break;
                      }
                      //create the finalBoxesRows x 4 matrix finalBox (this corresponds to boxes in the Python code),
                      //i.e. stack all the per-level boxes above into one matrix of the form [(y1, x1, y2, x2), ..., ...]
                      finalBox=Eigen::MatrixXf (finalBoxesRows,4);
                      //Eigen::VectorXf a(3);
                      //Eigen::VectorXf b(4);
                      //Eigen::VectorXf c(7);
                      //take each boxes matrix out of boxesVec and build the final finalBox matrix (the Python boxes)
                      //this completes the construction of boxes (in the Python sense)
                      int beginX=0;
                      for(int i=0;i<boxesVec.size();i++)
                      {
                           //mat1.block<rows,cols>(i,j)
                          //block assignment
                          finalBox.block(beginX,0,boxesVec[i].rows(),boxesVec[i].cols())=boxesVec[i];
                          beginX+=boxesVec[i].rows();
                          //tensorflow::Tensor matTensor(tensorflow::DT_FLOAT,{boxesVec[i].rows(),boxesVec[i].cols()});
                      }
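
              As a quick sanity check (my assumption: 512*512 input, RPN_ANCHOR_STRIDE = 1 and the 3 ratios from the config, i.e. 3 anchors per feature-map cell over the 5 pyramid levels), the accumulated anchor count should come out to:

                      // 3 * (128*128 + 64*64 + 32*32 + 16*16 + 8*8) = 3 * 21824 = 65472 anchors
                      assert(finalBoxesRows == 65472);  // needs <cassert>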

            • Step three: normalization
                  # Normalize coordinates
                    _anchor_cache[tuple(image_shape)] = utils.norm_boxes(a, image_shape[:2])
                  return _anchor_cache[tuple(image_shape)]
              #Python code
              '''
              Roughly: take the boxes from the previous step (shaped [(y1,x1,y2,x2), ..., ...]), subtract the offset [0, 0, 1, 1] element-wise, then divide element-wise by [h - 1, w - 1, h - 1, w - 1]
              '''
              def norm_boxes(boxes, shape):
                  """Converts boxes from pixel coordinates to normalized coordinates.
                  boxes: [N, (y1, x1, y2, x2)] in pixel coordinates
                  shape: [..., (height, width)] in pixels
                  Note: In pixel coordinates (y2, x2) is outside the box. But in normalized
                  coordinates it's inside the box.
                  Returns:
                      [N, (y1, x1, y2, x2)] in normalized coordinates
                  """
                  h, w = shape
                  scale = np.array([h - 1, w - 1, h - 1, w - 1])
                  shift = np.array([0, 0, 1, 1])
                  return np.divide((boxes - shift), scale).astype(np.float32)

              The C++ code:
               

              /*get the normalized finalBox
                      the Python code is:
                      scale = np.array([h - 1, w - 1, h - 1, w - 1])
                      shift = np.array([0, 0, 1, 1])
                      return np.divide((boxes - shift), scale).astype(np.float32)
                      */
              
                      //first create the scale and shift vectors
                      Eigen::MatrixXf scaleMat_1r(1,finalBox.cols());
                      Eigen::MatrixXf shiftMat_1r(1,finalBox.cols());
                      scaleMat_1r<<float(inputImg_h-1),float(inputImg_w-1),float(inputImg_h-1),float(inputImg_w-1);
                      shiftMat_1r<<0.f,0.f,1.f,1.f;
                      //scaleMat_1r and shiftMat_1r above are row vectors; now build the corresponding matrices
                      //with the same shape as finalBox
                      Eigen::MatrixXf scaleMat=scaleMat_1r.colwise().replicate(finalBox.rows());//build scaleMat by replicating the row finalBox.rows() times
                      Eigen::MatrixXf shiftMat=shiftMat_1r.colwise().replicate(finalBox.rows());//same as above
                      Eigen::MatrixXf tmpMat=finalBox-shiftMat;//subtract the shift element-wise from finalBox
                      finalBox_norm=tmpMat.cwiseQuotient(scaleMat);//divide element-wise by scale
                      //finalBox_norm now corresponds to boxes in the Python code; next pack the finalBox_norm matrix into the Eigen::Tensor inputAnchorsTensor_temp
                      //and then copy inputAnchorsTensor_temp into the tensorflow::Tensor inputAnchorsTensor that is finally fed to the model as the anchor boxes
              
              
                      inputAnchorsTensor=tensorflow::Tensor(tensorflow::DT_FLOAT,{batch_size,finalBox_norm.rows(),finalBox_norm.cols()});//initialize inputAnchorsTensor
                      //float *p=inputAnchorsTensor.flat<float>().data();
                      //build the Eigen::Tensor inputAnchorsTensor_temp from the finalBox_norm matrix
                      Eigen::Tensor<float,3>inputAnchorsTensor_temp(1,finalBox_norm.rows(),finalBox_norm.cols());
                      for(int i=0;i<finalBox_norm.rows();i++){
              
                          Eigen::Tensor<float,1>eachrow(finalBox_norm.cols());//temporary storage for one row of the finalBox_norm matrix
                          //copy one row of finalBox_norm into eachrow
                          eachrow.setValues({finalBox_norm.row(i)[0],finalBox_norm.row(i)[1],finalBox_norm.row(i)[2],finalBox_norm.row(i)[3]});
                          //put eachrow into the corresponding row of inputAnchorsTensor_temp
                          inputAnchorsTensor_temp.chip(i,1)=eachrow;
                      }
                      //copy inputAnchorsTensor_temp into inputAnchorsTensor; note that the two have different types
                      auto showMap=inputAnchorsTensor.tensor<float,3>();
                      for(int b=0;b<showMap.dimension(0);b++)
                      {
                          for(int r=0;r<showMap.dimension(1);r++)
                          {
                              for(int c=0;c<showMap.dimension(2);c++)
                              {
                                  
                                  showMap(b,r,c)=inputAnchorsTensor_temp(0,r,c);//the index 0 here is because
                                  //all images in my batch have the same size, so their final anchor boxes are identical
                                  //and one copy is enough; I recommend keeping all images in a batch the same size, it makes things much easier
                              }
                          }
                      }
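
              With all three input tensors ready, feeding them to the network from C++ is one Session::Run call. A minimal sketch, assuming the session was created as in the loading snippet earlier; the three input names are the ones quoted from infere_from_pb.py above, but the two output node names below are placeholders, so take the real ones from your frozen graph / infere_from_pb.py:

              std::vector<std::pair<std::string, tensorflow::Tensor>> inputs = {
                  {"input_image_1",      inputImageTensor},     // filled by CVMats_to_Tensor
                  {"input_image_meta_1", inputMetadataTensor},  // filled by compose_image_meta
                  {"input_anchors_1",    inputAnchorsTensor},   // filled by the anchor code above
              };
              std::vector<tensorflow::Tensor> outputs;
              tensorflow::Status s = session->Run(
                  inputs,
                  {"output_detections", "output_mrcnn_mask"},   // placeholder names, check your .pb
                  {}, &outputs);
              if (!s.ok()) { std::cerr << s.ToString() << std::endl; }
              // outputs[0]: detections, shape [batch, DETECTION_MAX_INSTANCES, 6] -> (y1, x1, y2, x2, class_id, score)
              // outputs[1]: mrcnn_mask, shape [batch, DETECTION_MAX_INSTANCES, 28, 28, num_classes]
              auto detections = outputs[0].tensor<float, 3>();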

      • At this point all the tensors fed into the model are done. After the forward pass, the network's results also come back as tensors; how do we extract what we want from them? Start with parai/Mask_RCNN's infere_from_pb.py:
                detections, mrcnn_class, mrcnn_bbox, mrcnn_mask, rois = \
                    sess.run([detectionsT, mrcnn_classT, mrcnn_bboxT, mrcnn_maskT, roisT],
                        feed_dict={img_ph: molded_images, img_meta_ph: image_metas,             
                    img_anchors_ph:image_anchors})
                # the lines above run the model
                # box coordinates, class ids and confidence scores are stored in detections; the segmentation results are in mrcnn_mask
                # we only need to look at how unmold_detections is implemented
                results = []
                for i, image in enumerate(images):
                    final_rois, final_class_ids, final_scores, final_masks =\
                        unmold_detections(detections[i], mrcnn_mask[i],
                                          image.shape, molded_images[i].shape,
                                          windows[i])
                    results.append({
                        "rois": final_rois,
                        "class_ids": final_class_ids,
                        "scores": final_scores,
                        "masks": final_masks,
                    })

        The function doing the main work here is unmold_detections(...). It appears in both infere_from_pb.py and mrcnn/model.py and is identical in both.
        Its Python code is as follows:
         

        def unmold_detections(detections, mrcnn_mask, original_image_shape, image_shape, window):
            """Reformats the detections of one image from the format of the neural
                network output to a format suitable for use in the rest of the
                application.
        
                detections: [N, (y1, x1, y2, x2, class_id, score)] in normalized coordinates
                This is the network's box output: N is the number of detected objects, and each row
                holds the four coordinates + class id + score.
        
                mrcnn_mask: [N, height, width, num_classes]
                This is the network's segmentation output.
        
                original_image_shape: [H, W, C] Original image shape before resizing
                The original image size.
        
                image_shape: [H, W, C] Shape of the image after resizing and padding
                The size of the image actually fed to the network.
        
                window: [y1, x1, y2, x2] Pixel coordinates of box in the image where the real
                        image is excluding the padding.
                This is the window size.
                
        
                Returns:
                i.e. the boxes plus, for each box, its class id, its score and its mask
                boxes: [N, (y1, x1, y2, x2)] Bounding boxes in pixels
                class_ids: [N] Integer class IDs for each bounding box
                scores: [N] Float probability scores of the class_id
                masks: [height, width, num_instances] Instance masks
        
                """
        
                        
        
        
            # How many detections do we have?
            # Detections array is padded with zeros. Find the first class_id == 0.
            # get the indices where class_id == 0, since class 0 is the background
            zero_ix = np.where(detections[:, 4] == 0)[0]  # the [0] is because np.where returns a tuple whose first element holds the indices
        
            # index of the first element whose class is 0
            N = zero_ix[0] if zero_ix.shape[0] > 0 else detections.shape[0]
        
        
        
            # Take everything before index N, i.e. the detections whose class is not 0. This presumably
            # works because the detections array is padded with zero rows at the end, so everything
            # before the first class_id == 0 entry is a real detection; I did not dig into it. In the
            # C++ code you can simply iterate over every row of detections and keep or drop it based on
            # whether its class is 0.
            # Extract boxes, class_ids, scores, and class-specific masks
            boxes = detections[:N, :4]
            class_ids = detections[:N, 4].astype(np.int32)
            scores = detections[:N, 5]
            masks = mrcnn_mask[np.arange(N), :, :, class_ids]
        
        
            # normalization; since I feed pre-cropped tiles to the network, this step can be skipped in my case
            # Translate normalized coordinates in the resized image to pixel
            # coordinates in the original image before resizing
            window = utils.norm_boxes(window, image_shape[:2])
            wy1, wx1, wy2, wx2 = window
            shift = np.array([wy1, wx1, wy1, wx1])
            wh = wy2 - wy1  # window height
            ww = wx2 - wx1  # window width
            scale = np.array([wh, ww, wh, ww])
            # Convert boxes to normalized coordinates on the window
            boxes = np.divide(boxes - shift, scale)
            # Convert boxes to pixel coordinates on the original image
            boxes = utils.denorm_boxes(boxes, original_image_shape[:2])
        
            # Filter out detections with zero area. Happens in early training when
            # network weights are still random
            #从boxes里面找出宽高小于0的索引
            exclude_ix = np.where(
                (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1]) <= 0)[0]
            
            #如果宽高小于0的索引个数不为0,就从boxe里面删除这些宽高小于0的索引
            if exclude_ix.shape[0] > 0:
                boxes = np.delete(boxes, exclude_ix, axis=0)
                class_ids = np.delete(class_ids, exclude_ix, axis=0)
                scores = np.delete(scores, exclude_ix, axis=0)
                masks = np.delete(masks, exclude_ix, axis=0)
                N = class_ids.shape[0]
            
            #
        
            #经过上一步的处理已经获取到类别不为0(不是背景)且尺寸不小于0的索引
            #下一步计算这些索引对应的mask,因为我只要目标框,所以c++代码中我没有计算mask,想计算mask
            #可以根据下面的python代码用c++实现相同的效果就行了
            # Resize masks to original image size and set boundary threshold.
            full_masks = []
            for i in range(N):
                # Convert neural network mask to full size mask
                full_mask = utils.unmold_mask(masks[i], boxes[i], original_image_shape)
                full_masks.append(full_mask)
            full_masks = np.stack(full_masks, axis=-1)\
                if full_masks else np.empty(masks.shape[1:3] + (0,))
        
            return boxes, class_ids, scores, full_masks
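
        The trimming at the top of the function relies on the detections array being padded with all-zero rows. A tiny NumPy example of that logic (the numbers are made up for illustration):

        import numpy as np

        # Hypothetical padded detections array: [N, (y1, x1, y2, x2, class_id, score)]
        detections = np.array([
            [0.10, 0.10, 0.50, 0.50, 3, 0.98],   # real detection, class 3
            [0.20, 0.30, 0.60, 0.70, 1, 0.87],   # real detection, class 1
            [0.00, 0.00, 0.00, 0.00, 0, 0.00],   # zero padding starts here
            [0.00, 0.00, 0.00, 0.00, 0, 0.00],
        ])

        zero_ix = np.where(detections[:, 4] == 0)[0]  # np.where returns a tuple, hence [0]
        N = zero_ix[0] if zero_ix.shape[0] > 0 else detections.shape[0]
        print(N)                   # 2
        print(detections[:N, :4])  # the two valid boxes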

        The two key helpers used above are utils.norm_boxes and utils.denorm_boxes:

        def norm_boxes(boxes, shape):
            """Converts boxes from pixel coordinates to normalized coordinates.
            boxes: [N, (y1, x1, y2, x2)] in pixel coordinates  # format of the boxes array
            shape: [..., (height, width)] in pixels  # image size (fixed in my setup)
        
            Note: In pixel coordinates (y2, x2) is outside the box. But in normalized
            coordinates it's inside the box.
        
            Returns:
                [N, (y1, x1, y2, x2)] in normalized coordinates
            """
            h, w = shape
            scale = np.array([h - 1, w - 1, h - 1, w - 1])
            shift = np.array([0, 0, 1, 1])
            return np.divide((boxes - shift), scale).astype(np.float32)
        
        
        def denorm_boxes(boxes, shape):
            """Converts boxes from normalized coordinates to pixel coordinates.
            boxes: [N, (y1, x1, y2, x2)] in normalized coordinates  # format of the boxes array
            shape: [..., (height, width)] in pixels  # image size (fixed in my setup)
        
            Note: In pixel coordinates (y2, x2) is outside the box. But in normalized
            coordinates it's inside the box.
        
            Returns:
                [N, (y1, x1, y2, x2)] in pixel coordinates
            """
            h, w = shape
            scale = np.array([h - 1, w - 1, h - 1, w - 1])
            shift = np.array([0, 0, 1, 1])
            return np.around(np.multiply(boxes, scale) + shift).astype(np.int32)
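
        A quick numeric check of the two helpers, assuming a 512x512 input (the box values are arbitrary, purely for illustration):

        import numpy as np
        from mrcnn import utils  # norm_boxes / denorm_boxes live in mrcnn/utils.py

        boxes_px = np.array([[10, 20, 110, 220]])        # (y1, x1, y2, x2) in pixels
        shape = (512, 512)

        normed = utils.norm_boxes(boxes_px, shape)       # approx [[0.0196, 0.0391, 0.2133, 0.4286]]
        restored = utils.denorm_boxes(normed, shape)     # back to [[10, 20, 110, 220]]
        print(normed)
        print(restored)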


        The C++ version of this part:

        struct boxInfo{
            int y1,x1,y2,x2;
            int classId=0;
            float scores=0.f;
            int boxNum=-1;
        };
        
        struct imageDetectInfo{
            int imageWidth=0;//not yet
            int imageHeight=0;//not yet
            int imageNum=-1;
            std::vector<boxInfo> detectInfo;
        
        };
        
        // std::vector<tensorflow::Tensor> &output_tensors : the final output of the network
        // std::vector<imageDetectInfo> &output_vec : stores the final results; this is my own
        // format, change it to whatever you need (it must already be sized to the batch size)
        void detectBatch::unmold_detections(std::vector<tensorflow::Tensor> &output_tensors,
        std::vector<imageDetectInfo> &output_vec)
        {

            // The detection boxes are the first element of the output_tensors container
            tensorflow::Tensor &detections_tensor=output_tensors[0];

            // Get an Eigen tensor view of detections_tensor;
            // boxes_tensor and detections_tensor point to the same memory

            auto  boxes_tensor=detections_tensor.tensor<float,3>();
            //Extract boxes, class_ids, scores, and class-specific masks
            //whose classId is not 0, because 0 is background
            //std::cout<<"resized_tensor is "<<resized_tensor.shape()<<std::endl;
            //std::cout<<"inputAnchorsTensor is "<<inputAnchorsTensor.shape()<<std::endl;
            //std::cout<<"inputMetadataTensor is "<<inputMetadataTensor.shape()<<std::endl;
            //std::cout<<"detections_tensor is "<<detections_tensor.shape()<<std::endl;

            // boxes_tensor / detections_tensor layout is [batch, N, (y1, x1, y2, x2, class_id, score)]
            // The first dimension is the number of images, so loop over the batch
            for(int imgNum=0;imgNum<boxes_tensor.dimension(0);imgNum++)
            {
                std::vector<Eigen::RowVectorXf> noZeroRow; // row vectors holding the non-background detections

                //struct imageDetectInfo imageDetectInfotmp ;
                // Loop over the second dimension, i.e. the detection slots of this image
                for(int boxNum=0;boxNum<boxes_tensor.dimension(1);boxNum++)
                {

                    // A class id greater than 0 means this detection is not background
                   if (boxes_tensor(imgNum,boxNum,4)>0)
                   {

                       // Copy the detection data into a row vector
                       Eigen::RowVectorXf eachrow(boxes_tensor.dimension(2));
                       eachrow<<boxes_tensor(imgNum,boxNum,0),
                               boxes_tensor(imgNum,boxNum,1),
                               boxes_tensor(imgNum,boxNum,2),
                               boxes_tensor(imgNum,boxNum,3),
                               boxes_tensor(imgNum,boxNum,4),
                               boxes_tensor(imgNum,boxNum,5);
                        noZeroRow.push_back(eachrow);

                   }

                }

                // Stack the rows collected above into a matrix
                Eigen::MatrixXf noZeroMat (noZeroRow.size(),6);
                for(int r=0;r<noZeroRow.size();r++)
                {
                    noZeroMat.row(r)=noZeroRow[r];

                }

                // noZeroMat does not hold the final boxes yet: they still need
                // de-normalization (and, if you use a display window, clipping)
                Eigen::MatrixXf boxMat(noZeroMat.rows(),4);
                Eigen::MatrixXf classSoresMat(noZeroMat.rows(),2);

                // noZeroMat layout is [(y1, x1, y2, x2, class_id, score), ...]
                // Take the (y1, x1, y2, x2) part of noZeroMat
                boxMat.block(0,0,boxMat.rows(),4)=noZeroMat.block(0,0,noZeroMat.rows(),4);

                // Take the (class_id, score) part of noZeroMat
                classSoresMat.block(0,0,classSoresMat.rows(),2)=noZeroMat.block(0,4,classSoresMat.rows(),2);
                //std::cout<<"noZeroMat "<<noZeroMat<<std::endl;
                //std::cout<<"boxMat "<<boxMat<<std::endl;


                //get the window in image meta
                // The image meta data was computed earlier (inputMetadataTensor is a class member)
                auto metaTensor=inputMetadataTensor.tensor<float,2>();

                // This part mirrors the Python norm_boxes(boxes, shape):
                // build windowMat and scale_rMat from the display window. My images are
                // already cropped to the network size and I do not display them in a
                // window, so this step can be skipped.
                Eigen::MatrixXf windowMat(1,4);
                Eigen::MatrixXf scale_rMat(1,4);
                windowMat<<metaTensor(0,7),metaTensor(0,8),
                        metaTensor(0,7),metaTensor(0,8);
                scale_rMat<<metaTensor(0,9)-metaTensor(0,7),
                            metaTensor(0,10)-metaTensor(0,8),
                            metaTensor(0,9)-metaTensor(0,7),
                            metaTensor(0,10)-metaTensor(0,8);
                //get shiftmat
                //boxMat=tmpMat.cwiseQuotient(scaleMat);//that is unnecessary
                //because in my case, shiftmat is [0,0,0,0] and scale is [1,1,1,1]


                // De-normalization: mirrors the Python denorm_boxes(boxes, shape)
                Eigen::MatrixXf shiftNorm_rMat(1,4); // shift row vector
                Eigen::MatrixXf scaleNorm_rMat(1,4); // scale row vector
                shiftNorm_rMat<<0,0,1,1;
                scaleNorm_rMat<<metaTensor(0,1)-1, // original image height/width from the meta data
                            metaTensor(0,2)-1,
                            metaTensor(0,1)-1,
                            metaTensor(0,2)-1;
                // Replicate the shift/scale row vectors along the rows so they match boxMat
                Eigen::MatrixXf shiftNormMat=shiftNorm_rMat.colwise().replicate(boxMat.rows());
                Eigen::MatrixXf scaleNormMat=scaleNorm_rMat.colwise().replicate(boxMat.rows());
                boxMat=boxMat.cwiseProduct(scaleNormMat); // element-wise multiply
                boxMat=boxMat+shiftNormMat;
                finalboxMat=boxMat; // final boxes for this image (finalboxMat is a class member in my project)


                //std::cout<<"final box mat is "<<finalboxMat<<std::endl;
                // Pack the final data into my own structures; adapt this to your needs.
                // (Note: the Python code rounds with np.around, the int cast below truncates.)
                struct imageDetectInfo imageDetectInfoTmp;
                for(int i=0;i<finalboxMat.rows();i++)
                {
                    struct boxInfo boxInfoTmp;
                    boxInfoTmp.y1=(int)(finalboxMat(i,0));
                    boxInfoTmp.x1=(int)(finalboxMat(i,1));
                    boxInfoTmp.y2=(int)(finalboxMat(i,2));
                    boxInfoTmp.x2=(int)(finalboxMat(i,3));
                    boxInfoTmp.classId=(int)(classSoresMat(i,0));
                    boxInfoTmp.scores=classSoresMat(i,1);
                    boxInfoTmp.boxNum=i;
                    imageDetectInfoTmp.detectInfo.push_back(boxInfoTmp);
                }
                imageDetectInfoTmp.imageNum=imgNum;
                output_vec[imgNum]=imageDetectInfoTmp;

                //outputsInfo.push_back(imageDetectInfoTmp);

            }
        }
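
        As a usage sketch only (not part of my project code), this is one way the filled output_vec could be consumed afterwards, e.g. drawing the boxes back onto the original cv::Mat batch with OpenCV. The drawDetections helper and the images vector are made up for illustration:

        #include <algorithm>
        #include <string>
        #include <vector>
        #include <opencv2/opencv.hpp>

        // Illustration only: assumes output_vec was filled by unmold_detections() above
        // and images holds the original cv::Mat batch in the same order.
        void drawDetections(const std::vector<imageDetectInfo> &output_vec,
                            std::vector<cv::Mat> &images)
        {
            for (const auto &imgInfo : output_vec) {
                if (imgInfo.imageNum < 0) continue;            // slot was never filled
                cv::Mat &img = images[imgInfo.imageNum];
                for (const auto &box : imgInfo.detectInfo) {
                    // boxInfo stores (y1, x1, y2, x2); cv::Rect wants (x, y, width, height)
                    cv::Rect rect(box.x1, box.y1, box.x2 - box.x1, box.y2 - box.y1);
                    cv::rectangle(img, rect, cv::Scalar(0, 255, 0), 2);
                    cv::putText(img, std::to_string(box.classId) + ":" + std::to_string(box.scores),
                                cv::Point(box.x1, std::max(box.y1 - 5, 0)),
                                cv::FONT_HERSHEY_SIMPLEX, 0.5, cv::Scalar(0, 255, 0), 1);
                }
            }
        }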

        That completes all of the steps. Here is a sample of the detection results (the UI was built with Qt). The model itself is not fully trained yet; mainly the data augmentation and the network parameters still need polishing.

        With that, the whole pipeline is done.
If you want to download the source code, use this link:
If you don't have download credits, you can contact me on WeChat (anuntilforever1314); please note your industry + name when adding me, so we can discuss and learn from each other.

I also have datasets for vehicles, license plates, reflective vests, safety helmets and so on; the link is below for anyone interested.

Link: https://pan.baidu.com/s/1mG7X71rngtWqP2tsfFm26A  Extraction code: 5555

The code is quite rough and I did not read the papers very carefully, so if anyone spots a problem, please do point it out.

Summary: Mask R-CNN is a few years old and many new detection and segmentation papers have appeared recently, but in practice it still performs well across several datasets. I think one reason is its dense anchors. The future, though, is probably anchor-free, so anchor-free methods are the next thing I will look into.

Next up: convert an SSD PyTorch model to TorchScript and call it from C++ with libtorch.

GitHub - CasonTsai/MaskRcnn_tensorflow_cpp_inference: inference mask_rcnn model with tensorflow c++ api

Finally, here is the qt .pro file:

#-------------------------------------------------
#
# Project created by QtCreator 2018-12-18T13:01:09
#
#-------------------------------------------------

QT       += core gui

greaterThan(QT_MAJOR_VERSION, 4): QT += widgets

TARGET = codeShow
TEMPLATE = app

qtHaveModule(opengl): QT += opengl

# The following define makes your compiler emit warnings if you use
# any feature of Qt which has been marked as deprecated (the exact warnings
# depend on your compiler). Please consult the documentation of the
# deprecated API in order to know how to port your code away from it.
#DEFINES += QT_DEPRECATED_WARNINGS

# You can also make your code fail to compile if you use deprecated APIs.
# In order to do so, uncomment the following line.
# You can also select to disable deprecated APIs only up to a certain version of Qt.
#DEFINES += QT_DISABLE_DEPRECATED_BEFORE=0x060000    # disables all the APIs deprecated before Qt 6.0.0
DEFINES += COMPILER_MSVC NOMINMAX QT_DEPRECATED_WARNINGS
CONFIG += c++11 thread

SOURCES += \
        main.cpp \
        mainwindow.cpp \
        detectbatch.cpp



HEADERS += \
        mainwindow.h \
        data_format.h \
        detectbatch.h


FORMS += \
        mainwindow.ui


# Default rules for deployment.
#qnx: target.path = /tmp/$${TARGET}/bin
#else: unix:!android: target.path = /opt/$${TARGET}/bin
#!isEmpty(target.path): INSTALLS += target


##cuda
#CUDA_DIR = "E:\thirdParty_lib\cuda\install"                # Path to cuda toolkit install
#SYSTEM_NAME = x64                 # Depending on your system either 'Win32', 'x64', or 'Win64'
#SYSTEM_TYPE = 64                    # '32' or '64', depending on your system
#CUDA_ARCH = compute_61                 # Type of CUDA architecture
#CUDA_CODE = sm_61
#NVCC_OPTIONS = --use_fast_math
## include paths
#INCLUDEPATH += "$$CUDA_DIR/include" \
#"D:\software\cuda_install\common\inc"
## library directories
#QMAKE_LIBDIR += "$$CUDA_DIR/lib/x64"
## The following makes sure all path names (which often include spaces) are put between quotation marks
#CUDA_INC = $$join(INCLUDEPATH,'" -I"','-I"','"')
## Add the necessary libraries
#CUDA_LIB_NAMES += \
#cuda \
#cudart \
#MSVCRT
##CUDA_LIB_NAMES += \
##cublas \
##cublas_device \
##cuda \
##cudadevrt \
##cudart \
##cudart_static \
##cufft \
##cufftw \
##curand \
##cusolver \
##cusparse \
##nppc \
##nppial \
##nppicc \
##nppicom \
##nppidei \
##nppif \
##nppig \
##nppim \
##nppist \
##nppisu \
##nppitc \
##npps \
##nvblas \
##nvcuvid \
##nvgraph \
##nvml \
##nvrtc \
##OpenCL \
##kernel32 \
##user32 \
##gdi32 \
##winspool \
##comdlg32 \
##advapi32 \
##shell32 \
##ole32 \
##oleaut32 \
##uuid \
##odbc32 \
##odbccp32 \
##ucrt \
##MSVCRT
#for(lib, CUDA_LIB_NAMES) {
#    CUDA_LIBS += $$lib.lib
#}
#for(lib, CUDA_LIB_NAMES) {
#    NVCC_LIBS += -l$$lib
#}
#LIBS += $$NVCC_LIBS
## The following library conflicts with something in Cuda
#QMAKE_LFLAGS_RELEASE = /NODEFAULTLIB:msvcrt.lib
#QMAKE_LFLAGS_DEBUG   = /NODEFAULTLIB:msvcrtd.lib
## MSVCRT link option (static or dynamic, it must be the same with your Qt SDK link option)
#MSVCRT_LINK_FLAG_DEBUG   = "/MDd"
#MSVCRT_LINK_FLAG_RELEASE = "/MD"
##MSVCRT_LINK_FLAG_DEBUG   = "/MTd"
##MSVCRT_LINK_FLAG_RELEASE = "/MT"
## Configuration of the Cuda compiler
#CONFIG(debug, debug|release) {
#    # Debug mode
#    DESTDIR = debug
#    OBJECTS_DIR = debug/obj
#    CUDA_OBJECTS_DIR = debug/cuda
#    cuda_d.input = CUDA_SOURCES
#    cuda_d.output = $$CUDA_OBJECTS_DIR/${QMAKE_FILE_BASE}_cuda.o
#    cuda_d.commands = $$CUDA_DIR/bin/nvcc.exe -D_DEBUG $$NVCC_OPTIONS $$CUDA_INC $$LIBS \
#                      --machine $$SYSTEM_TYPE -arch=$$CUDA_ARCH -code=$$CUDA_CODE \
#                      --compile -cudart static -g -DWIN32 -D_MBCS \
#                      -Xcompiler "/wd4819,/EHsc,/W3,/nologo,/Od,/Zi,/RTC1" \
#                      -Xcompiler $$MSVCRT_LINK_FLAG_DEBUG \
#                      -c -o ${QMAKE_FILE_OUT} ${QMAKE_FILE_NAME}
#    cuda_d.dependency_type = TYPE_C
#    QMAKE_EXTRA_COMPILERS += cuda_d
#}
#else {
#    # Release mode
#    DESTDIR = release
#    OBJECTS_DIR = release/obj
#    CUDA_OBJECTS_DIR = release/cuda
#    cuda.input = CUDA_SOURCES
#    cuda.output = $$CUDA_OBJECTS_DIR/${QMAKE_FILE_BASE}_cuda.o
#    cuda.commands = $$CUDA_DIR/bin/nvcc.exe $$NVCC_OPTIONS $$CUDA_INC $$LIBS \
#                    --machine $$SYSTEM_TYPE -arch=$$CUDA_ARCH -code=$$CUDA_CODE \
#                    --compile -cudart static -D_MBCS \
#                    -Xcompiler "/wd4819,/EHsc,/W3,/nologo,/O2,/Zi" \
#                    -Xcompiler $$MSVCRT_LINK_FLAG_RELEASE \
#                    -c -o ${QMAKE_FILE_OUT} ${QMAKE_FILE_NAME}
#    cuda.dependency_type = TYPE_C
#    QMAKE_EXTRA_COMPILERS += cuda
#}


LIBS += -LE:\Maidipu\code\tensorflow_1_8_gpu\lib -ltensorflow




INCLUDEPATH +=D:\Code-software\opencv_3_3_0\build\install\include \
              D:\Code-software\opencv_3_3_0\build\install\include\opencv2\
              D:\Code-software\opencv_3_3_0\build\install\include\opencv \
              E:\Maidipu\code\tensorflow_1_8_gpu\include


LIBS += -LD:\Code-software\opencv_3_3_0\build\install\x64\vc12\lib -lopencv_core330 -lopencv_imgproc330 \
        -lopencv_imgcodecs330 -lopencv_highgui330
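
One note: the C++ code above also uses Eigen3 directly (Eigen::MatrixXf etc.). Eigen is header-only, so if its headers are not already picked up through the tensorflow include tree, an extra include path along these lines is needed; the path below is just a placeholder for wherever you unpacked Eigen:

# Hypothetical Eigen3 location; Eigen is header-only, so no LIBS entry is needed.
INCLUDEPATH += D:\Code-software\eigen3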
