OpenCV DNN module: image classification with the GoogLeNet Caffe model

年纪青青

Category: opencv. Tags: computer vision, opencv

Article link: https://blog.csdn.net/z961968549/article/details/104234712

Principle

Please look up the theory yourself; it is not my strong suit.

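Input data for each network model

For each model you can look up the binary model file name, the network description file name, the mean-subtraction (centering) parameters, the input sample size, the label file name, the RGB channel order, and the typical use case.
Link: https://github.com/opencv/opencv/blob/master/samples/dnn/models.yml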

################################################################################
# Object detection models.
################################################################################

# OpenCV's face detection network
opencv_fd:
  model: "opencv_face_detector.caffemodel"
  config: "opencv_face_detector.prototxt"
  mean: [104, 177, 123]
  scale: 1.0
  width: 300
  height: 300
  rgb: false
  sample: "object_detection"

# YOLO object detection family from Darknet (https://pjreddie.com/darknet/yolo/)
# Might be used for all YOLOv2, TinyYolov2 and YOLOv3
yolo:
  model: "yolov3.weights"
  config: "yolov3.cfg"
  mean: [0, 0, 0]
  scale: 0.00392
  width: 416
  height: 416
  rgb: true
  classes: "object_detection_classes_yolov3.txt"
  sample: "object_detection"

tiny-yolo-voc:
  model: "tiny-yolo-voc.weights"
  config: "tiny-yolo-voc.cfg"
  mean: [0, 0, 0]
  scale: 0.00392
  width: 416
  height: 416
  rgb: true
  classes: "object_detection_classes_pascal_voc.txt"
  sample: "object_detection"

# Caffe implementation of SSD model from https://github.com/chuanqi305/MobileNet-SSD
ssd_caffe:
  model: "MobileNetSSD_deploy.caffemodel"
  config: "MobileNetSSD_deploy.prototxt"
  mean: [127.5, 127.5, 127.5]
  scale: 0.007843
  width: 300
  height: 300
  rgb: false
  classes: "object_detection_classes_pascal_voc.txt"
  sample: "object_detection"

# TensorFlow implementation of SSD model from https://github.com/tensorflow/models/tree/master/research/object_detection
ssd_tf:
  model: "ssd_mobilenet_v1_coco_2017_11_17.pb"
  config: "ssd_mobilenet_v1_coco_2017_11_17.pbtxt"
  mean: [0, 0, 0]
  scale: 1.0
  width: 300
  height: 300
  rgb: true
  classes: "object_detection_classes_coco.txt"
  sample: "object_detection"

# TensorFlow implementation of Faster-RCNN model from https://github.com/tensorflow/models/tree/master/research/object_detection
faster_rcnn_tf:
  model: "faster_rcnn_inception_v2_coco_2018_01_28.pb"
  config: "faster_rcnn_inception_v2_coco_2018_01_28.pbtxt"
  mean: [0, 0, 0]
  scale: 1.0
  width: 800
  height: 600
  rgb: true
  sample: "object_detection"

################################################################################
# Image classification models.
################################################################################

# SqueezeNet v1.1 from https://github.com/DeepScale/SqueezeNet
squeezenet:
  model: "squeezenet_v1.1.caffemodel"
  config: "squeezenet_v1.1.prototxt"
  mean: [0, 0, 0]
  scale: 1.0
  width: 227
  height: 227
  rgb: false
  classes: "classification_classes_ILSVRC2012.txt"
  sample: "classification"

# Googlenet from https://github.com/BVLC/caffe/tree/master/models/bvlc_googlenet
googlenet:
  model: "bvlc_googlenet.caffemodel"
  config: "bvlc_googlenet.prototxt"
  mean: [104, 117, 123]
  scale: 1.0
  width: 224
  height: 224
  rgb: false
  classes: "classification_classes_ILSVRC2012.txt"
  sample: "classification"

################################################################################
# Semantic segmentation models.
################################################################################

# ENet road scene segmentation network from https://github.com/e-lab/ENet-training
# Works fine for different input sizes.
enet:
  model: "Enet-model-best.net"
  mean: [0, 0, 0]
  scale: 0.00392
  width: 512
  height: 256
  rgb: true
  classes: "enet-classes.txt"
  sample: "segmentation"

fcn8s:
  model: "fcn8s-heavy-pascal.caffemodel"
  config: "fcn8s-heavy-pascal.prototxt"
  mean: [0, 0, 0]
  scale: 1.0
  width: 500
  height: 500
  rgb: false
  sample: "segmentation"

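Network input and output

Input layer: [NxCxHxW]; channel order: RGB or BGR
Output layer (the softmax layer, named prob): [Nx1x1x1000]. For a single image this is one row of 1000 columns; each column holds the confidence for one of the 1000 classes, and the class with the highest confidence is the most likely one.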
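
To make the NxCxHxW layout concrete, here is a small sketch of how a 4-D index maps to a position in a flat, densely packed buffer. It assumes no padding between rows or channels. The function name nchwOffset is hypothetical and used only for illustration.

```cpp
#include <cstddef>

// Hypothetical helper: flat offset of element (n, c, h, w) in a dense
// NCHW buffer with C channels, H rows and W columns per image.
std::size_t nchwOffset(std::size_t n, std::size_t c, std::size_t h, std::size_t w,
                       std::size_t C, std::size_t H, std::size_t W)
{
    return ((n * C + c) * H + h) * W + w;
}
```

For one 3x224x224 GoogLeNet input, the second channel plane starts at nchwOffset(0, 1, 0, 0, 3, 224, 224), which is 224*224 elements into the buffer: each channel is stored as a whole plane rather than interleaved per pixel.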

Related API functions

Creating and loading the neural network (Net)

Net cv::dnn::readNetFromCaffe ( const String &prototxt,
		                        const String &caffeModel = String())

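Parameters

prototxt: path to the .prototxt file containing the text description of the network architecture.
caffeModel: path to the .caffemodel file containing the learned network.
Return value: a Net object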

Converting an ordinary image into the blob format used by the network

Mat cv::dnn::blobFromImage 	( 	InputArray  	image,
		                        double  	scalefactor = 1.0,
		                        const Size &  	size = Size(),
		                        const Scalar &  	mean = Scalar(),
		                        bool  	swapRB = false,
		                        bool  	crop = false,
		                        int  	ddepth = CV_32F 
	)

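Parameters

image: input image (with 1, 3 or 4 channels).
scalefactor: multiplier for the image values.
size: spatial size of the output image.
mean: scalar with the mean values that are subtracted from the channels. If the image has BGR ordering and swapRB is true, the values should be given in (mean-R, mean-G, mean-B) order.
swapRB: flag indicating that the first and last channels of a 3-channel image need to be swapped.
crop: flag indicating whether the image is cropped after resizing.
ddepth: depth of the output blob. Choose CV_32F or CV_8U.
Return value: a 4-dimensional Mat in NCHW order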
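
The parameters above can be pictured with a plain C++ sketch of the per-pixel work. This is a simplification and not OpenCV's implementation: it assumes a 3-channel 8-bit interleaved image, skips resizing and cropping, and the name toBlob is hypothetical. It follows my understanding that the mean is given in the channel order of the output blob.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch: interleaved HWC bytes -> planar CHW floats.
// Each output value is (input - mean) * scalefactor; swapRB reverses
// the channel order (BGR <-> RGB). No resize or crop is performed.
std::vector<float> toBlob(const std::vector<std::uint8_t>& hwc,
                          int height, int width, double scalefactor,
                          const std::array<double, 3>& mean, bool swapRB)
{
    const int channels = 3;
    std::vector<float> blob(static_cast<std::size_t>(channels) * height * width);
    for (int c = 0; c < channels; ++c) {
        // Source channel that feeds output channel c.
        const int srcC = swapRB ? (channels - 1 - c) : c;
        for (int y = 0; y < height; ++y) {
            for (int x = 0; x < width; ++x) {
                const std::size_t src =
                    (static_cast<std::size_t>(y) * width + x) * channels + srcC;
                const std::size_t dst =
                    (static_cast<std::size_t>(c) * height + y) * width + x;
                blob[dst] = static_cast<float>((hwc[src] - mean[c]) * scalefactor);
            }
        }
    }
    return blob;
}
```

For a single BGR pixel (110, 120, 130) with mean (104, 117, 123) and scalefactor 1.0, the sketch yields (6, 3, 7) without swapping. With swapRB set, the first output channel is taken from the last input channel, so the same mean values are applied to different colors, which is why the mean order must match the swap setting.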

Setting the network input layer and its data

void cv::dnn::Net::setInput 	( 	InputArray  	blob,
		                            const String &  	name = "",
		                            double  	scalefactor = 1.0,
		                            const Scalar &  	mean = Scalar() 
	)

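Parameters

blob: the new blob used as input data. It should have CV_32F or CV_8U depth.
name: name of the input layer.
scalefactor: optional normalization scale.
mean: optional mean values to subtract.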

Running the forward pass

Mat cv::dnn::Net::forward 	(const String &outputName = String())

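Parameters

outputName: name of the layer whose output is needed.
Return value: the blob of the first output of the specified layer. By default, the forward pass runs through the whole network.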

Finding the minimum and maximum values

void cv::minMaxLoc 	( 	InputArray  	src,
		double *  	minVal,
		double *  	maxVal = 0,
		Point *  	minLoc = 0,
		Point *  	maxLoc = 0,
		InputArray  	mask = noArray() 
	)

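Parameters

src: input single-channel array.
minVal: pointer to the returned minimum value; use NULL if it is not needed.
maxVal: pointer to the returned maximum value; use NULL if it is not needed.
minLoc: pointer to the returned minimum location (in the 2D case); use NULL if it is not needed.
maxLoc: pointer to the returned maximum location (in the 2D case); use NULL if it is not needed.
mask: optional mask used to select a sub-array.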
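
For a single row of scores, what minMaxLoc is used for here amounts to an arg-max. A minimal standard-library sketch of the same idea follows. It is an illustration and not a replacement for the OpenCV call, the name argMax is hypothetical, and the input is assumed to be non-empty.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: returns {index, value} of the largest score.
// Assumes scores is not empty.
std::pair<std::size_t, float> argMax(const std::vector<float>& scores)
{
    const auto it = std::max_element(scores.begin(), scores.end());
    return { static_cast<std::size_t>(it - scores.begin()), *it };
}
```

The returned index plays the role of maxLoc.x on a 1-row Mat, and the returned value plays the role of maxVal.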

Code

#include <opencv2/opencv.hpp>
#include <opencv2/dnn.hpp>
#include <cstdio>
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>

using namespace std;
using namespace cv;
using namespace cv::dnn;

#define PIC_PATH "/work/opencv_pic/"
#define PIC_NAME "airplain.jpeg"

string model_bin_file = "/work/opencv_dnn/bvlc_googlenet.caffemodel";
string model_txt_file = "/work/opencv_dnn/bvlc_googlenet.prototxt";
string labels_txt_file = "/work/opencv_dnn/synset_words.txt";
vector<string> readLabels(void);
int main(void)
{
    string pic = string(PIC_PATH)+string(PIC_NAME);
    Mat src;
    src = imread(pic);
    if(src.empty())
    {
        printf("pic read err\n");
        return -1;
    }

    namedWindow("input image",WINDOW_AUTOSIZE);

    vector<string> labels = readLabels();

    // Create and load the neural network
    Net net = readNetFromCaffe(model_txt_file,model_bin_file);
    if(net.empty())
    {
        printf("read caffe model data err\n");
        return -1;
    }

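    // Convert the image into the blob GoogLeNet expects. imread returns BGR and
    // models.yml lists "rgb: false" for googlenet, so the channels are not swapped.
    Mat inputBlob = blobFromImage(src,1.0,Size(224,224),Scalar(104,117,123),false);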
    Mat prob;
    
    // Set the blob as the network input, feeding it to the data layer
    net.setInput(inputBlob,"data");

    // Get the output of the prob layer
    prob = net.forward("prob");

    // Reshape the output into a single-channel, single-row Mat
    Mat probMat = prob.reshape(1,1);
    Point classNumber;   // location of the maximum
    double classProb;    // probability at the maximum

    // Find the maximum
    minMaxLoc(probMat,NULL,&classProb,NULL,&classNumber);
    int classidx = classNumber.x;   // x coordinate of the maximum, used as the index into labels

    cout<<"name:"<<labels.at(classidx)<<endl;
    cout<<"possible:"<< classProb*100<<"%"<<endl;

    putText(src,labels.at(classidx),Point(20,50),FONT_HERSHEY_PLAIN,4.0,
            Scalar(0,0,255),2,8);
    imshow("input image",src);

    waitKey(0);
    destroyAllWindows();
    return 0;
}

// Read the name of each label
vector<string> readLabels(void)
{
    vector<string> classNames;
    ifstream fp(labels_txt_file);
    if(!fp.is_open())
    {
        printf("could not open file\n");
        exit(-1);
    }
    string name;
    // Read one line at a time; the loop ends when getline fails at end of file
    while(getline(fp,name))
    {
        if(name.length())
        {
            // The name starts one character after the first space
            classNames.push_back(name.substr(name.find(' ')+1));
        }
    }
    fp.close();
    return  classNames;
}
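
The label parsing in readLabels can be isolated into a small function that is easy to check on its own. The sketch below assumes each line of synset_words.txt has the form "id name, aliases", for example "n01440764 tench, Tinca tinca". The name labelFromLine is hypothetical, and the carriage-return handling is an addition that the code above does not have.

```cpp
#include <string>

// Hypothetical helper: extracts the readable class name from one line
// of synset_words.txt, e.g. "n01440764 tench, Tinca tinca".
std::string labelFromLine(std::string line)
{
    // Drop a trailing carriage return left by Windows line endings.
    if (!line.empty() && line.back() == '\r')
        line.pop_back();
    const std::string::size_type space = line.find(' ');
    // No space found: treat the whole line as the name.
    if (space == std::string::npos)
        return line;
    return line.substr(space + 1);
}
```

Calling labelFromLine("n01440764 tench, Tinca tinca") gives "tench, Tinca tinca", which is the text the program draws on the image.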

Result

[Image: screenshot of the result]
