之前写过一篇文章将Yolo-v2检测到的目标单独保存成图像,其实内容差不多,只是最近总有人问我对于YOLO-V3的怎么改,下面就详细的说一下,在YOLO-V3的源码中修改具有保存子图像的方法。
打开darknet_no_gpu.sln,可以看到很多的代码,其实对我们有用的只有detector.c,image.c,image.h这三个文件。
下面就具体介绍一下怎么改。
在YOLO的源码中,画框的函数通常命名为draw_detections之类,后面可能带不同的后缀用来区分版本:YOLO-V2的源码中用的是draw_detections函数,而YOLO-V3中调用的是draw_detections_v3函数,它在draw_detections的基础上做了优化,优化了什么我等会再说。先去看看draw_detections_v3函数
/*
 * Draw detection results onto `im` (boxes, class labels, optional masks)
 * and additionally save every detected region as its own image file via
 * save_cut_image() ("added by fyj" feature).
 *
 * im         frame to draw on (darknet image, normalized floats)
 * dets/num   raw detections and their count
 * thresh     probability threshold for keeping a detection
 * names      class-name strings, indexed by class id
 * alphabet   glyph atlas for label rendering; NULL disables labels
 * classes    number of classes
 * ext_output non-zero: also print pixel box coordinates to stdout
 */
void draw_detections_v3(image im, detection *dets, int num, float thresh, char **names, image **alphabet, int classes, int ext_output)
{
    int selected_detections_num;
    detection_with_class* selected_detections = get_actual_detections(dets, num, thresh, &selected_detections_num);
    /* added by fyj: keep a clean copy of the frame for sub-image cropping.
     * Free the previous copy first, otherwise every call leaks one frame
     * (free_image on a zeroed global is a harmless free(NULL)). */
    free_image(m_img);
    m_img = copy_image(im);
    // text output: sort detections left-to-right and print name + confidence
    qsort(selected_detections, selected_detections_num, sizeof(*selected_detections), compare_by_lefts);
    int i;
    for (i = 0; i < selected_detections_num; ++i) {
        const int best_class = selected_detections[i].best_class;
        printf("%s: %.0f%%", names[best_class], selected_detections[i].det.prob[best_class] * 100);
        if (ext_output)
            printf("\t(left_x: %4.0f top_y: %4.0f width: %4.0f height: %4.0f)\n",
                (selected_detections[i].det.bbox.x - selected_detections[i].det.bbox.w / 2)*im.w,
                (selected_detections[i].det.bbox.y - selected_detections[i].det.bbox.h / 2)*im.h,
                selected_detections[i].det.bbox.w*im.w, selected_detections[i].det.bbox.h*im.h);
        else
            printf("\n");
        // also report any other class that cleared the threshold
        int j;
        for (j = 0; j < classes; ++j) {
            if (selected_detections[i].det.prob[j] > thresh && j != best_class) {
                printf("%s: %.0f%%\n", names[j], selected_detections[i].det.prob[j] * 100);
            }
        }
    }
    // image output: sort by confidence, then draw boxes / labels / masks
    qsort(selected_detections, selected_detections_num, sizeof(*selected_detections), compare_by_probs);
    for (i = 0; i < selected_detections_num; ++i) {
        int width = im.h * .006;   // box line width scales with frame height
        if (width < 1)
            width = 1;
        // deterministic pseudo-random color per class
        int offset = selected_detections[i].best_class * 123457 % classes;
        float red = get_color(2, offset, classes);
        float green = get_color(1, offset, classes);
        float blue = get_color(0, offset, classes);
        float rgb[3];
        rgb[0] = red;
        rgb[1] = green;
        rgb[2] = blue;
        box b = selected_detections[i].det.bbox;
        // convert normalized center/size box to pixel corner coordinates
        int left = (b.x - b.w / 2.)*im.w;
        int right = (b.x + b.w / 2.)*im.w;
        int top = (b.y - b.h / 2.)*im.h;
        int bot = (b.y + b.h / 2.)*im.h;
        // clamp to the frame BEFORE cropping/drawing; a detection touching
        // the border would otherwise produce an out-of-bounds rectangle
        if (left < 0) left = 0;
        if (right > im.w - 1) right = im.w - 1;
        if (top < 0) top = 0;
        if (bot > im.h - 1) bot = im.h - 1;
        /* added by fyj: save the (clamped) detection as its own image.
         * This was originally done with the unclamped coordinates, which
         * crashed cvGetSubRect when a box extended past the frame edge. */
        pre_x = left;
        pre_y = top;
        pre_h = bot - top;
        pre_w = right - left;
        save_cut_image(pre_x, pre_y, pre_h, pre_w, i);
        draw_box_width(im, left, top, right, bot, width, red, green, blue);
        if (alphabet) {
            // build "best, other1, other2" label string for this detection
            char labelstr[4096] = { 0 };
            strcat(labelstr, names[selected_detections[i].best_class]);
            int j;
            for (j = 0; j < classes; ++j) {
                if (selected_detections[i].det.prob[j] > thresh && j != selected_detections[i].best_class) {
                    strcat(labelstr, ", ");
                    strcat(labelstr, names[j]);
                }
            }
            image label = get_label_v3(alphabet, labelstr, (im.h*.03));
            draw_label(im, top + width, left, label, rgb);   // darknet draw_label takes (row, col)
            free_image(label);
        }
        if (selected_detections[i].det.mask) {
            // instance mask: upscale the 14x14 prototype to box size,
            // binarize at 0.5 and blend it into the frame
            image mask = float_to_image(14, 14, 1, selected_detections[i].det.mask);
            image resized_mask = resize_image(mask, b.w*im.w, b.h*im.h);
            image tmask = threshold_image(resized_mask, .5);
            embed_image(tmask, im, left, top);
            free_image(mask);
            free_image(resized_mask);
            free_image(tmask);
        }
    }
    free(selected_detections);
}
这个函数的输入有image im, detection *dets, int num, float thresh, char **names, image **alphabet, int classes, int ext_output这么多,added by fyj xxx 就是我自己加的。
opencv保存子图像需要x,y,h,w这几个参数,因为用的c语言版本的opencv,具体的用法可以参考C语言与C++版本的opencv实现截取图像中的一部分显示这篇文章。
这四个参数明显还不够,还需要一个能区分不同子图像的num,用来保存。这个num可以直接调用for循环里的i,简单吧。然后就是我自己加的save_cut_image函数了。
void save_cut_image(int px, int py, int ph, int pw, int no)
{
image copy = copy_image(m_img);
if (m_img.c == 3) rgbgr_image(copy);
int x, y, k;
char buff[256];
//F://darknet-v3//darknet-master//build//darknet//x64//results//%d.jpg
sprintf(buff, "F://darknet-v3//darknet-master//build//darknet//x64//results//%d.jpg", no);
IplImage *disp = cvCreateImage(cvSize(m_img.w, m_img.h), IPL_DEPTH_8U, m_img.c);
int step = disp->widthStep;
for (y = 0; y < m_img.h; ++y) {
for (x = 0; x < m_img.w; ++x) {
for (k = 0; k < m_img.c; ++k) {
disp->imageData[y*step + x*m_img.c + k] = (unsigned char)(get_pixel(copy, x, y, k) * 255);
}
}
}
CvMat *pMat = cvCreateMatHeader(m_img.w, m_img.h, IPL_DEPTH_8U);
//char rect_name[256];
//sprintf(rect_name, "%d_rect", no);
CvRect rect = cvRect(px, py, pw, ph);
cvGetSubRect(disp, pMat, rect);
IplImage *pSubImg = cvCreateImage(cvSize(pw, ph), IPL_DEPTH_8U, m_img.c);
cvGetImage(pMat, pSubImg);
//printf("x=%d,y=%d,h=%d,w=%d\n", px, py, ph, pw);
cvSaveImage(buff, pSubImg, 0);
//cvReleaseImage(&disp);
//cvReleaseImage(&pMat);
//cvReleaseImage(&rect);
//memset(&rect, 0, sizeof(rect));
//cvReleaseImage(&pSubImg);
//free(&rect);
free_image(copy);
}
请注意这里我用的是我自己的绝对路径,你们可以改成相对路径"results//%d.jpg",如下:
void save_cut_image(int px, int py, int ph, int pw, int no)
{
image copy = copy_image(m_img);
if (m_img.c == 3) rgbgr_image(copy);
int x, y, k;
char buff[256];
//F://darknet-v3//darknet-master//build//darknet//x64//results//%d.jpg
sprintf(buff, "results//%d.jpg", no);
IplImage *disp = cvCreateImage(cvSize(m_img.w, m_img.h), IPL_DEPTH_8U, m_img.c);
int step = disp->widthStep;
for (y = 0; y < m_img.h; ++y) {
for (x = 0; x < m_img.w; ++x) {
for (k = 0; k < m_img.c; ++k) {
disp->imageData[y*step + x*m_img.c + k] = (unsigned char)(get_pixel(copy, x, y, k) * 255);
}
}
}
CvMat *pMat = cvCreateMatHeader(m_img.w, m_img.h, IPL_DEPTH_8U);
//char rect_name[256];
//sprintf(rect_name, "%d_rect", no);
CvRect rect = cvRect(px, py, pw, ph);
cvGetSubRect(disp, pMat, rect);
IplImage *pSubImg = cvCreateImage(cvSize(pw, ph), IPL_DEPTH_8U, m_img.c);
cvGetImage(pMat, pSubImg);
//printf("x=%d,y=%d,h=%d,w=%d\n", px, py, ph, pw);
cvSaveImage(buff, pSubImg, 0);
//cvReleaseImage(&disp);
//cvReleaseImage(&pMat);
//cvReleaseImage(&rect);
//memset(&rect, 0, sizeof(rect));
//cvReleaseImage(&pSubImg);
//free(&rect);
free_image(copy);
}
其实都可以的。
还有就是在image.c中声明的全局变量和静态变量
image m_img;
int pre_x = 0;
int pre_y = 0;
int pre_h = 0;
int pre_w = 0;
(注意:文件作用域的变量必须带类型,直接写 pre_x = 0; 不是合法的C。)pre_x、pre_y、pre_h、pre_w 还需要在image.h中声明——头文件里应当用 extern 声明、把定义留在image.c中;如果在头文件里写 static,每个包含该头文件的源文件都会得到一份互相独立的副本:
extern int pre_x, pre_y, pre_h, pre_w;
然后再说一下YOLO-V3版本的draw_detections_v3函数,最大的优点就是,保存的图像名称是有序的1.jpg,2.jpg....
而在YOLO-V2版本中保存的子图像文件名是无序的数字。代码我将会上传到github上(可以的话给个星星⭐)。
改完以后,点击生成---生成解决方案
然后就会看到生成的darknet_no_gpu.exe,在x64文件夹里,shift+右键,打开Windows PowerShell输入命令
darknet_no_gpu.exe detector test data/coco.data cfg/yolov3.cfg yolov3.weights -i 0 -thresh 0.25 dog.jpg
可以看到结果
训练是正常的,通过命令
darknet_no_gpu.exe detector train data/voc.data cfg/yolov3-voc.cfg
或者使用预训练好的权重文件
darknet_no_gpu.exe detector train data/voc.data cfg/yolov3-voc.cfg darknet53.conv.74
训练的细节在这篇文章里详细说YOLO-V3训练中会遇到的问题 。