darknet Source Code Analysis (Part 6)

Latest recommended article published on 2020-08-28 14:33:40

mazinkaiser1991

Category: darknet source code analysis. Tags: darknet

Article link: https://blog.csdn.net/u012927281/article/details/83589005

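Continuing the analysis of load_data_detection, we now step into fill_truth_detection. Its job is to read the annotation (label) file that belongs to an image and fill the truth array from it.

The first function it calls is find_replace.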

void find_replace(char *str, char *orig, char *rep, char *output)
{
    char buffer[4096] = {0};
    char *p;

    sprintf(buffer, "%s", str);
    if(!(p = strstr(buffer, orig))){  // Is 'orig' even in 'str'?
        sprintf(output, "%s", str);
        return;
    }

    *p = '\0';

    sprintf(output, "%s%s%s", buffer, rep, p+strlen(orig));
}

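This function first checks whether orig occurs in str. If it does not, str is copied to output unchanged and the function returns. If it does, the first occurrence of orig is replaced with rep and the result is written to output. Only the first occurrence is replaced, and the input is assumed to fit in the 4096-byte buffer.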
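To make the "first occurrence only" behaviour concrete, here is a minimal Python sketch of the same logic. It is only an illustration of what the C function above does, not darknet code; the Python function returns the result instead of writing into an output buffer.

```python
def find_replace(s: str, orig: str, rep: str) -> str:
    """Replace only the first occurrence of orig in s, like the C version."""
    pos = s.find(orig)           # plays the role of strstr()
    if pos == -1:                # orig not present: return s unchanged
        return s
    # text before the match + replacement + text after the match
    return s[:pos] + rep + s[pos + len(orig):]


# The second "images" (inside the file name) is left alone.
print(find_replace("data/images/train/images_001.jpg", "images", "labels"))
```

The file name keeps its "images" prefix because the search stops at the first match, which here is the directory component.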

    find_replace(path, "images", "labels", labelpath);
    find_replace(labelpath, "JPEGImages", "labels", labelpath);

    find_replace(labelpath, "raw", "labels", labelpath);

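From the code above we can see that images may be stored under a path containing images, JPEGImages, or raw, and that this part of the path is replaced with labels. In other words, the label files must be placed in a parallel labels directory. Note that the match is a plain substring match on the whole path, not a match on a directory name.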

    find_replace(labelpath, ".jpg", ".txt", labelpath);
    find_replace(labelpath, ".png", ".txt", labelpath);
    find_replace(labelpath, ".JPG", ".txt", labelpath);
    find_replace(labelpath, ".JPEG", ".txt", labelpath);

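The code above replaces the extension .jpg, .png, .JPG, or .JPEG with .txt. Only these four spellings are handled (there is no lowercase .jpeg or uppercase .PNG in this list), so for detection training the images are effectively expected to be JPEG or PNG files named with one of these extensions.

Once the label path has been obtained, the annotations are read.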
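The whole path derivation can be sketched in a few lines of Python. This is an illustration under the assumption that each call behaves like a first-occurrence substring replacement, as the find_replace source shows; label_path_for and the sample paths are made up for the example.

```python
def label_path_for(image_path: str) -> str:
    """Derive the label path by chaining first-occurrence replacements."""
    path = image_path
    # Directory component: images / JPEGImages / raw -> labels
    for orig in ("images", "JPEGImages", "raw"):
        path = path.replace(orig, "labels", 1)   # count=1: first match only
    # File extension: only these four spellings are rewritten
    for ext in (".jpg", ".png", ".JPG", ".JPEG"):
        path = path.replace(ext, ".txt", 1)
    return path


print(label_path_for("data/voc/JPEGImages/000001.jpg"))
print(label_path_for("data/coco/images/train2014/a.png"))
# Pitfall: "drawings" contains the substring "raw", so it gets rewritten too.
print(label_path_for("data/drawings/images/a.jpg"))
```

The last call shows why directory names that happen to contain images or raw as a substring are best avoided: the replacement is purely textual.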

    box_label *boxes = read_boxes(labelpath, &count);
    randomize_boxes(boxes, count);
    correct_boxes(boxes, count, dx, dy, sx, sy, flip);

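read_boxes reads the annotation data, randomize_boxes randomly shuffles the order of the boxes, and correct_boxes adjusts the boxes to match the way the image itself was transformed (the dx, dy, sx, sy, and flip arguments).

The read_boxes function is as follows: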

box_label *read_boxes(char *filename, int *n)
{
    FILE *file = fopen(filename, "r");
    if(!file) file_error(filename);
    float x, y, h, w;
    int id;
    int count = 0;
    int size = 64;
    box_label *boxes = calloc(size, sizeof(box_label));
    while(fscanf(file, "%d %f %f %f %f", &id, &x, &y, &w, &h) == 5){
        if(count == size) {
            size = size * 2;
            boxes = realloc(boxes, size*sizeof(box_label));
        }
        boxes[count].id = id;
        boxes[count].x = x;
        boxes[count].y = y;
        boxes[count].h = h;
        boxes[count].w = w;
        boxes[count].left   = x - w/2;
        boxes[count].right  = x + w/2;
        boxes[count].top    = y - h/2;
        boxes[count].bottom = y + h/2;
        ++count;
    }
    fclose(file);
    *n = count;
    return boxes;
}

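From the way left, right, top, and bottom are computed, we can tell that x and y are the center of the object. Also note that x, y, w, and h are normalized values, that is, fractions of the image width and height. Each line of the label file has the form "id x y w h".

This can be verified from the script that generates the VOC labels. The tool can be downloaded from the YOLO website; see generate_annotation/voc_label.py.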
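As a quick illustration of the label format, the following Python sketch parses a single label line and derives the four edges the same way read_boxes does. The function name and the dictionary layout are chosen for the example only.

```python
def parse_label_line(line: str) -> dict:
    """Parse one 'id x y w h' line; coordinates are fractions of the image size."""
    fields = line.split()
    cls = int(fields[0])
    x, y, w, h = (float(v) for v in fields[1:5])
    return {
        "id": cls, "x": x, "y": y, "w": w, "h": h,
        # (x, y) is the box center, so each edge is half a side away from it
        "left": x - w / 2, "right": x + w / 2,
        "top": y - h / 2, "bottom": y + h / 2,
    }


box = parse_label_line("0 0.5 0.5 0.2 0.4")
print(box["left"], box["right"], box["top"], box["bottom"])
```

For a box centered in the image with width 0.2 and height 0.4, the edges come out at roughly 0.4 and 0.6 horizontally and 0.3 and 0.7 vertically.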

def convert(size, box):
    dw = 1./size[0]
    dh = 1./size[1]
    x = (box[0] + box[1])/2.0
    y = (box[2] + box[3])/2.0
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x*dw
    w = w*dw
    y = y*dh
    h = h*dh
    return (x,y,w,h)
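A small worked example may help here. Judging from the indexing in convert, size is (width, height) and box is (xmin, xmax, ymin, ymax) in pixels. The sketch below repeats the same arithmetic and adds a hypothetical inverse, to_pixels, which is not part of voc_label.py and exists only to show that the round trip recovers the original pixel box.

```python
def convert(size, box):
    # Same arithmetic as the function quoted above.
    dw = 1.0 / size[0]
    dh = 1.0 / size[1]
    x = (box[0] + box[1]) / 2.0 * dw   # normalized center x
    y = (box[2] + box[3]) / 2.0 * dh   # normalized center y
    w = (box[1] - box[0]) * dw         # normalized width
    h = (box[3] - box[2]) * dh         # normalized height
    return (x, y, w, h)


def to_pixels(size, norm):
    # Hypothetical inverse: normalized (x, y, w, h) -> (xmin, xmax, ymin, ymax).
    img_w, img_h = size
    x, y, w, h = norm
    return ((x - w / 2) * img_w, (x + w / 2) * img_w,
            (y - h / 2) * img_h, (y + h / 2) * img_h)


size = (640, 480)              # image width, height in pixels
box = (100, 300, 120, 360)     # xmin, xmax, ymin, ymax in pixels
norm = convert(size, box)
print(norm)                    # about (0.3125, 0.5, 0.3125, 0.5)
print(to_pixels(size, norm))   # about the original pixel box
```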

Back to the darknet code.

    if(count > num_boxes) count = num_boxes;
    float x,y,w,h;
    int id;
    int i;
    int sub = 0;

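If the number of annotations is greater than num_boxes, count is set to num_boxes, so only the first num_boxes boxes are kept and the rest are dropped. This is why randomize_boxes is called beforehand: because the order has been shuffled, the boxes that get dropped are a random subset rather than always the last ones in the file.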
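The shuffle-then-truncate idea can be sketched as follows. The body of randomize_boxes is not shown in this article, so random.shuffle stands in for it here as an assumption; shuffle_and_truncate is a made-up name.

```python
import random


def shuffle_and_truncate(boxes, num_boxes, rng=random):
    """Keep at most num_boxes boxes, chosen at random by shuffling first."""
    boxes = list(boxes)          # work on a copy
    rng.shuffle(boxes)           # stand-in for randomize_boxes (assumption)
    return boxes[:num_boxes]     # same effect as clamping count to num_boxes


rng = random.Random(0)           # fixed seed only to make the demo repeatable
kept = shuffle_and_truncate(range(100), 30, rng)
print(len(kept))
```

Without the shuffle, an image with more than num_boxes objects would always lose the same trailing annotations on every epoch.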

    for (i = 0; i < count; ++i) {
        x =  boxes[i].x;
        y =  boxes[i].y;
        w =  boxes[i].w;
        h =  boxes[i].h;
        id = boxes[i].id;

        if ((w < .001 || h < .001)) {
            ++sub;
            continue;
        }

        truth[(i-sub)*5+0] = x;
        truth[(i-sub)*5+1] = y;
        truth[(i-sub)*5+2] = w;
        truth[(i-sub)*5+3] = h;
        truth[(i-sub)*5+4] = id;
    }
    free(boxes);

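Finally, the box information is copied into truth, five floats per box in the order x, y, w, h, id. A box whose w or h is smaller than 0.001 is discarded. The sub counter records how many boxes have been skipped, so the index (i-sub) keeps the remaining boxes packed contiguously at the front of truth.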
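Here is a minimal Python sketch of the packing loop. It assumes that truth starts out zero-filled with room for 5 * num_boxes values, which is not shown in the excerpt above; pack_truth and the sample boxes are invented for the illustration.

```python
def pack_truth(boxes, num_boxes, min_size=0.001):
    """Pack (x, y, w, h, id) tuples contiguously, skipping degenerate boxes."""
    truth = [0.0] * (5 * num_boxes)      # assumption: zero-initialized buffer
    count = min(len(boxes), num_boxes)
    sub = 0                              # number of boxes skipped so far
    for i in range(count):
        x, y, w, h, cls = boxes[i]
        if w < min_size or h < min_size:
            sub += 1                     # skip, and shift later boxes forward
            continue
        base = (i - sub) * 5
        truth[base:base + 5] = [x, y, w, h, float(cls)]
    return truth


boxes = [
    (0.5, 0.5, 0.2, 0.4, 1),
    (0.1, 0.1, 0.0005, 0.3, 0),   # too narrow: dropped
    (0.7, 0.3, 0.1, 0.1, 2),
]
print(pack_truth(boxes, num_boxes=4))
```

The third box lands in slot 1 rather than slot 2, and the unused slots stay at zero.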
