darknet Source Code Analysis (Part 5)

mazinkaiser1991

Category: darknet source code analysis. Tags: darknet

Article link: https://blog.csdn.net/u012927281/article/details/83582771

Continuing the analysis of load_data_detection:

    int i;
    data d = {0};
    d.shallow = 0;

    d.X.rows = n;
    d.X.vals = calloc(d.X.rows, sizeof(float*));
    d.X.cols = h*w*3;

    d.y = make_matrix(n, 5*boxes);

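d holds the training images and their labels. d.X.rows is n, the number of images this loading thread handles, and d.X.cols is the number of floats per image. Note that cols is h×w×3 rather than h×w×channel. Does darknet only handle 3-channel images? On this path at least, the hard-coded 3 matches the loader: as far as I can tell, load_image_color always requests 3 channels.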

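d.y holds the labels. make_matrix is defined in matrix.c. n is the number of training images, and 5 is the number of values stored per box. Judging from fill_truth_detection, those five values are (x, y, w, h, class id), with the box given by its center and size relative to the image, rather than (label, x1, y1, x2, y2). boxes is the maximum number of ground-truth boxes kept per image; it appears to come from the layer's max_boxes setting (90 by default for the yolo layer, if I read the parser correctly), and is unrelated to the 9 anchors.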
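To make the layout concrete, here is a minimal sketch of a label matrix with one row per image and 5 floats per box slot. The names sketch_matrix, sketch_make_matrix, sketch_free_matrix and sketch_box_slot are hypothetical stand-ins, not darknet's own matrix API, and the value 90 in the usage note is only an example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a rows x cols float matrix. */
typedef struct {
    int rows, cols;
    float **vals;
} sketch_matrix;

/* Allocate a zero-filled rows x cols matrix, one calloc per row. */
sketch_matrix sketch_make_matrix(int rows, int cols)
{
    assert(rows > 0 && cols > 0);
    sketch_matrix m;
    m.rows = rows;
    m.cols = cols;
    m.vals = calloc(rows, sizeof(float *));
    for (int i = 0; i < rows; ++i) m.vals[i] = calloc(cols, sizeof(float));
    return m;
}

/* Release every row, then the row table. */
void sketch_free_matrix(sketch_matrix m)
{
    for (int i = 0; i < m.rows; ++i) free(m.vals[i]);
    free(m.vals);
}

/* Pointer to the 5 floats reserved for box b of image i. */
float *sketch_box_slot(sketch_matrix y, int i, int b)
{
    assert(i >= 0 && i < y.rows && (b + 1) * 5 <= y.cols);
    return y.vals[i] + 5 * b;
}
```

For example, sketch_make_matrix(4, 5 * 90) gives 4 rows of 450 floats, and sketch_box_slot(y, 2, 1) points at floats 5..9 of row 2. Unused slots stay zero.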

    for(i = 0; i < n; ++i){
        image orig = load_image_color(random_paths[i], 0, 0);
        image sized = make_image(w, h, orig.c);
        fill_image(sized, .5);

        float dw = jitter * orig.w;
        float dh = jitter * orig.h;

        float new_ar = (orig.w + rand_uniform(-dw, dw)) / (orig.h + rand_uniform(-dh, dh));
        //float scale = rand_uniform(.25, 2);
        float scale = 1;

        float nw, nh;

        if(new_ar < 1){
            nh = scale * h;
            nw = nh * new_ar;
        } else {
            nw = scale * w;
            nh = nw / new_ar;
        }

        float dx = rand_uniform(0, w - nw);
        float dy = rand_uniform(0, h - nh);

        place_image(orig, nw, nh, dx, dy, sized);

        random_distort_image(sized, hue, saturation, exposure);

        int flip = rand()%2;
        if(flip) flip_image(sized);
        d.X.vals[i] = sized.data;


        fill_truth_detection(random_paths[i], boxes, d.y.vals[i], classes, flip, -dx/w, -dy/h, nw/w, nh/h);

        free_image(orig);
    }

load_image_color is defined in image.c and loads the image from disk. The pixel values are already normalized to [0, 1] during loading:

    im.data[dst_index] = (float)data[src_index]/255.;
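The normalization line can be put in context with a small sketch. It assumes the decoded bytes are interleaved (channel changes fastest) and the float image is planar (one full plane per channel), which is my reading of the index arithmetic in image.c. sketch_bytes_to_planar_float is a hypothetical name.

```c
#include <assert.h>
#include <stdlib.h>

/* Convert an interleaved 8-bit image (h x w x c, channel fastest) into a
 * planar float buffer (c x h x w) with values scaled to [0, 1].
 * The caller owns the returned buffer and must free() it. */
float *sketch_bytes_to_planar_float(const unsigned char *data, int w, int h, int c)
{
    assert(data && w > 0 && h > 0 && c > 0);
    float *out = calloc((size_t)w * h * c, sizeof(float));
    if (!out) return NULL;
    for (int k = 0; k < c; ++k) {
        for (int j = 0; j < h; ++j) {
            for (int i = 0; i < w; ++i) {
                int dst_index = i + w * j + w * h * k;  /* planar: channel k, row j, column i */
                int src_index = k + c * i + c * w * j;  /* interleaved: pixel (i, j), channel k */
                out[dst_index] = (float)data[src_index] / 255.f;
            }
        }
    }
    return out;
}
```

For a 2×1 RGB image with bytes {255, 0, 51, 0, 255, 102}, the result holds the red plane {1, 0}, then the green plane {0, 1}, then the blue plane with values close to 0.2 and 0.4.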

        image sized = make_image(w, h, orig.c);
        fill_image(sized, .5);

make_image allocates an empty image of the network input size, and fill_image sets every value in it to 0.5 (mid-gray).
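As a rough illustration of these two calls, the sketch below allocates a planar float image and fills it with a constant. sketch_image, sketch_make_image and sketch_fill_image are hypothetical names used only here; darknet's own image struct and helpers may differ in detail.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical minimal image type: planar floats, c planes of h x w. */
typedef struct {
    int w, h, c;
    float *data;
} sketch_image;

/* Allocate a zeroed w x h x c image. */
sketch_image sketch_make_image(int w, int h, int c)
{
    assert(w > 0 && h > 0 && c > 0);
    sketch_image im = { w, h, c, NULL };
    im.data = calloc((size_t)w * h * c, sizeof(float));
    return im;
}

/* Set every value in the image to s. */
void sketch_fill_image(sketch_image im, float s)
{
    for (int i = 0; i < im.w * im.h * im.c; ++i) im.data[i] = s;
}
```

Filling with 0.5 means that any region not covered by the resized original later on shows up as neutral gray rather than black.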

        float dw = jitter * orig.w;
        float dh = jitter * orig.h;

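In yolov3-voc.cfg, width and height are both 416, so w and h are 416. Note that orig is loaded with load_image_color(path, 0, 0), which keeps the file's native size, so orig.w and orig.h are not tied to the cfg. To keep the numbers simple, assume the original image also happens to be 416×416. With jitter = 0.3, dw and dh are then both 0.3 × 416 = 124.8.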

        float new_ar = (orig.w + rand_uniform(-dw, dw)) / (orig.h + rand_uniform(-dh, dh));

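rand_uniform is defined in utils.c and returns a random number between min and max. To make the walkthrough concrete, suppose the first draw is 5 and the second is 10. Then new_ar = (416 + 5) / (416 + 10) = 421 / 426 ≈ 0.9883.

        float scale = 1;

        float nw, nh;

        if(new_ar < 1){
            nh = scale * h;
            nw = nh * new_ar;
        } else {
            nw = scale * w;
            nh = nw / new_ar;
        }

Here new_ar < 1, so nh = 416 and nw = 416 × 0.9883 ≈ 411.12.

        float dx = rand_uniform(0, w - nw);
        float dy = rand_uniform(0, h - nh);

w - nw ≈ 4.88 and h - nh = 0, so dx falls in [0, 4.88] and dy can only be 0. Suppose dx is 3 (any value in that range would do) and dy is 0.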
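As a quick check of the arithmetic in this walkthrough, the sketch below recomputes new_ar, nw and nh with the two random draws passed in explicitly. sketch_jitter and sketch_jitter_size are hypothetical names, and the 416×416 original is the same simplifying assumption as in the text.

```c
#include <assert.h>

/* Result of the aspect-ratio jitter step for one image. */
typedef struct {
    float new_ar;  /* jittered aspect ratio */
    float nw, nh;  /* size of the resized image inside the w x h canvas */
} sketch_jitter;

/* rw and rh stand in for the rand_uniform(-dw, dw) and rand_uniform(-dh, dh)
 * draws, passed explicitly so the arithmetic is reproducible.
 * scale is fixed at 1, as in the listing. */
sketch_jitter sketch_jitter_size(float orig_w, float orig_h, float rw, float rh,
                                 float w, float h)
{
    assert(orig_w + rw > 0 && orig_h + rh > 0);
    sketch_jitter r;
    float scale = 1;
    r.new_ar = (orig_w + rw) / (orig_h + rh);
    if (r.new_ar < 1) {
        r.nh = scale * h;
        r.nw = r.nh * r.new_ar;
    } else {
        r.nw = scale * w;
        r.nh = r.nw / r.new_ar;
    }
    return r;
}
```

Calling sketch_jitter_size(416, 416, 5, 10, 416, 416) evaluates (416 + 5) / (416 + 10) = 421 / 426, which is about 0.9883, so nh is 416 and nw is about 411.12. That leaves roughly 4.88 pixels of horizontal slack for dx and none for dy.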

        place_image(orig, nw, nh, dx, dy, sized);

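I have only skimmed place_image, but it appears to resample the whole original image to nw × nh with bilinear interpolation and paste it into sized at offset (dx, dy), rather than copying just a part of the original. Pixels of sized that are not covered keep the 0.5 fill.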
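The resize-and-paste idea can be sketched as follows. This is a simplified stand-in, not darknet's code: it uses nearest-neighbour sampling instead of bilinear interpolation to stay short, works on raw planar buffers, and sketch_place is a hypothetical name.

```c
#include <assert.h>

/* Paste src (planar, c x sh x sw) into canvas (planar, c x ch x cw), resized
 * to nw x nh and shifted by (dx, dy). Target pixels outside the canvas are
 * skipped. Nearest-neighbour sampling is used here for brevity. */
void sketch_place(const float *src, int sw, int sh, int c,
                  int nw, int nh, int dx, int dy,
                  float *canvas, int cw, int ch)
{
    assert(src && canvas && nw > 0 && nh > 0);
    for (int k = 0; k < c; ++k) {
        for (int y = 0; y < nh; ++y) {
            for (int x = 0; x < nw; ++x) {
                int sx = (int)(((float)x / nw) * sw);  /* source column */
                int sy = (int)(((float)y / nh) * sh);  /* source row */
                if (sx >= sw) sx = sw - 1;             /* guard against rounding */
                if (sy >= sh) sy = sh - 1;
                int tx = x + dx, ty = y + dy;          /* target position */
                if (tx < 0 || ty < 0 || tx >= cw || ty >= ch) continue;
                canvas[tx + cw * (ty + ch * k)] = src[sx + sw * (sy + sh * k)];
            }
        }
    }
}
```

Pasting a 2×2 source into a gray 4×4 canvas at offset (1, 1) overwrites only the four central pixels; the border keeps its 0.5 fill.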

        random_distort_image(sized, hue, saturation, exposure);

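This line applies a random color distortion to sized: the hue, saturation and exposure arguments bound how far each property may be perturbed.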
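How the three random factors might be drawn can be sketched like this. It follows my understanding that hue gets an additive shift in [-hue, hue] while saturation and exposure get a multiplicative factor in [1/s, s]; the actual color-space conversion is left out. All sketch_ names are hypothetical, and the values 0.1, 1.5 and 1.5 mentioned below are what I recall from yolov3-voc.cfg, so treat them as assumptions.

```c
#include <assert.h>
#include <stdlib.h>

/* Uniform float in [min, max]; swaps the bounds if they arrive reversed. */
float sketch_rand_uniform(float min, float max)
{
    if (max < min) { float t = min; min = max; max = t; }
    return ((float)rand() / RAND_MAX) * (max - min) + min;
}

/* Random factor in [1/s, s]: draw from [1, s], then invert it half the time. */
float sketch_rand_scale(float s)
{
    assert(s >= 1);
    float scale = sketch_rand_uniform(1, s);
    return (rand() % 2) ? scale : 1.f / scale;
}

/* The three random parameters for one image. */
typedef struct { float dhue, dsat, dexp; } sketch_distortion;

sketch_distortion sketch_sample_distortion(float hue, float saturation, float exposure)
{
    sketch_distortion d;
    d.dhue = sketch_rand_uniform(-hue, hue);  /* additive hue shift */
    d.dsat = sketch_rand_scale(saturation);   /* saturation multiplier */
    d.dexp = sketch_rand_scale(exposure);     /* exposure multiplier */
    return d;
}
```

With sketch_sample_distortion(0.1f, 1.5f, 1.5f), the hue shift stays within ±0.1 and both multipliers stay between about 0.667 and 1.5. Inverting half the time keeps brightening and darkening equally likely.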

        int flip = rand()%2;
        if(flip) flip_image(sized);
        d.X.vals[i] = sized.data;

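The image is flipped horizontally with probability 1/2, and its pixel buffer is then stored in d.X. Because d.X.vals[i] takes ownership of sized.data, only orig is freed at the end of the loop body.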
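A horizontal flip on a planar float buffer can be sketched as below. sketch_flip_horizontal is a hypothetical name and the planar c × h × w layout is an assumption about how the image data is stored.

```c
#include <assert.h>

/* Mirror a planar float image (c x h x w) left to right, in place. */
void sketch_flip_horizontal(float *data, int w, int h, int c)
{
    assert(data && w > 0 && h > 0 && c > 0);
    for (int k = 0; k < c; ++k) {
        for (int y = 0; y < h; ++y) {
            float *row = data + w * (y + h * k);  /* start of row y in plane k */
            for (int x = 0; x < w / 2; ++x) {
                float tmp = row[x];
                row[x] = row[w - 1 - x];
                row[w - 1 - x] = tmp;
            }
        }
    }
}
```

Flipping the pixels alone would leave the boxes pointing at the wrong side of the image, which is presumably why the same flip flag is also passed to fill_truth_detection in the loop.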