A first look at dlib: C++ code

Article link: https://blog.csdn.net/daniaokuye/article/details/81781249

版权

深度学习专栏收录该内容

61 篇文章 1 订阅

订阅专栏

Website: http://dlib.net/. dlib is a library of machine learning algorithms. Its deep learning component, for example, is very small and fast.

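Neural network:

This is one of the more concise ways of writing and implementing a network model that I have seen. The whole package comes to less than 80 MB once compiled, and the model parameters for face detection total only a dozen or so KB. The face detection model, for example, is written like this: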

template <long num_filters, typename SUBNET> using con5d = dlib::con<num_filters,5,5,2,2,SUBNET>;
template <long num_filters, typename SUBNET> using con5  = dlib::con<num_filters,5,5,1,1,SUBNET>;

template <typename SUBNET> using downsampler  = dlib::relu<dlib::affine<con5d<32, dlib::relu<
                dlib::affine<con5d<32, dlib::relu<dlib::affine<con5d<16,SUBNET>>>>>>>>>;
template <typename SUBNET> using rcon5  = dlib::relu<dlib::affine<con5<45,SUBNET>>>;

using net_type = dlib::loss_mmod<dlib::con<1,9,9,1,1,rcon5<rcon5<rcon5<
        downsampler<dlib::input_rgb_image_pyramid<dlib::pyramid_down<6>>>>>>>>;

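What can be seen directly here are the convolution layers (con) and the relu activation layers. I did not look closely at whether affine is an affine transform acting as a kind of normalization layer; in dlib it is a pointwise linear transform (a fixed scale and shift) that stands in for batch normalization at inference time. The code can be found in the official example (http://dlib.net/dnn_mmod_face_detection_ex.cpp.html), which also lists the download address for the network parameters.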
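
As a rough illustration of that idea, the sketch below shows a per-channel affine transform y = gamma * x + beta, and how batch-normalization statistics can be folded into one fixed scale and shift for inference. This is not dlib's implementation: the names AffineParams, fold_batch_norm and apply_affine are hypothetical, and the data layout (one flat vector per channel) is a simplification.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical per-channel parameters of an affine layer: y = gamma * x + beta.
struct AffineParams {
    float gamma;
    float beta;
};

// Fold batch-norm statistics (mean, variance) and its learned scale/shift
// into one fixed scale and shift that can be used at inference time.
inline AffineParams fold_batch_norm(float bn_scale, float bn_shift,
                                    float mean, float variance,
                                    float eps = 1e-5f) {
    assert(variance + eps > 0.0f);  // avoid dividing by zero
    const float inv_std = 1.0f / std::sqrt(variance + eps);
    return {bn_scale * inv_std, bn_shift - bn_scale * mean * inv_std};
}

// Apply the affine transform in place to all values of one channel.
inline void apply_affine(std::vector<float>& channel, const AffineParams& p) {
    for (float& v : channel) {
        v = p.gamma * v + p.beta;
    }
}
```

Applying the folded parameters gives the same result as normalizing with the stored mean and variance and then applying the learned scale and shift, but with a single multiply-add per value.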

Loading the network model:

string path_to_detector = "mmod_human_face_detector.dat";
net_type net;
dlib::deserialize(path_to_detector) >> net;

Producing and reading the results:

vector<Rect> faces;
auto dets = net(cimg_small);
cout << dets.size() << endl;
for (const auto& det : dets) {  // each det is a dlib::mmod_rect
    dlib::rectangle rt = det.rect;
    // C++ has no tuple-style assignment, so assign each value separately
    long tl_x = rt.left();
    long tl_y = rt.top();
    unsigned long w = rt.width();
    unsigned long h = rt.height();
    cout << "rt.left(): " << tl_x << endl;
    faces.push_back(Rect(tl_x, tl_y, w, h));
}

The type of dets can be found by looking in dlib/dnn/loss: it is vector<mmod_rect>. The mmod_rect struct itself is defined at dlib/image_processing/full_object_detection.h:132 (struct mmod_rect).
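
To make the shape of the result more concrete, here is a self-contained sketch with stand-in types. It assumes that mmod_rect exposes the fields rect, detection_confidence, ignore and label. SketchRect, SketchMmodRect, keep_confident and the threshold value are hypothetical names for illustration only; with real dlib you would loop over the vector<mmod_rect> returned by net(...) in the same way.

```cpp
#include <string>
#include <vector>

// Stand-in for dlib::rectangle (hypothetical, inclusive pixel bounds).
struct SketchRect {
    long left, top, right, bottom;
    long width() const { return right - left + 1; }
    long height() const { return bottom - top + 1; }
};

// Stand-in mirroring the fields assumed to exist on dlib::mmod_rect.
struct SketchMmodRect {
    SketchRect rect;
    double detection_confidence = 0.0;
    bool ignore = false;
    std::string label;
};

// Keep only detections that are not ignored and meet the confidence threshold.
inline std::vector<SketchMmodRect> keep_confident(
        const std::vector<SketchMmodRect>& dets, double min_confidence) {
    std::vector<SketchMmodRect> kept;
    for (const auto& det : dets) {
        if (!det.ignore && det.detection_confidence >= min_confidence) {
            kept.push_back(det);
        }
    }
    return kept;
}
```

Filtering on the confidence before converting to OpenCV rectangles keeps the later drawing or cropping code free of weak detections.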

Converting data formats:

dlib::cv_image<dlib::bgr_pixel> cimg_small(image);

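See the linked reference for this conversion. The relevant file is dlib/opencv/cv_image.h. This is also why the dlib and cv namespaces cannot both be pulled in with using namespace at the same time.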

Alternatively, the approach from another linked reference is a bit more direct:

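cv::Mat image = cv::imread(path);   // OpenCV loads images in BGR order
dlib::matrix<dlib::rgb_pixel> img;
// cv_image wraps the Mat without copying; wrap it as bgr_pixel so that
// assign_image copies the data and converts BGR to RGB correctly
dlib::assign_image(img, dlib::cv_image<dlib::bgr_pixel>(image));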

For more conversion methods, see: "Dlib格式与Opencv之间的转化" (Converting between Dlib and OpenCV formats).
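
The pixel type matters because cv::imread stores pixels in B, G, R order, while dlib::rgb_pixel expects R, G, B. Conceptually, converting between the two is a channel swap. The sketch below shows that swap on a raw interleaved byte buffer. bgr_to_rgb is a hypothetical helper for illustration, not a dlib or OpenCV function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Convert an interleaved 8-bit BGR buffer to RGB by swapping channels 0 and 2.
inline std::vector<std::uint8_t> bgr_to_rgb(const std::vector<std::uint8_t>& bgr) {
    assert(bgr.size() % 3 == 0);  // three bytes per pixel
    std::vector<std::uint8_t> rgb(bgr.size());
    for (std::size_t i = 0; i < bgr.size(); i += 3) {
        rgb[i]     = bgr[i + 2];  // R
        rgb[i + 1] = bgr[i + 1];  // G
        rgb[i + 2] = bgr[i];      // B
    }
    return rgb;
}
```

If a BGR image is wrapped as rgb_pixel instead, nothing fails at compile time, but red and blue are silently exchanged in what the network sees.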

Further notes: a good tutorial explains everything, and dlib ships a very good one in tools/python/src/cnn_face_detector.cpp:18 (class cnn_face_detection_model_v1), which contains a complete processing pipeline.

Compiling:

undefined reference to `dlib::tt::add(float, dlib::tensor&, float, dlib::tensor const&)'
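//add could not be found under dlib/cudnn/tensor_tool, yet the latest version from http://dlib.net/ does contain the corresponding file, so I rebuilt and reinstalled dlib.
//Searching for namespace tt locates the file mentioned above.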

Compiling requires cmake. dlib provides an example at https://github.com/davisking/dlib#compiling-your-own-c-programs-that-use-dlib. Only three lines in it really matter.

add_subdirectory(../dlib dlib_build)
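# This is the key line. The first argument is the location of the dlib sources (headers). Relative to the CMakeLists.txt under dlib/examples, that is the dlib folder in the parent directory, so in your own project write an absolute path here.
# The second argument is the directory under build where dlib gets compiled; in dlib's own examples this ends up at examples/build/dlib_build.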


add_executable(assignment_learning_ex assignment_learning_ex.cpp)
target_link_libraries(assignment_learning_ex dlib::dlib)
# These two are the usual build commands, but note the second argument of target_link_libraries (dlib::dlib).

A demo:

//tools/python/src/cnn_face_detector.cpp:18:class cnn_face_detection_model_v1
vector<Rect> detect (Mat img, const int upsample_num_times ){
    vector<Rect> faces;
    dlib::matrix<dlib::rgb_pixel>  image;
    // img is assumed to be in BGR order (OpenCV's default); assign_image converts it to RGB
    dlib::assign_image(image, dlib::cv_image<dlib::bgr_pixel>(img));

    //matrix<dlib::rgb_pixel> ;
    //dlib::load_image(cimg_small, inputName);
    string path_to_detector = "mmod_human_face_detector.dat";
    net_type net;
    dlib::deserialize(path_to_detector) >> net; 
    // Upsampling the image will allow us to detect smaller faces but will cause the
    // program to use more RAM and run longer.
    unsigned int levels = upsample_num_times;
    dlib::pyramid_down<2> pyr;
    while (levels > 0)
    {
        levels--;
        pyramid_up(image, pyr);
    }
    
    auto dets = net(image);
    
    cout<<"dets.size()"<<dets.size()<<"img.size():"<<image.size()<<
    "img.nr():"<<image.nr()<<"img.nc():"<<image.nc()<<endl;

    for (auto det : dets){//type(rt):mmod_rect
        dlib::rectangle rt = pyr.rect_down(det.rect, upsample_num_times);
        long tl_x, tl_y;
        unsigned long h, w;
        tl_x = rt.left()>0? rt.left(): 0;
        tl_y = rt.top()>0? rt.top() : 0;
        h = rt.height();
        w = rt.width();
        cout<<"rt ->left():"<<tl_x<<"y "<<tl_y<<"h "<< h <<"w "<< w <<endl;
        faces.push_back(Rect(tl_x,tl_y,w,h));
    }
    return faces;
}
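
The loop above clamps the left and top coordinates to 0 but leaves the width and height unchanged, so a box that starts outside the image can still extend past its border. A minimal sketch of clamping a box fully to the image bounds follows. Box and clamp_to_image are hypothetical names, and the result uses the same (x, y, width, height) convention as the Rect passed to faces.push_back.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical box in (x, y, width, height) form.
struct Box {
    long x, y, width, height;
};

// Intersect a box given by inclusive corners with an image of size
// img_width x img_height. Returns a zero-sized box if there is no overlap.
inline Box clamp_to_image(long left, long top, long right, long bottom,
                          long img_width, long img_height) {
    assert(img_width > 0 && img_height > 0);
    const long l = std::max(left, 0L);
    const long t = std::max(top, 0L);
    const long r = std::min(right, img_width - 1);
    const long b = std::min(bottom, img_height - 1);
    if (r < l || b < t) {
        return {0, 0, 0, 0};
    }
    return {l, t, r - l + 1, b - t + 1};
}
```

With dlib, the four corner values would come from rt.left(), rt.top(), rt.right() and rt.bottom(), and the image size from the original cv::Mat rather than the upsampled matrix.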