Website: http://dlib.net/. dlib is a library of machine learning algorithms; its deep learning package, for example, is notably small and fast.
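Neural networks:
This is one of the more concise ways of writing and implementing a network model that I have seen, and the whole package compiles to under 80 MB. The model parameters used for face detection come to only a dozen or so KB. For example, the face detection model is written like this: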
template <long num_filters, typename SUBNET> using con5d = dlib::con<num_filters,5,5,2,2,SUBNET>;
template <long num_filters, typename SUBNET> using con5 = dlib::con<num_filters,5,5,1,1,SUBNET>;
template <typename SUBNET> using downsampler = dlib::relu<dlib::affine<con5d<32, dlib::relu<
dlib::affine<con5d<32, dlib::relu<dlib::affine<con5d<16,SUBNET>>>>>>>>>;
template <typename SUBNET> using rcon5 = dlib::relu<dlib::affine<con5<45,SUBNET>>>;
using net_type = dlib::loss_mmod<dlib::con<1,9,9,1,1,rcon5<rcon5<rcon5<
downsampler<dlib::input_rgb_image_pyramid<dlib::pyramid_down<6>>>>>>>>;
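Directly visible in it are the convolution layers (con) and the relu activation layers. I did not look closely into affine; in dlib it applies a learned per-channel scale and shift and takes the place of the batch normalization layer (bn) when a trained network is run for inference. This code can be found in the examples on the official site (http://dlib.net/dnn_mmod_face_detection_ex.cpp.html), and that example lists the download address for the network parameters.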
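To get a feel for how much this stack shrinks its input, here is a minimal sketch of the spatial-size arithmetic. It assumes the usual convolution output formula, and it assumes that dlib's con layer pads by half the kernel size when the stride is 1 and does not pad otherwise. `conv_out` and `detector_out` are hypothetical helper names, not dlib functions, and the image pyramid built by the input layer is not modeled.

```cpp
// Sketch only: output-size arithmetic for the network above.
// Assumption: out = 1 + (in + 2*pad - kernel) / stride, with pad = kernel/2
// for stride 1 and pad = 0 for any other stride.

// Size along one dimension after a single convolution layer.
long conv_out(long in, long kernel, long stride) {
    const long pad = (stride == 1) ? kernel / 2 : 0;
    return 1 + (in + 2 * pad - kernel) / stride;
}

// Size along one dimension after the whole detector stack.
long detector_out(long in) {
    // downsampler: three 5x5 convolutions with stride 2
    for (int i = 0; i < 3; ++i) in = conv_out(in, 5, 2);
    // three rcon5 blocks: 5x5 convolutions with stride 1
    for (int i = 0; i < 3; ++i) in = conv_out(in, 5, 1);
    // final 9x9 convolution with stride 1 and a single filter
    return conv_out(in, 9, 1);
}
```

Under these assumptions a side of 200 pixels comes out as 22, so the output map is roughly 1/8 of the input in each dimension. The relu and affine layers do not change the size.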
Loading the network model:
string path_to_detector = "mmod_human_face_detector.dat";
net_type net;
dlib::deserialize(path_to_detector) >> net;
Producing and reading the results:
vector<Rect> faces;
auto dets = net(cimg_small);
cout<<dets.size();
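for (const auto& det : dets) {  // each det is a dlib::mmod_rect
    dlib::rectangle rt = det.rect;
    // C++ has no tuple assignment, so each field is assigned on its own line.
    long tl_x = rt.left();
    long tl_y = rt.top();
    unsigned long h = rt.height();
    unsigned long w = rt.width();
    cout << "rt ->left():" << rt.left();
    // cv::Rect takes x, y, width, height.
    faces.push_back(Rect(tl_x, tl_y, w, h));
}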
The type of dets can be checked in dlib/dnn/loss: it is vector<mmod_rect>. The mmod_rect type itself is defined at dlib/image_processing/full_object_detection.h:132: struct mmod_rect.
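A dlib rectangle stores inclusive corners (left, top, right, bottom), while an OpenCV Rect stores a corner plus a size, and a detection can extend past the image border. The conversion can be sketched as below. `Box`, `XYWH` and `to_xywh` are hypothetical stand-ins for illustration, not types or functions from either library.

```cpp
#include <algorithm>

// Hypothetical stand-ins for dlib::rectangle and cv::Rect, for illustration only.
struct Box  { long left, top, right, bottom; };  // inclusive corners
struct XYWH { long x, y, w, h; };                // top-left corner plus size

// Convert inclusive corners to x/y/width/height, clipping to a cols x rows image.
XYWH to_xywh(const Box& b, long cols, long rows) {
    const long l  = std::max(b.left, 0L);
    const long t  = std::max(b.top, 0L);
    const long r  = std::min(b.right, cols - 1);
    const long bt = std::min(b.bottom, rows - 1);
    // Inclusive corners: width = right - left + 1; a box outside the image gets size 0.
    const long w = std::max(r - l + 1, 0L);
    const long h = std::max(bt - t + 1, 0L);
    return XYWH{l, t, w, h};
}
```

Clipping all four sides, rather than only left and top, keeps the resulting rectangle safe to use as a region of interest in the source image.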
Converting between data formats:
dlib::cv_image<dlib::bgr_pixel> cimg_small(image);
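See the linked reference for this. The relevant file is dlib/opencv/cv_image.h, which is also why the dlib and cv namespaces cannot be used at the same time.
Alternatively, the approach from another linked post is a bit more direct: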
cv::Mat image = cv::imread(path);  // OpenCV loads color images in BGR order
matrix<rgb_pixel> img;
// Wrap the Mat as BGR; assign_image converts each pixel to RGB while copying.
dlib::assign_image(img, cv_image<bgr_pixel>(image));
For more conversion methods, see: "Dlib格式与Opencv之间的转化" (conversions between dlib and OpenCV formats).
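The declared pixel type matters because OpenCV stores color images in BGR order by default, while the network's input layer is input_rgb_image_pyramid. Here is a minimal sketch of the channel reordering that such a conversion performs. `Bgr`, `Rgb` and `bgr_to_rgb` are hypothetical stand-ins, not the dlib types or its assign_image implementation.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for dlib::bgr_pixel and dlib::rgb_pixel.
struct Bgr { std::uint8_t blue, green, red; };
struct Rgb { std::uint8_t red, green, blue; };

// Copy a BGR buffer into RGB order, one pixel at a time.
std::vector<Rgb> bgr_to_rgb(const std::vector<Bgr>& src) {
    std::vector<Rgb> dst;
    dst.reserve(src.size());
    for (const Bgr& p : src) {
        dst.push_back(Rgb{p.red, p.green, p.blue});
    }
    return dst;
}
```

If a BGR buffer is merely relabeled as RGB without this reordering, red and blue end up swapped in what the network sees.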
Further notes: a good tutorial explains everything, and dlib has a very good one at tools/python/src/cnn_face_detector.cpp:18: class cnn_face_detection_model_v1, which contains a complete processing pipeline.
Compiling:
undefined reference to `dlib::tt::add(float, dlib::tensor&, float, dlib::tensor const&)'
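// I could not find add under dlib/cudnn/tensor_tool, yet the latest version on http://dlib.net/ does contain the corresponding file, so I rebuilt and reinstalled dlib.
// Searching for namespace tt locates the file mentioned above.
Compiling requires cmake. dlib provides an example at https://github.com/davisking/dlib#compiling-your-own-c-programs-that-use-dlib, and only three lines of it matter: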
add_subdirectory(../dlib dlib_build)
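# This is the key line. The first argument is the location of the dlib source directory (where the headers live). For the CMakeLists.txt under dlib/example, the relative location is dlib in the parent folder, so in your own project write the absolute path instead.
# The second argument is the directory for the build output; in dlib itself that path is example/build/dlib_build.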
add_executable(assignment_learning_ex assignment_learning_ex.cpp)
target_link_libraries(assignment_learning_ex dlib::dlib)
# These two are the usual build commands, but note the second argument of target_link_libraries.
A demo:
//tools/python/src/cnn_face_detector.cpp:18:class cnn_face_detection_model_v1
vector<Rect> detect(Mat img, const int upsample_num_times) {
    vector<Rect> faces;
    // Assumes img is in OpenCV's default BGR order (e.g. from cv::imread);
    // assign_image converts the pixels to the RGB matrix the network expects.
    dlib::matrix<dlib::rgb_pixel> image;
    dlib::assign_image(image, dlib::cv_image<dlib::bgr_pixel>(img));
    // The model is reloaded on every call here; load it once if calling repeatedly.
    string path_to_detector = "mmod_human_face_detector.dat";
    net_type net;
    dlib::deserialize(path_to_detector) >> net;
    // Upsampling the image will allow us to detect smaller faces but will cause the
    // program to use more RAM and run longer.
    unsigned int levels = upsample_num_times;
    dlib::pyramid_down<2> pyr;
    while (levels > 0)
    {
        levels--;
        pyramid_up(image, pyr);
    }
    auto dets = net(image);
    cout << "dets.size()" << dets.size() << "img.size():" << image.size()
         << "img.nr():" << image.nr() << "img.nc():" << image.nc() << endl;
    for (const auto& det : dets) {  // each det is a dlib::mmod_rect
        // Map the detection back to the coordinates of the original image.
        dlib::rectangle rt = pyr.rect_down(det.rect, upsample_num_times);
        long tl_x, tl_y;
        unsigned long h, w;
        // Clamp the top-left corner so it does not fall outside the image.
        tl_x = rt.left() > 0 ? rt.left() : 0;
        tl_y = rt.top() > 0 ? rt.top() : 0;
        h = rt.height();
        w = rt.width();
        cout << "rt ->left():" << tl_x << "y " << tl_y << "h " << h << "w " << w << endl;
        faces.push_back(Rect(tl_x, tl_y, w, h));
    }
    return faces;
}
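The call to rect_down in the demo maps a detection found in the upsampled image back to the original image. The idea can be sketched as follows, assuming each pyramid_up with pyramid_down<2> doubles the image size. `BoxD` and `scale_down` are hypothetical names. The sketch captures only the scale factor; dlib's own rect_down may apply additional small offsets.

```cpp
#include <cmath>

// Hypothetical stand-in for a rectangle with real-valued corners.
struct BoxD { double left, top, right, bottom; };

// Map a box found in an image upsampled `levels` times by a factor of 2
// back to the original image. Scale only; sub-pixel offsets are ignored.
BoxD scale_down(const BoxD& b, unsigned int levels) {
    const double s = std::ldexp(1.0, -static_cast<int>(levels));  // 2^-levels
    return BoxD{b.left * s, b.top * s, b.right * s, b.bottom * s};
}
```

With two levels of upsampling every coordinate is divided by 4, which is why a face only 1/4 of the smallest detectable size in the original image can still be found.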