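A while ago I trained OpenPose with PyTorch. I ran into all sorts of problems but eventually got it running, thanks to everyone who helped! In the end, though, I realized that weights trained in PyTorch might lose precision when converted to a caffemodel, or might not be convertible at all, so I am now trying to train OpenPose with Caffe instead. Below are the odd problems I hit while installing the Caffe build used for OpenPose training.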
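1. The cudnn.hpp problem
I have installed Caffe plenty of times, but this one still tripped me up. Following the usual Caffe procedure, I created a build directory and ran cmake; the first problem showed up at make: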
In file included from ./include/caffe/util/device_alternate.hpp:40:0,
from ./include/caffe/common.hpp:19,
from src/caffe/syncedmem.cpp:1:
./include/caffe/util/cudnn.hpp: In function ‘void caffe::cudnn::setConvolutionDesc(cudnnConvolutionStruct**, cudnnTensorDescriptor_t, cudnnFilterDescriptor_t, int, int, int, int)’:
./include/caffe/util/cudnn.hpp:112:3: error: too few arguments to function ‘cudnnStatus_t cudnnSetConvolution2dDescriptor(cudnnConvolutionDescriptor_t, int, int, int, int, int, int, cudnnConvolutionMode_t, cudnnDataType_t)’
CUDNN_CHECK(cudnnSetConvolution2dDescriptor(*conv,
^
In file included from ./include/caffe/util/cudnn.hpp:5:0,
from ./include/caffe/util/device_alternate.hpp:40,
from ./include/caffe/common.hpp:19,
from src/caffe/syncedmem.cpp:1:
/usr/local/cuda/include/cudnn.h:537:27: note: declared here
cudnnStatus_t CUDNNWINAPI cudnnSetConvolution2dDescriptor( cudnnConvolutionDescriptor_t convDesc,
^
Makefile:579: recipe for target '.build_release/src/caffe/syncedmem.o' failed
make: *** [.build_release/src/caffe/syncedmem.o] Error 1
make: *** Waiting for unfinished jobs....
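The key error is: error: too few arguments to function ‘cudnnStatus_t cudnnSetConvolution2dDescriptor’
Solution: replace cudnn.hpp under include/caffe/util/ with the cudnn.hpp from a Caffe tree that has already been compiled successfully on this machine. Likely cause: the author built this Caffe against an older cuDNN than the one installed here. The installed cudnn.h declares cudnnSetConvolution2dDescriptor with a trailing cudnnDataType_t argument that the older cudnn.hpp does not pass.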
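The replacement described above can be sketched as a small shell helper. This is only an illustration: the function name and both directory arguments are hypothetical placeholders, and it assumes you already have a second Caffe checkout that compiles against the installed cuDNN.

```shell
# Hypothetical helper: swap in cudnn.hpp from a Caffe tree that already builds.
# $1 = Caffe checkout that compiles against the installed cuDNN (placeholder path)
# $2 = Caffe checkout used for OpenPose training (placeholder path)
replace_cudnn_header() {
    good_tree="$1"
    pose_tree="$2"
    target="$pose_tree/include/caffe/util/cudnn.hpp"
    # Keep the original header next to the new one for reference.
    cp "$target" "$target.bak" || return 1
    cp "$good_tree/include/caffe/util/cudnn.hpp" "$target"
}

# Optional check of the installed cuDNN version macros
# (header path taken from the error log above):
# grep -A 2 "define CUDNN_MAJOR" /usr/local/cuda/include/cudnn.h
```

Keeping the .bak copy makes it easy to diff the two headers and see exactly which cuDNN calls changed.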
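2. Caffe.build_release/lib/libcaffe.so.1.0.0-rc3' failed
The build could not find cv:aply, or something to that effect. Running pkg-config --modversion opencv showed that the system OpenCV was 2.4.9.1. That is already the newest version apt-get install provides, so updating with sudo apt-get install libopencv-dev python-opencv as before was not an option; I had to download the OpenCV source package and build and install it myself.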
#install opencv3.4.1
Baidu Cloud download link: https://pan.baidu.com/s/1A9xgsglLipBt8BLXQfAP9A extraction code: 1an3
After unpacking, run the following from the OpenCV source directory:
mkdir build # create the build directory
cd build
cmake -D CMAKE_BUILD_TYPE=Release -D CMAKE_INSTALL_PREFIX=/usr/local ..
make -j20 # compile
Once the build succeeds, install:
sudo make install # install
After installation, either opencv_version or pkg-config --modversion opencv will report the current version.
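A quick way to make that version check scriptable is sketched below. The function name is made up for illustration, and it only inspects the version string you hand it; it does not prove that Caffe's build will pick up the same OpenCV.

```shell
# Hypothetical helper: succeed when the given version string has major version 3.
is_opencv3() {
    case "$1" in
        3.*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Usage (requires pkg-config and an installed opencv.pc):
# is_opencv3 "$(pkg-config --modversion opencv)" && echo "OpenCV 3 is visible to pkg-config"
```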
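3. The last problem (OpenCV 3.4 is clearly installed, yet cmake for Caffe still links against 2.4.9.1)
After the installation, cmake still picked up OpenCV 2.4.9. I fiddled with it for a long time without getting it to use 3.4, and when I finally forced the build anyway a new error appeared, which was a real headache (/usr/include/opencv2/contrib/contrib.hpp:760:9: error: ‘vector’ does not name a type):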
/usr/include/opencv2/contrib/contrib.hpp:561:42: error: ‘vector’ has not been declared
CV_OUT vector<vector<Point> >& results, CV_OUT vector<float>& cost,
^
/usr/include/opencv2/contrib/contrib.hpp:561:48: error: expected ‘,’ or ‘...’ before ‘<’ token
CV_OUT vector<vector<Point> >& results, CV_OUT vector<float>& cost,
^
/usr/include/opencv2/contrib/contrib.hpp:760:9: error: ‘vector’ does not name a type
vector<int> Rsr;
^
/usr/include/opencv2/contrib/contrib.hpp:761:9: error: ‘vector’ does not name a type
vector<int> Csr;
^
/usr/include/opencv2/contrib/contrib.hpp:762:9: error: ‘vector’ does not name a type
vector<double> Wsr;
^
/usr/include/opencv2/contrib/contrib.hpp:771:13: error: ‘vector’ does not name a type
vector<double> weights;
^
/usr/include/opencv2/contrib/contrib.hpp:777:9: error: ‘vector’ does not name a type
vector<kernel> w_ker_2D;
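A whole wall of errors; I was ready to reinstall the operating system. Luckily someone had hit this before, thanks to https://blog.csdn.net/qq_38469553/article/details/82424441
Solution:
The fix is simple: in the Caffe source used for training the OpenPose model, comment out the line that includes contrib.hpp.
For example, if the problematic Caffe tree is at ~/caffe-pose-train, then from that directory run: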
gedit src/caffe/cpm_data_transformer.cpp
Then comment out the following line:
#include <opencv2/contrib/contrib.hpp>
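If you prefer not to edit the file by hand, the same change can be sketched with sed. The helper name is hypothetical, and the pattern assumes the include sits at the start of a line exactly as shown above.

```shell
# Hypothetical helper: comment out the contrib.hpp include in the given source file.
# A backup of the untouched file is written next to it with a .bak suffix.
comment_out_contrib_include() {
    src="$1"
    sed -i.bak 's|^#include <opencv2/contrib/contrib.hpp>|// &|' "$src"
}

# Usage, from the Caffe source tree:
# comment_out_contrib_include src/caffe/cpm_data_transformer.cpp
```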
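After that, running make under build still failed, this time with an undefined cv function.
I was sure the cause was that the build could not find the newly installed OpenCV 3, but searching all over turned up no solution. I tried every trick I could find, uninstalling and reinstalling over and over.
Persistence finally paid off when this article gave me the idea: https://blog.csdn.net/xidaoliang/article/details/88576465
In cmake-gui, change the OpenCV path settings so that they point at the new installation.
Then rerun cmake. Hahaha, it finally linked against OpenCV 3.4. Done, time to clock out.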
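For machines without a GUI, a command-line equivalent might look like the sketch below. This is not what I actually ran: it assumes Caffe's CMake scripts locate OpenCV through find_package, which consults the OpenCV_DIR variable, and that OpenCV was installed under the /usr/local prefix used earlier. The helper name is hypothetical.

```shell
# Hypothetical helper: print the directory that contains OpenCVConfig.cmake
# under a given install prefix (first match only).
find_opencv_config_dir() {
    find "$1" -name OpenCVConfig.cmake -exec dirname {} \; 2>/dev/null | head -n 1
}

# Usage, from the Caffe build directory:
# cmake -D OpenCV_DIR="$(find_opencv_config_dir /usr/local)" ..
# grep OpenCV_DIR CMakeCache.txt   # confirm which OpenCV the cache points at
```

Checking CMakeCache.txt afterwards is worthwhile because a stale cache from an earlier cmake run can keep pointing at the old 2.4.9 installation.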
Next up: installing MATLAB to prepare the data.