前言
- 多参考官方文档
- 遇到bug多求助Google,总有人遇到过同样的问题(都1202年了,学会科学上网搞科研、搞项目不用多说了吧。注意:外面的世界太复杂,不发表言论专注学习就好)
- 很多人喜欢用Python搞AprilTags,我也喜欢,但是Python的实时性没法跟C++比,所以还是要硬着头皮搞C++
环境配置
Tutorial: Installation from source for Windows with Visual C++ 2017 (vc15)
官方文档永远是最可靠、最全面的。
配置的时候与官方配置的版本对应上,报错能少老多了。
里面还有其他操作系统的环境配置方法,不过我没有测试过。
报错记录
按照官方文档操作时,我遇到过一个坑,就是在cmake VISP库时,提示找不到Python之类的error,这时需要安装Python2.7。
Python下载
Python2.7安装教程
安装教程里面的系统Path环境变量添加非常重要,这里重点强调一下。
再安装pip
在Windows 10环境中安装Python 2.7.15的pip
之后安装numpy库
win+R快捷键打开cmd
命令框中输入
pip install numpy
再进行cmake就没有报错啦~
相关C++代码介绍
官方文档有提供典型案例,先把官方案例中的关键代码看懂,然后学着官方案例写,满足基本要求应该是没问题的。
个人感觉比较有帮助的几个案例:
- 使用openCV抓取视频
grabOpenCV.cpp
其中读取视频的代码是
cv::VideoCapture cap(device); // open the default camera
如果我们要读取本地的视频文件,只需要将cap中的device改成视频文件的相对路径
cv::VideoCapture cap("test_3.mp4");
- 读取图片并识别AprilTags
tutorial-apriltag-detector.cpp
修改读取图片的相对路径(jpg、png都可以)
std::string input_filename = "AprilTag.pgm";
修改AprilTags的大小
double tagSize = 0.053;
这里很多地方都没有说明,经过我的测试,二维码的尺寸指的是内圈尺寸,如下图所示,代码中的单位为m
修改相机内参
cam.initPersProjWithoutDistortion(615.1674805, 615.1675415, 312.1889954, 243.4373779);
这里需要标定相机。这里用的是单目手机相机做的测试,标定后会得到一个内参矩阵,按数字标号顺序就是需要修改的四个相机内参数据
- 读取视频并识别AprilTags
tutorial-apriltag-detector-live.cpp
终于到重点了,这里把我测试用的代码贴出来
#include <visp3/core/vpConfig.h>
#include <stdio.h>
#include <math.h>
#ifdef VISP_HAVE_MODULE_SENSOR
#include <visp3/sensor/vpV4l2Grabber.h>
#include <visp3/sensor/vp1394CMUGrabber.h>
#include <visp3/sensor/vp1394TwoGrabber.h>
#include <visp3/sensor/vpFlyCaptureGrabber.h>
#include <visp3/sensor/vpRealSense2.h>
#endif
#include <visp3/detection/vpDetectorAprilTag.h>
#include <visp3/gui/vpDisplayGDI.h>
#include <visp3/gui/vpDisplayOpenCV.h>
#include <visp3/gui/vpDisplayX.h>
#include <visp3/core/vpXmlParserCamera.h>
//#undef VISP_HAVE_V4L2
//#undef VISP_HAVE_DC1394
//#undef VISP_HAVE_CMU1394
//#undef VISP_HAVE_FLYCAPTURE
//#undef VISP_HAVE_REALSENSE2
//#undef VISP_HAVE_OPENCV
double filter_dtmean(double m_data);
const int N_dtm = 4; //递推平均滤波的窗口大小
int i_dtm = 0;
double value_buf[N_dtm + 1] = { 0 };
int main(int argc, const char **argv)
{
#if defined(VISP_HAVE_APRILTAG) && \
(defined(VISP_HAVE_V4L2) || defined(VISP_HAVE_DC1394) || defined(VISP_HAVE_CMU1394) || (VISP_HAVE_OPENCV_VERSION >= 0x020100) || \
defined(VISP_HAVE_FLYCAPTURE) || defined(VISP_HAVE_REALSENSE2) )
int opt_device = 0; // For OpenCV and V4l2 grabber to set the camera device
vpDetectorAprilTag::vpAprilTagFamily tagFamily = vpDetectorAprilTag::TAG_36h11;
vpDetectorAprilTag::vpPoseEstimationMethod poseEstimationMethod = vpDetectorAprilTag::HOMOGRAPHY_VIRTUAL_VS;
double tagSize = 0.05352;
float quad_decimate = 6.0;
int nThreads = 1;
std::string intrinsic_file = "";
std::string camera_name = "";
bool display_tag = false;
int color_id = -1;
unsigned int thickness = 2;
bool align_frame = false;
#if !(defined(VISP_HAVE_X11) || defined(VISP_HAVE_GDI) || defined(VISP_HAVE_OPENCV))
bool display_off = true;
std::cout << "Warning: There is no 3rd party (X11, GDI or openCV) to dislay images..." << std::endl;
#else
bool display_off = false;
#endif
vpImage<unsigned char> I;
for (int i = 1; i < argc; i++) {
if (std::string(argv[i]) == "--pose_method" && i + 1 < argc) {
poseEstimationMethod = (vpDetectorAprilTag::vpPoseEstimationMethod)atoi(argv[i + 1]);
}
else if (std::string(argv[i]) == "--tag_size" && i + 1 < argc) {
tagSize = atof(argv[i + 1]);
}
else if (std::string(argv[i]) == "--camera_device" && i + 1 < argc) {
opt_device = atoi(argv[i + 1]);
}
else if (std::string(argv[i]) == "--quad_decimate" && i + 1 < argc) {
quad_decimate = (float)atof(argv[i + 1]);
}
else if (std::string(argv[i]) == "--nthreads" && i + 1 < argc) {
nThreads = atoi(argv[i + 1]);
}
else if (std::string(argv[i]) == "--intrinsic" && i + 1 < argc) {
intrinsic_file = std::string(argv[i + 1]);
}
else if (std::string(argv[i]) == "--camera_name" && i + 1 < argc) {
camera_name = std::string(argv[i + 1]);
}
else if (std::string(argv[i]) == "--display_tag") {
display_tag = true;
}
else if (std::string(argv[i]) == "--display_off") {
display_off = true;
}
else if (std::string(argv[i]) == "--color" && i + 1 < argc) {
color_id = atoi(argv[i + 1]);
}
else if (std::string(argv[i]) == "--thickness" && i + 1 < argc) {
thickness = (unsigned int)atoi(argv[i + 1]);
}
else if (std::string(argv[i]) == "--tag_family" && i + 1 < argc) {
tagFamily = (vpDetectorAprilTag::vpAprilTagFamily)atoi(argv[i + 1]);
}
else if (std::string(argv[i]) == "--z_aligned") {
align_frame = true;
}
else if (std::string(argv[i]) == "--help" || std::string(argv[i]) == "-h") {
std::cout << "Usage: " << argv[0]
<< " [--camera_device <camera device> (default: 0)]"
<< " [--tag_size <tag_size in m> (default: 0.053)]"
" [--quad_decimate <quad_decimate> (default: 1)]"
" [--nthreads <nb> (default: 1)]"
" [--intrinsic <intrinsic file> (default: empty)]"
" [--camera_name <camera name> (default: empty)]"
" [--pose_method <method> (0: HOMOGRAPHY, 1: HOMOGRAPHY_VIRTUAL_VS, "
" 2: DEMENTHON_VIRTUAL_VS, 3: LAGRANGE_VIRTUAL_VS, "
" 4: BEST_RESIDUAL_VIRTUAL_VS, 5: HOMOGRAPHY_ORTHOGONAL_ITERATION) (default: 0)]"
" [--tag_family <family> (0: TAG_36h11, 1: TAG_36h10 (DEPRECATED), 2: TAG_36ARTOOLKIT (DEPRECATED),"
" 3: TAG_25h9, 4: TAG_25h7 (DEPRECATED), 5: TAG_16h5, 6: TAG_CIRCLE21h7, 7: TAG_CIRCLE49h12,"
" 8: TAG_CUSTOM48h12, 9: TAG_STANDARD41h12, 10: TAG_STANDARD52h13) (default: 0)]"
" [--display_tag] [--z_aligned]";
#if (defined(VISP_HAVE_X11) || defined(VISP_HAVE_GDI) || defined(VISP_HAVE_OPENCV))
std::cout << " [--display_off] [--color <color id>] [--thickness <line thickness>]";
#endif
std::cout << " [--help]" << std::endl;
return EXIT_SUCCESS;
}
}
try {
vpCameraParameters cam;
cam.initPersProjWithoutDistortion(820.01708389, 820.15035204, 442.48526619, 247.86602286);
vpXmlParserCamera parser;
if (!intrinsic_file.empty() && !camera_name.empty())
parser.parse(cam, intrinsic_file, camera_name, vpCameraParameters::perspectiveProjWithoutDistortion);
#if defined(VISP_HAVE_V4L2)
vpV4l2Grabber g;
std::ostringstream device;
device << "/dev/video" << opt_device;
std::cout << "Use Video 4 Linux grabber on device " << device.str() << std::endl;
g.setDevice(device.str());
g.setScale(1);
g.open(I);
#elif defined(VISP_HAVE_DC1394)
(void)opt_device; // To avoid non used warning
std::cout << "Use DC1394 grabber" << std::endl;
vp1394TwoGrabber g;
g.open(I);
#elif defined(VISP_HAVE_CMU1394)
(void)opt_device; // To avoid non used warning
std::cout << "Use CMU1394 grabber" << std::endl;
vp1394CMUGrabber g;
g.open(I);
#elif defined(VISP_HAVE_FLYCAPTURE)
(void)opt_device; // To avoid non used warning
std::cout << "Use FlyCapture grabber" << std::endl;
vpFlyCaptureGrabber g;
g.open(I);
#elif defined(VISP_HAVE_REALSENSE2)
(void)opt_device; // To avoid non used warning
std::cout << "Use Realsense 2 grabber" << std::endl;
vpRealSense2 g;
rs2::config config;
config.disable_stream(RS2_STREAM_DEPTH);
config.disable_stream(RS2_STREAM_INFRARED);
config.enable_stream(RS2_STREAM_COLOR, 640, 480, RS2_FORMAT_RGBA8, 30);
g.open(config);
g.acquire(I);
std::cout << "Read camera parameters from Realsense device" << std::endl;
cam = g.getCameraParameters(RS2_STREAM_COLOR, vpCameraParameters::perspectiveProjWithoutDistortion);
#elif defined(VISP_HAVE_OPENCV)
std::cout << "Use OpenCV grabber on device " << opt_device << std::endl;
//cv::VideoCapture g(opt_device); // Open the default camera
cv::VideoCapture g("test_3.mp4"); // 打开"build"目录下的视频文件
if (!g.isOpened()) { // Check if we succeeded
std::cout << "Failed to open the camera" << std::endl;
return -1;
}
cv::Mat frame;
g >> frame; // get a new frame from camera
vpImageConvert::convert(frame, I);
#endif
std::cout << cam << std::endl;
std::cout << "poseEstimationMethod: " << poseEstimationMethod << std::endl;
std::cout << "tagFamily: " << tagFamily << std::endl;
std::cout << "nThreads : " << nThreads << std::endl;
std::cout << "Z aligned: " << align_frame << std::endl;
vpDisplay *d = NULL;
if (!display_off) {
#ifdef VISP_HAVE_X11
d = new vpDisplayX(I);
#elif defined(VISP_HAVE_GDI)
d = new vpDisplayGDI(I);
#elif defined(VISP_HAVE_OPENCV)
d = new vpDisplayOpenCV(I);
#endif
}
vpDetectorAprilTag detector(tagFamily);
detector.setAprilTagQuadDecimate(quad_decimate);
detector.setAprilTagPoseEstimationMethod(poseEstimationMethod);
detector.setAprilTagNbThreads(nThreads);
detector.setDisplayTag(display_tag, color_id < 0 ? vpColor::none : vpColor::getColor(color_id), thickness);
detector.setZAlignedWithCameraAxis(align_frame);
std::vector<double> time_vec;
for (;;) {
#if defined(VISP_HAVE_V4L2) || defined(VISP_HAVE_DC1394) || defined(VISP_HAVE_CMU1394) || defined(VISP_HAVE_FLYCAPTURE) || defined(VISP_HAVE_REALSENSE2)
g.acquire(I);
#elif defined(VISP_HAVE_OPENCV)
g >> frame;
vpImageConvert::convert(frame, I);
#endif
vpDisplay::display(I);
double t = vpTime::measureTimeMs();
std::vector<vpHomogeneousMatrix> cMo_vec;
detector.detect(I, tagSize, cam, cMo_vec);
t = vpTime::measureTimeMs() - t;
time_vec.push_back(t);
std::stringstream ss;
ss << "Detection time: " << t << " ms for " << detector.getNbObjects() << " tags";
vpDisplay::displayText(I, 40, 20, ss.str(), vpColor::red);
for (size_t i = 0; i < cMo_vec.size(); i++) {
vpDisplay::displayFrame(I, cMo_vec[i], cam, tagSize / 2, vpColor::none, 3);
std::cout << " Message: \"" << detector.getMessage(i) << "\"" << std::endl;
std::cout << " Pose: " << vpPoseVector(cMo_vec[i]).t() << std::endl;
//std::cout << "M:\n" << cMo_vec[i] << std::endl;
//std::cout << "R:\n" << vpRotationMatrix(cMo_vec[i]) << std::endl;
//读取pose的前三个数据,分别是x,y,z方向的距离,分别平方求和之后开根号得到相机坐标系与二维码坐标系的直线距离
double distance = sqrt(pow(vpPoseVector(cMo_vec[i]).t().data[0], 2)
+ pow(vpPoseVector(cMo_vec[i]).t().data[1], 2)
+ pow(vpPoseVector(cMo_vec[i]).t().data[2], 2));
//std::cout << "Distance: " << distance << std::endl;
//由于测试的相机固定在轨道车上,因此视频会有抖动,这里加的滤波是为了一定程度的消除距离数据的抖动,正常情况下不需要加滤波
std::cout << "递推平均滤波: " << filter_dtmean(distance) << std::endl;
}
vpDisplay::displayText(I, 20, 20, "Click to quit.", vpColor::red);
vpDisplay::flush(I);
if (vpDisplay::getClick(I, false))
break;
}
std::cout << "Benchmark computation time" << std::endl;
std::cout << "Mean / Median / Std: " << vpMath::getMean(time_vec) << " ms" //计算 double 向量的平均值
<< " ; " << vpMath::getMedian(time_vec) << " ms" //计算 double 向量的中值
<< " ; " << vpMath::getStdev(time_vec) << " ms" << std::endl; //计算 double 向量的标准偏差值
if (!display_off)
delete d;
}
catch (const vpException &e) {
std::cerr << "Catch an exception: " << e.getMessage() << std::endl;
}
return EXIT_SUCCESS;
#else
(void)argc;
(void)argv;
#ifndef VISP_HAVE_APRILTAG
std::cout << "Enable Apriltag support, configure and build ViSP to run this tutorial" << std::endl;
#else
std::cout << "Install a 3rd party dedicated to frame grabbing (dc1394, cmu1394, v4l2, OpenCV, FlyCapture, Realsense2), configure and build ViSP again to use this example" << std::endl;
#endif
#endif
return EXIT_SUCCESS;
}
//递推平均滤波函数
double filter_dtmean(double m_data) {
double sum = 0;
int count;
value_buf[N_dtm] = m_data;
std::cout << "采样第" << i_dtm++ << "次的距离是:" << value_buf[N_dtm] << std::endl;
for (count = 0; count < N_dtm; count++)
{
value_buf[count] = value_buf[count + 1];
sum += value_buf[count];
}
//std::cout << "均值为:" << (double)(sum / N_dtm) << std::endl;
return (double)(sum / N_dtm);
}
这里依然需要修改AprilTags的大小、修改视频相对路径、修改相机内参,跟上文读取图片识别AprilTags说的一样
代码参数说明
这里主要说明上文读取视频识别AprilTags的代码,除了代码中的注释外,我感觉还需要些额外说明,才能让思路更加明朗。
上文重点的代码是
std::cout << " Message: \"" << detector.getMessage(i) << "\"" << std::endl;
std::cout << " Pose: " << vpPoseVector(cMo_vec[i]).t() << std::endl;
//std::cout << "M:\n" << cMo_vec[i] << std::endl;
//std::cout << "R:\n" << vpRotationMatrix(cMo_vec[i]) << std::endl;
//读取pose的前三个数据,分别是x,y,z方向的距离,分别平方求和之后开根号得到相机坐标系与二维码坐标系的直线距离
double distance = sqrt(pow(vpPoseVector(cMo_vec[i]).t().data[0], 2)
+ pow(vpPoseVector(cMo_vec[i]).t().data[1], 2)
+ pow(vpPoseVector(cMo_vec[i]).t().data[2], 2));
//std::cout << "Distance: " << distance << std::endl;
vpPoseVector() 返回六个数,表示二维码坐标系与相机坐标系之间的位姿关系,前三个数表示位置,分别是xyz,单位是m,后三个数表示姿态,分别是rpy,单位是rad。
.t() 是转置。
运行结果
运行速度
很多人跟着官方文档做出来后,发现运行速度很慢,我是通过增大quad_decimate参数后,读取速度有了飞速的提高。官方参数是1.0,相当于原图处理,速度会慢很多
float quad_decimate = 6.0;
修改参数后,480p视频单帧处理2ms左右,1080p视频单帧处理10ms左右。
挖坑:如果后续有时间,会把Python标定相机和AprilTags其他参数意义记录一下