OpenCV4机器学习算法原理与编程实战 (OpenCV 4 Machine Learning: Algorithm Principles and Programming Practice): Notes on the Book's Code

lazyoneguy

Last modified 2022-11-23 14:18:27

First published 2022-11-23 14:02:37

Article link: https://blog.csdn.net/p3116002589/article/details/127999187

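Environment Setup

This project uses VS Code to set up an OpenCV (C++) development environment for study.
For the setup itself, see the blog post "VScode搭建Opencv（C++开发环境）" (Setting Up OpenCV in VS Code: C++ Development Environment).
Some of the book's code has to be changed because of OpenCV version differences; the details are noted in the corresponding source files.
Project repository: https://gitee.com/lazyone/opencv-learning.git
The project is updated from time to time.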

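1. Basic Data Types

OpenCV's data types fall into three broad categories:

(1) Basic data types: assembled directly from C++ primitives (int, float, and so on), these include simple vectors and matrices as well as small geometric types such as cv::Vec, cv::Point, cv::Scalar, cv::Size, cv::Rect, and cv::RotatedRect.

(2) Helper objects: objects that represent more abstract concepts, for example the smart pointer cv::Ptr, the slicing range cv::Range, and the termination criteria cv::TermCriteria.

(3) Large array types: cv::Mat and related classes such as cv::SparseMat.

Each type has its own constructors and operations; the book's code examples are in the chapter 1 .cpp file of the project.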

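cv::Vec (vector)

A fixed-size vector class: the length must be known at compile time.
It is a class template, so the element type is a template parameter.
Many predefined aliases exist (the other basic types have similar aliases).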


// Predefined aliases in the OpenCV source
/** @name Shorter aliases for the most popular specializations of Vec<T,n>
  @{
*/
typedef Vec<uchar, 2> Vec2b;
typedef Vec<uchar, 3> Vec3b;
typedef Vec<uchar, 4> Vec4b;

typedef Vec<short, 2> Vec2s;
typedef Vec<short, 3> Vec3s;
typedef Vec<short, 4> Vec4s;

typedef Vec<ushort, 2> Vec2w;
typedef Vec<ushort, 3> Vec3w;
typedef Vec<ushort, 4> Vec4w;

typedef Vec<int, 2> Vec2i;
typedef Vec<int, 3> Vec3i;
typedef Vec<int, 4> Vec4i;
typedef Vec<int, 6> Vec6i;
typedef Vec<int, 8> Vec8i;

typedef Vec<float, 2> Vec2f;
typedef Vec<float, 3> Vec3f;
typedef Vec<float, 4> Vec4f;
typedef Vec<float, 6> Vec6f;

typedef Vec<double, 2> Vec2d;
typedef Vec<double, 3> Vec3d;
typedef Vec<double, 4> Vec4d;
typedef Vec<double, 6> Vec6d;
/** @} */

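These aliases can be used directly, for example:

cv::Vec<int, 3> myVec;       // the general template form
cv::Vec3i v3i(0, 1, 2);      // vector of 3 ints
cv::Vec2d v2d(1.2, 2.4);     // vector of 2 doubles
cv::Vec2d v2d_1(v2d);        // copy construction
cv::Vec3f v1(1, 0, 0);       // vector of 3 floats
cv::Vec3f v2(1, 1, 0);

The overloaded operators include +, -, *, +=, -=, ==, and !=.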

cv::Point

A container for 2 or 3 values.
The values are accessed through named members (myPoint.x, myPoint.y, and so on).

// Predefined aliases
typedef Point_<int> Point2i;
typedef Point_<int64> Point2l;
typedef Point_<float> Point2f;
typedef Point_<double> Point2d;
typedef Point2i Point;

typedef Point3_<int> Point3i;
typedef Point3_<float> Point3f;
typedef Point3_<double> Point3d;
// See the source for the remaining operations; they are not listed here.

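cv::Scalar

A four-element double-precision vector class (cv::Scalar is cv::Scalar_<double>, which derives from cv::Vec<double, 4>).

cv::Size

A size class that can be converted to and from cv::Point. The difference is in the member names: cv::Point has x and y, whereas cv::Size has width and height.

cv::Rect

A rectangle class that combines the x and y of a Point (the top-left corner) with the width and height of a Size.
It has several constructors; see the source.
The overloaded operators include &, &=, |, |=, +, +=, ==, and !=.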


// Constructors in the OpenCV source
template<typename _Tp> inline
Rect_<_Tp>::Rect_()
    : x(0), y(0), width(0), height(0) {}

template<typename _Tp> inline
Rect_<_Tp>::Rect_(_Tp _x, _Tp _y, _Tp _width, _Tp _height)
    : x(_x), y(_y), width(_width), height(_height) {}

template<typename _Tp> inline
Rect_<_Tp>::Rect_(const Rect_<_Tp>& r)
    : x(r.x), y(r.y), width(r.width), height(r.height) {}

template<typename _Tp> inline
Rect_<_Tp>::Rect_(Rect_<_Tp>&& r) CV_NOEXCEPT
    : x(std::move(r.x)), y(std::move(r.y)), width(std::move(r.width)), height(std::move(r.height)) {}

template<typename _Tp> inline
Rect_<_Tp>::Rect_(const Point_<_Tp>& org, const Size_<_Tp>& sz)
    : x(org.x), y(org.y), width(sz.width), height(sz.height) {}

template<typename _Tp> inline
Rect_<_Tp>::Rect_(const Point_<_Tp>& pt1, const Point_<_Tp>& pt2)
{
    x = std::min(pt1.x, pt2.x);
    y = std::min(pt1.y, pt2.y);
    width = std::max(pt1.x, pt2.x) - x;
    height = std::max(pt1.y, pt2.y) - y;
}

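cv::RotatedRect

A rotated rectangle class with a cv::Point2f member named center, a cv::Size2f member named size, and a float member named angle, which is what allows the rectangle to be rotated.
Difference from cv::Rect: a cv::RotatedRect is positioned relative to its center, whereas a cv::Rect is positioned relative to its top-left corner.

cv::Mat (the most commonly used type)

An array class that represents dense arrays of any number of dimensions (an image that has been read in is a cv::Mat).
Its sparse counterpart is cv::SparseMat.
There are many constructors; see the source, as they are not covered one by one here.
The element type is a depth plus a channel count, written as a combination CV_{8U, 16S, 16U, 32S, 32F, 64F}C{1, 2, 3}; for example, CV_8UC3 is a three-channel array of 8-bit unsigned integers.
There are also special static member functions, such as cv::Mat::ones and cv::Mat::eye.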

2. Basic Image Operations

2.1 Basic Image Operations

The book shows how headers are included in OpenCV 3.0 and later, along with what each header is for.

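Header	Purpose
#include "opencv2/core/core_c.h"	Legacy C data structures and arithmetic routines
#include "opencv2/core/core.hpp"	C++ data structures and arithmetic routines
#include "opencv2/flann/miniflann.hpp"	Approximate nearest-neighbor matching functions
#include "opencv2/imgproc/imgproc_c.h"	Legacy C image processing functions
#include "opencv2/imgproc.hpp"	C++ image processing functions
#include "opencv2/video/background_segm.hpp"	Interfaces for background subtraction algorithms
#include "opencv2/video/tracking.hpp"	Interfaces for video tracking algorithms such as CamShift, meanshift, and optical flow
#include "opencv2/video.hpp"	Umbrella header for video tracking and background segmentation
#include "opencv2/features2d.hpp"	Abstract base classes for 2D feature detectors and descriptor extractors
#include "opencv2/objdetect.hpp"	Cascade face detector, latent SVM classifier, HOG features, and planar patch detector support
#include "opencv2/calib3d.hpp"	Camera calibration and stereo vision
#include "opencv2/ml.hpp"	Machine learning algorithms and dataset wrappers
#include "opencv2/highgui/highgui_c.h"	Legacy C image display, sliders, buttons, mouse interaction, and I/O
#include "opencv2/highgui.hpp"	C++ image display, sliders, buttons, mouse interaction, and I/O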

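2.2 Reading, Displaying, and Saving

cv::imread

imread: reads an image from a file into a cv::Mat; the flags argument selects the format the image is converted to while loading.

// From the OpenCV source
CV_EXPORTS_W Mat imread( const String& filename, int flags = IMREAD_COLOR );
/*
filename: path of the file
flags: a cv::ImreadModes value; the image can be converted as it is read.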
//! Imread flags
enum ImreadModes {
       IMREAD_UNCHANGED            = -1, //!< If set, return the loaded image as is (with alpha channel, otherwise it gets cropped). Ignore EXIF orientation.
       IMREAD_GRAYSCALE            = 0,  //!< If set, always convert image to the single channel grayscale image (codec internal conversion).
       IMREAD_COLOR                = 1,  //!< If set, always convert image to the 3 channel BGR color image.
       IMREAD_ANYDEPTH             = 2,  //!< If set, return 16-bit/32-bit image when the input has the corresponding depth, otherwise convert it to 8-bit.
       IMREAD_ANYCOLOR             = 4,  //!< If set, the image is read in any possible color format.
       IMREAD_LOAD_GDAL            = 8,  //!< If set, use the gdal driver for loading the image.
       IMREAD_REDUCED_GRAYSCALE_2  = 16, //!< If set, always convert image to the single channel grayscale image and the image size reduced 1/2.
       IMREAD_REDUCED_COLOR_2      = 17, //!< If set, always convert image to the 3 channel BGR color image and the image size reduced 1/2.
       IMREAD_REDUCED_GRAYSCALE_4  = 32, //!< If set, always convert image to the single channel grayscale image and the image size reduced 1/4.
       IMREAD_REDUCED_COLOR_4      = 33, //!< If set, always convert image to the 3 channel BGR color image and the image size reduced 1/4.
       IMREAD_REDUCED_GRAYSCALE_8  = 64, //!< If set, always convert image to the single channel grayscale image and the image size reduced 1/8.
       IMREAD_REDUCED_COLOR_8      = 65, //!< If set, always convert image to the 3 channel BGR color image and the image size reduced 1/8.
       IMREAD_IGNORE_ORIENTATION   = 128 //!< If set, do not rotate the image according to EXIF's orientation flag.
     };
*/

cv::imshow

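imshow: displays an image in a window. It is normally combined with other functions, for example cv::waitKey, which lets the window be drawn and keeps it open.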

CV_EXPORTS_W void imshow(const String& winname, InputArray mat);

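/*
winname: name of the window
mat: image data to display
*/
// Note: the doc comment below was copied along from the next declaration in the header, cv::resizeWindow; it does not describe imshow.
/** @brief Resizes window to the specified size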

@note

-   The specified window size is for the image area. Toolbars are not counted.
-   Only windows created without cv::WINDOW_AUTOSIZE flag can be resized.

@param winname Window name.
@param width The new window width.
@param height The new window height.
 */

cv::imwrite

imwrite: saves an image to a file.

CV_EXPORTS_W bool imwrite( const String& filename, InputArray img,
              const std::vector<int>& params = std::vector<int>());
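/*
filename: name of the file to save, including the extension.
img: image to be saved.
params: format-specific parameter pairs, such as the compression level.
The parameter type is std::vector<int>.
The parameters are laid out as: (paramId_1, paramValue_1, paramId_2, paramValue_2, ...)
*/
// Note: the doc comment below was copied along from the next documented function in the header, cv::imdecode; it does not describe imwrite.
/** @brief Reads an image from a buffer in memory.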

The function imdecode reads an image from the specified buffer in the memory. If the buffer is too short or
contains invalid data, the function returns an empty matrix ( Mat::data==NULL ).

See cv::imread for the list of supported formats and flags description.

@note In the case of color images, the decoded images will have the channels stored in **B G R** order.
@param buf Input array or vector of bytes.
@param flags The same flags as in cv::imread, see cv::ImreadModes.
*/

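2.3 Color Space Conversion

"A color space, also called a color model or color system, is meant to make it easy to specify colors in a generally accepted way under some standard. In digital image processing, the most common hardware-oriented model in practice is RGB (red, green, blue), which is used for color monitors and a broad class of color video cameras; the CMY (cyan, magenta, yellow) and CMYK (cyan, magenta, yellow, black) models are aimed at color printers; and HSI (hue, saturation, intensity) is a model that corresponds more closely to how people describe and interpret color." (quoted from 《OpenCV4机器学习算法原理与编程实战》)

For details on the individual color spaces, see the book or other blog posts; they are not covered further here.

cv::cvtColor

cvtColor: converts an image from one color space to another.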

CV_EXPORTS_W void cvtColor( InputArray src, OutputArray dst, int code, int dstCn = 0 );

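// Note: the doc comment below was copied along from the next declaration in the header, cv::cvtColorTwoPlane; it does not describe cvtColor.
/** @brief Converts an image from one color space to another where the source image is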
stored in two planes.

This function only supports YUV420 to RGB conversion as of now.

@param src1: 8-bit image (#CV_8U) of the Y plane.
@param src2: image containing interleaved U/V plane.
@param dst: output image.
@param code: Specifies the type of conversion. It can take any of the following values:
- #COLOR_YUV2BGR_NV12
- #COLOR_YUV2RGB_NV12
- #COLOR_YUV2BGRA_NV12
- #COLOR_YUV2RGBA_NV12
- #COLOR_YUV2BGR_NV21
- #COLOR_YUV2RGB_NV21
- #COLOR_YUV2BGRA_NV21
- #COLOR_YUV2RGBA_NV21
*/

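2.4 Geometric Image Transformations

cv::resize

resize: changes the size of an image. Its parameters are:

src: input image.
dst: output image. When dsize is nonzero, dst has size dsize; otherwise its size is computed from src.size(), fx, and fy. The type of dst is the same as that of src.
dsize: output image size. If dsize is zero, it is computed as:

dsize = cv::Size(round(fx*src.cols), round(fy*src.rows))
dsize and the pair (fx, fy) must not both be zero.
fx: scale factor along the horizontal axis. If fx = 0, it is computed as:

(double)dsize.width/src.cols
fy: scale factor along the vertical axis. If fy = 0, it is computed as:

(double)dsize.height/src.rows
interpolation: interpolation method.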

CV_EXPORTS_W void resize( InputArray src, OutputArray dst,
                          Size dsize, double fx = 0, double fy = 0,
                          int interpolation = INTER_LINEAR );
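The relationship between dsize, fx, and fy is plain arithmetic, so it can be worked through without OpenCV. The sketch below uses only the standard library; SizeWH, sizeFromFactors, factorX, and factorY are hypothetical names, and std::lround stands in for the round() in the formula (OpenCV's own rounding may differ when a product lands exactly on .5).

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for cv::Size, to keep the example library-free.
struct SizeWH {
    int width;
    int height;
};

// Output size used when dsize is zero: each source dimension is scaled
// by its factor and rounded to the nearest integer.
SizeWH sizeFromFactors(int srcCols, int srcRows, double fx, double fy) {
    assert(fx > 0.0 && fy > 0.0);
    return {static_cast<int>(std::lround(fx * srcCols)),
            static_cast<int>(std::lround(fy * srcRows))};
}

// Scale factors used when dsize is given and fx / fy are zero.
double factorX(int dstWidth, int srcCols) {
    assert(srcCols > 0);
    return static_cast<double>(dstWidth) / srcCols;
}

double factorY(int dstHeight, int srcRows) {
    assert(srcRows > 0);
    return static_cast<double>(dstHeight) / srcRows;
}
```

For a 640x480 source, fx = 0.5 and fy = 0.25 give a 320x120 output, and going the other way, a requested 320x120 dsize implies fx = 0.5 and fy = 0.25. With the real API this corresponds to passing cv::Size() together with nonzero fx and fy, or a nonzero dsize with fx and fy left at 0.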

cv::warpAffine

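src: input image.
dst: output image; it has size dsize and the same type as src.
M: 2×3 transformation matrix.
dsize: size of the output image.
flags: combination of an interpolation method and the optional flag WARP_INVERSE_MAP, which indicates that M is the inverse transformation (dst→src).
borderMode: pixel extrapolation method. Transformations such as rotation, translation, and shearing leave some regions of the destination image without pixel values; the extrapolation method (also called the border mode) fills in these "blank" regions.
When borderMode = BORDER_TRANSPARENT, the function does not modify destination pixels that correspond to "outliers" in the source image.
borderValue: value used when the border is constant; defaults to 0.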

CV_EXPORTS_W void warpAffine( InputArray src, OutputArray dst,
                              InputArray M, Size dsize,
                              int flags = INTER_LINEAR,
                              int borderMode = BORDER_CONSTANT,
                              const Scalar& borderValue = Scalar());

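2.5 Histogram Equalization

Histograms are the basis of many spatial-domain processing techniques, and histogram equalization can effectively improve the visual quality of an image.

calcHist: computes the histogram of a set of arrays. Its parameters are:

images: pointer to the source images. There can be several images; all of them must have the same depth (for example CV_8U, CV_16U, or CV_32F) and the same size. Each image can have multiple channels.
nimages: number of source images.
channels: list of the dims channels (dims is another parameter of the function) used to compute the histogram. The channels of the first array are numbered from 0 to images[0].channels()−1, those of the second array from images[0].channels() to images[0].channels()+images[1].channels()−1, and so on.
mask: optional mask. If the matrix is not empty, it must be an 8-bit array of the same size as images[i]. Nonzero mask elements mark the array elements that are counted in the histogram.
hist: output histogram, a dense or sparse dims-dimensional array.
dims: histogram dimensionality; it must be positive and no greater than CV_MAX_DIMS (equal to 32 in the current OpenCV version).
histSize: array with the histogram size in each dimension.
ranges: range of values counted along each dimension of the histogram.
uniform: flag indicating whether the histogram is uniform.
accumulate: accumulation flag. If it is set, the histogram is not cleared at the start when it is allocated. This makes it possible to compute a single histogram from several sets of arrays, or to update the histogram over time.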

CV_EXPORTS void calcHist( const Mat* images, int nimages,
                          const int* channels, InputArray mask,
                          OutputArray hist, int dims, const int* histSize,
                          const float** ranges, bool uniform = true, bool accumulate = false );

equalizeHist

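Performs histogram equalization in a single call: the histogram computation and the remapping of pixel values are handled internally. The input must be an 8-bit single-channel image.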

CV_EXPORTS_W void equalizeHist( InputArray src, OutputArray dst );

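// Note: the doc comment below was copied along from the next declaration in the header, cv::createCLAHE; it does not describe equalizeHist.
/** @brief Creates a smart pointer to a cv::CLAHE class and initializes it.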

@param clipLimit Threshold for contrast limiting.
@param tileGridSize Size of grid for histogram equalization. Input image will be divided into
equally sized rectangular tiles. tileGridSize defines the number of tiles in row and column.
 */
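To show what equalization does to the pixel values, here is the textbook formulation written with the standard library only: build the histogram, accumulate it into a cumulative distribution, and map each intensity through the normalized distribution. It is a sketch for understanding, not a claim about how cv::equalizeHist is implemented; the function name equalize and the flat pixel vector are assumptions of this example.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Textbook histogram equalization for 8-bit grayscale pixels.
// equalize is an illustrative name; pixels is a flat list of intensities.
std::vector<std::uint8_t> equalize(const std::vector<std::uint8_t>& pixels) {
    // 1. Histogram: how many pixels have each intensity.
    std::array<std::size_t, 256> hist{};
    for (std::uint8_t p : pixels) {
        ++hist[p];
    }

    // 2. Cumulative distribution, remembering its first nonzero value.
    std::array<std::size_t, 256> cdf{};
    std::size_t running = 0;
    std::size_t cdfMin = 0;
    for (int v = 0; v < 256; ++v) {
        running += hist[v];
        cdf[v] = running;
        if (cdfMin == 0 && running != 0) {
            cdfMin = running;
        }
    }

    // A constant (or empty) image has nothing to spread out.
    const std::size_t total = pixels.size();
    if (total == cdfMin) {
        return pixels;
    }

    // 3. Lookup table: stretch the cumulative distribution to [0, 255].
    std::array<std::uint8_t, 256> lut{};
    for (int v = 0; v < 256; ++v) {
        if (cdf[v] < cdfMin) {
            continue;  // intensities below the darkest pixel never occur
        }
        const double t = static_cast<double>(cdf[v] - cdfMin) /
                         static_cast<double>(total - cdfMin);
        lut[v] = static_cast<std::uint8_t>(std::lround(t * 255.0));
    }

    // 4. Remap every pixel through the table.
    std::vector<std::uint8_t> out;
    out.reserve(pixels.size());
    for (std::uint8_t p : pixels) {
        out.push_back(lut[p]);
    }
    return out;
}
```

For the four pixels {10, 20, 30, 40}, which occupy only a narrow dark band, the result is {0, 85, 170, 255}: the same ordering, spread over the full range.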