Translation of the Mat Class Detailed Description from the OpenCV Documentation


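I have read a good many articles about the Mat class online, but to get at the truth I ended up going back to the official documentation. I have therefore worked through the Detailed Description section of the Mat class in the OpenCV documentation and share it here, with some of my own understanding added; see the lines marked "Note:".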




n-dimensional dense array class


The class Mat represents an n-dimensional dense numerical single-channel or multi-channel array. It can be used to store real or complex-valued vectors and matrices, grayscale or color images, voxel volumes, vector fields, point clouds, tensors, and histograms (though very high-dimensional histograms may be better stored in a SparseMat). The data layout of the array M is defined by the array M.step[], so that the address of element $(i_0, \dots, i_{M.dims-1})$, where $0 \le i_k < M.size[k]$, is computed as:

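- Note: for images, the difference between dense and sparse storage is that in a dense array the index of each element directly corresponds to its actual position, whereas in a sparse array the coordinates have to be stored separately.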

$$addr(M_{i_0,\dots,i_{M.dims-1}}) = M.data + M.step[0] \cdot i_0 + M.step[1] \cdot i_1 + \dots + M.step[M.dims-1] \cdot i_{M.dims-1}$$

In case of a 2-dimensional array, the above formula is reduced to:


$$addr(M_{i,j}) = M.data + M.step[0] \cdot i + M.step[1] \cdot j$$

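- Note: this is the address of an element in a 2-dimensional array, or equivalently the address of a pixel in a grayscale image. It is the address of the first element plus the offset M.step[0]*i + M.step[1]*j.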
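The address formula above can be sketched in plain C++. This is only an illustration of the arithmetic, not OpenCV code: the function name `byteOffset` and the use of `std::vector` for the steps and coordinates are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Byte offset of an element from the start of the data, following
//   offset = step[0]*i_0 + step[1]*i_1 + ... + step[dims-1]*i_{dims-1}
// `step` holds the per-dimension strides in bytes, `index` the coordinates.
// (Hypothetical helper; OpenCV performs this arithmetic internally.)
std::size_t byteOffset(const std::vector<std::size_t>& step,
                       const std::vector<int>& index)
{
    assert(step.size() == index.size());
    std::size_t offset = 0;
    for (std::size_t k = 0; k < step.size(); ++k) {
        assert(index[k] >= 0);
        offset += step[k] * static_cast<std::size_t>(index[k]);
    }
    return offset;
}
```

For a 2-dimensional array of 4-byte elements whose rows occupy 16 bytes, `byteOffset({16, 4}, {2, 3})` evaluates 2*16 + 3*4 = 44, so the element lives 44 bytes past the data pointer.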

Note that $M.step[i] \ge M.step[i+1]$ (in fact, $M.step[i] \ge M.step[i+1] \cdot M.size[i+1]$). This means that 2-dimensional matrices are stored row-by-row, 3-dimensional matrices are stored plane-by-plane, and so on. M.step[M.dims-1] is minimal and always equal to the element size M.elemSize().


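  • Note on row-by-row and plane-by-plane: a 2-dimensional array such as a grayscale image is stored row by row, so its layout is [Row, Col]. A color image is, strictly speaking, still a 2-dimensional Mat with several channels (dims is 2), but the channels of each pixel are stored contiguously, so in memory it behaves like [Row, Col, Channel]. PyTorch usually uses [Channel, Row, Col], and this interleaving is why cv::split is relatively expensive: it has to de-interleave every pixel.
  • Note on step: step is not the same thing as the shape (M.size). step exists to compute the address of an element from its coordinates; the shape tells you the extent of the array along each dimension.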
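To make the relation between step and shape concrete, the following sketch derives the steps of a gap-free array from its sizes, working from the last dimension backwards. It is a plain C++ illustration, not OpenCV code; the name `continuousSteps` is an assumption made for the example. For an array with padding, each step may be larger than the value computed here, which is why the documentation states the relation as an inequality.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Strides in bytes of an array without gaps, given its per-dimension sizes
// and the size of one element (for a multi-channel image, one element is a
// whole pixel). Hypothetical helper for illustration only.
std::vector<std::size_t> continuousSteps(const std::vector<int>& size,
                                         std::size_t elemSize)
{
    assert(!size.empty());
    std::vector<std::size_t> step(size.size());
    step.back() = elemSize;  // the last step always equals the element size
    for (std::size_t k = size.size() - 1; k > 0; --k)
        step[k - 1] = step[k] * static_cast<std::size_t>(size[k]);
    return step;
}
```

A 240-row by 320-column 8-bit image with 3 channels has 3-byte elements, so `continuousSteps({240, 320}, 3)` yields {960, 3}: each row spans 960 bytes and each pixel 3 bytes.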

So, the data layout in Mat is compatible with the majority of dense array types from the standard toolkits and SDKs, such as NumPy (ndarray), Win32 (device-independent bitmaps), and others, that is, with any array that uses steps (or strides) to compute the position of a pixel. Due to this compatibility, it is possible to make a Mat header for user-allocated data and process it in-place using OpenCV functions.



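  • Note on "header": a Mat can be thought of as a header plus the data block it points to. The header holds structural information, that is, the Mat member variables; the data block holds the actual data, such as image pixels, and members of the header point to it.
  • Note on "in-place": the operation works on the original storage, so operating on this Mat is operating on the user-allocated data itself.
  • Note on "user-allocated data": data that was not created by a Mat member function or constructor but passed in from outside, for example data captured by a camera.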


There are many different ways to create a Mat object. The most popular options are listed below:

  • Use the create(nrows, ncols, type) method or the similar Mat(nrows, ncols, type[, fillValue]) constructor. A new array of the specified size and type is allocated. type has the same meaning as in the cvCreateMat method. For example, CV_8UC1 means an 8-bit single-channel array, CV_32FC2 means a 2-channel (complex) floating-point array, and so on.


    • Note: cvCreateMat is a function of the legacy C API that allocates an old C-style CvMat; that API is now deprecated.

    // make a 7x7 complex matrix filled with 1+3j.
    Mat M(7,7,CV_32FC2,Scalar(1,3));
    // and now turn M to a 100x60 15-channel 8-bit matrix.
    // The old content will be deallocated
    M.create(100,60,CV_8UC(15));

    As noted in the introduction to this chapter, create() allocates a new array only when the shape or type of the current array differs from the specified ones.


  • Create a multi-dimensional array:

    // create a 100x100x100 8-bit array
    int sz[] = {100, 100, 100};
    Mat bigCube(3, sz, CV_8U, Scalar::all(0));

    If you pass the number of dimensions = 1 to the Mat constructor, the created array will still be 2-dimensional, with the number of columns set to 1. So, Mat::dims is always >= 2 (it can also be 0 when the array is empty).

  • Use a copy constructor or assignment operator where there can be an array or expression on the right side (see below). As noted in the introduction, the array assignment is an O(1) operation because it only copies the header and increases the reference counter. The Mat::clone() method can be used to get a full (deep) copy of the array when you need it.



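    • Note: it is not clear to me what "see below" in the original refers to.
    • Note: OpenCV manages the array data with reference counting, much like a smart pointer. This raises the question of deep versus shallow copies: for efficiency, the copy constructor and the assignment operator of Mat make shallow copies.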
  • Construct a header for a part of another array. It can be a single row, single column, several rows, several columns, rectangular region in the array (called a minor in algebra) or a diagonal. Such operations are also O(1) because the new header references the same data. You can actually modify a part of the array using this feature, for example:


    // add the 5-th row, multiplied by 3 to the 3rd row
    M.row(3) = M.row(3) + M.row(5)*3;
    // now copy the 7-th column to the 1-st column
    // M.col(1) = M.col(7); // this will not work
    Mat M1 = M.col(1);
    M.col(7).copyTo(M1);
    // create a new 320x240 image
    Mat img(Size(320,240),CV_8UC3);
    // select a ROI
    Mat roi(img, Rect(10,10,100,100));
    // fill the ROI with (0,255,0) (which is green in RGB space);
    // the original 320x240 image will be modified
    roi = Scalar(0,255,0);


    • Note: in this passage, "header" can be understood as the Mat object itself.

    Due to the additional datastart and dataend members, it is possible to compute a relative sub-array position in the main container array using locateROI():


    Mat A = Mat::eye(10, 10, CV_32S);
    // extracts A columns, 1 (inclusive) to 3 (exclusive).
    Mat B = A(Range::all(), Range(1, 3));
    // extracts B rows, 5 (inclusive) to 9 (exclusive).
    // that is, C ~ A(Range(5, 9), Range(1, 3))
    Mat C = B(Range(5, 9), Range::all());
    Size size; Point ofs;
    C.locateROI(size, ofs);
    // size will be (width=10,height=10) and the ofs will be (x=1, y=5)

    As in case of whole matrices, if you need a deep copy, use the clone() method of the extracted sub-matrices.



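    • Note: the original does not carefully distinguish between matrices, arrays and headers, or between pixels and elements, and admittedly my own wording is not fully consistent either.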
  • Make a header for user-allocated data. It can be useful to do the following:


    1. Process “foreign” data using OpenCV (for example, when you implement a DirectShow* filter or a processing module for gstreamer, and so on). For example:


      void process_video_frame(unsigned char* pixels,
                               int width, int height, int step)
      {
          // wrap the caller's buffer in a header; no pixel data is copied
          Mat img(height, width, CV_8UC3, pixels, step);
          // blur in place, so the caller's buffer itself is modified
          GaussianBlur(img, img, Size(7,7), 1.5, 1.5);
      }

    2. Quickly initialize small matrices and/or get a super-fast element access.


      double m[3][3] = {{a, b, c}, {d, e, f}, {g, h, i}};
      Mat M = Mat(3, 3, CV_64F, m).inv();
  • Use MATLAB-style array initializers, zeros(), ones(), eye(), for example:


    // create a double-precision identity matrix and add it to M.
    M += Mat::eye(M.rows, M.cols, CV_64F);
  • Use a comma-separated initializer


    // create a 3x3 double-precision identity matrix
    Mat M = (Mat_<double>(3,3) << 1, 0, 0, 0, 1, 0, 0, 0, 1);

    With this approach, you first call a constructor of the Mat class with the proper parameters, and then you just put << operator followed by comma-separated values that can be constants, variables, expressions, and so on. Also, note the extra parentheses required to avoid compilation errors.
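The locateROI() result from the sub-array example in the list above can be checked by hand. A sub-array header points into its parent's buffer, so its position follows from the byte distance between its data pointer and the start of the parent's data. The sketch below shows only this arithmetic in plain C++; the names `Offset` and `roiOffset` are assumptions made for the example, and it is not OpenCV's implementation.

```cpp
#include <cassert>
#include <cstddef>

// (x, y) position of a sub-array inside its parent. Hypothetical type.
struct Offset { int x; int y; };

// `delta`    : sub-array data pointer minus the parent's datastart, in bytes
// `rowStep`  : step[0] of the parent, in bytes
// `elemSize` : bytes per element
Offset roiOffset(std::ptrdiff_t delta, std::size_t rowStep, std::size_t elemSize)
{
    assert(delta >= 0 && rowStep > 0 && elemSize > 0);
    const std::size_t d = static_cast<std::size_t>(delta);
    Offset ofs;
    ofs.y = static_cast<int>(d / rowStep);               // whole rows skipped
    ofs.x = static_cast<int>((d % rowStep) / elemSize);  // elements into the row
    return ofs;
}
```

In that example A is a 10x10 CV_32S matrix, so an element takes 4 bytes and a row 40 bytes. C starts at row 5, column 1 of A, that is 5*40 + 1*4 = 204 bytes into the buffer, and `roiOffset(204, 40, 4)` gives x = 1, y = 5, matching the ofs reported there.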


Once the array is created, it is automatically managed via a reference-counting mechanism. If the array header is built on top of user-allocated data, you should handle the data by yourself. The array data is deallocated when no one points to it. If you want to release the data pointed to by an array header before the array destructor is called, use Mat::release().


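  • Note: "built on top of user-allocated data" refers, as I understand it, to user data or any other data that was not allocated by the Mat class itself. Mat does not free such data; if it did, the user's own pointer would be left dangling.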
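The behaviour described above (copying a header shares one buffer, while clone() produces an independent one) can be modelled in a few lines of standard C++. This is a toy model under stated assumptions: `ToyMat` is a hypothetical type, and `std::shared_ptr` stands in for Mat's own reference counter, which is implemented differently inside OpenCV.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a Mat header: a little metadata plus a
// reference-counted pointer to the element buffer.
struct ToyMat {
    int rows = 0, cols = 0;
    std::shared_ptr<std::vector<unsigned char>> data;

    ToyMat(int r, int c, unsigned char fill)
        : rows(r), cols(c),
          data(std::make_shared<std::vector<unsigned char>>(
              static_cast<std::size_t>(r) * static_cast<std::size_t>(c), fill)) {}

    // The implicit copy constructor and assignment copy only the header
    // fields and bump the reference count (shallow copy).

    // Deep copy: allocate a new buffer and copy every element.
    ToyMat clone() const {
        ToyMat out(*this);  // shallow copy of the header first
        out.data = std::make_shared<std::vector<unsigned char>>(*data);
        return out;
    }
};
```

Writing through a copied header changes the data seen by the original, whereas writing through the result of clone() does not; the buffer is freed once the last header referring to it is destroyed.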


The next important thing to learn about the array class is element access. This manual already described how to compute an address of each array element. Normally, you are not required to use the formula directly in the code. If you know the array element type (which can be retrieved using the method Mat::type()), you can access the element $M_{ij}$ of a 2-dimensional array as:

M.at<double>(i,j) += 1.f;

assuming that M is a double-precision floating-point array. There are several variants of the method at for a different number of dimensions.


If you need to process a whole row of a 2D array, the most efficient way is to get the pointer to the row first, and then just use the plain C operator [] :


// compute sum of positive matrix elements
// (assuming that M is a double-precision matrix)
double sum=0;
for(int i = 0; i < M.rows; i++)
{
    const double* Mi = M.ptr<double>(i);
    for(int j = 0; j < M.cols; j++)
        sum += std::max(Mi[j], 0.);
}

Some operations, like the one above, do not actually depend on the array shape. They just process elements of an array one by one (or elements from multiple arrays that have the same coordinates, for example, array addition). Such operations are called element-wise. It makes sense to check whether all the input/output arrays are continuous, namely, have no gaps at the end of each row. If yes, process them as a long single row:



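  • Note: physical memory is one-dimensional, so a continuous n-dimensional array can be treated in memory as a single one-dimensional array and processed quickly element by element. Checking whether an array is continuous is therefore important, and other matrix libraries offer similar checks.

// compute the sum of positive matrix elements, optimized variant
double sum=0;
int cols = M.cols, rows = M.rows;
if(M.isContinuous())
{
    cols *= rows;
    rows = 1;
}
for(int i = 0; i < rows; i++)
{
    const double* Mi = M.ptr<double>(i);
    for(int j = 0; j < cols; j++)
        sum += std::max(Mi[j], 0.);
}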

In case of the continuous matrix, the outer loop body is executed just once. So, the overhead is smaller, which is especially noticeable in case of small matrices.
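The effect of the continuity check can be reproduced without OpenCV on a raw buffer whose rows may carry padding. The sketch below is a plain C++ illustration; the function `sumPositive` and its `stride` parameter (row length in elements, including padding) are assumptions made for the example and play the roles of M.ptr<double>(i) and M.step[0].

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Sum of the positive elements of a rows x cols matrix of doubles stored in
// `buf` with `stride` doubles per row (stride >= cols; the extra slots at the
// end of each row are padding and must be skipped).
double sumPositive(const std::vector<double>& buf, int rows, int cols, int stride)
{
    assert(rows >= 0 && cols >= 0 && stride >= cols);
    assert(buf.size() >= static_cast<std::size_t>(rows) * static_cast<std::size_t>(stride));
    if (stride == cols) {  // no gaps between rows: treat as one long row
        cols *= rows;
        rows = 1;
    }
    double sum = 0;
    for (int i = 0; i < rows; ++i) {
        const double* row = buf.data() + static_cast<std::size_t>(i) * stride;
        for (int j = 0; j < cols; ++j)
            sum += std::max(row[j], 0.0);
    }
    return sum;
}
```

With `stride == cols` the outer loop runs once over all rows*cols elements; with padding it runs once per row and never touches the padding slots, so both layouts give the same sum for the same matrix.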


Finally, there are STL-style iterators that are smart enough to skip gaps between successive rows:


// compute sum of positive matrix elements, iterator-based variant
double sum=0;
MatConstIterator_<double> it = M.begin<double>(), it_end = M.end<double>();
for(; it != it_end; ++it)
    sum += std::max(*it, 0.);

The matrix iterators are random-access iterators, so they can be passed to any STL algorithm, including std::sort().


