Mat - The Basic Image Container

A matrix of the mirror of a car


The first thing you need to be familiar with is how OpenCV stores and handles images.



Mat

The original C interface (OpenCV 1.x) stored an image in memory using a C structure called IplImage. The problem with this is that it brings to the table all the downsides of the C language. The biggest issue is manual memory management.


Luckily C++ came around and introduced the concept of classes, making things easier for the user through automatic memory management (more or less). The good news is that C++ is fully compatible with C, so no compatibility issues can arise from making the change.


The main downside of the C++ interface is that many embedded development systems at the moment support only C. Therefore, unless you are targeting embedded platforms, there’s no point in using the old methods (unless you’re a masochist programmer and you’re asking for trouble).


The first thing you need to know about Mat is that you no longer need to manually allocate its memory and release it as soon as you do not need it. While doing this is still a possibility, most of the OpenCV functions will allocate their output data automatically.



Mat is basically a class with two data parts: the matrix header (containing information such as the size of the matrix, the method used for storing, the address at which the matrix is stored, and so on) and a pointer to the matrix containing the pixel values (taking any dimensionality depending on the method chosen for storing). The matrix header size is constant; the size of the matrix itself, however, may vary from image to image and is usually larger by orders of magnitude.


OpenCV uses a reference counting system. The idea is that each Mat object has its own header, but the matrix may be shared between two instances by having their matrix pointers point to the same address. Moreover, the copy operators will only copy the headers and the pointer to the large matrix, not the data itself.


Mat A, C;                                 // creates just the header parts
A = imread(argv[1], CV_LOAD_IMAGE_COLOR); // here we'll know the method used (allocate matrix)

Mat B(A);                                 // Use the copy constructor

C = A;                                    // Assignment operator

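Neither the copy constructor nor the assignment operator copies the pixel data. Each of them only copies the header into the new object, and the matrix data is shared.

All the above objects, in the end, point to the same single data matrix. Their headers are different, but because the data is shared, a modification made through any one of them is visible through all the others.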


The really interesting part is that you can create headers which refer to only a subsection of the full data. For example, to create a region of interest (ROI) in an image you just create a new header with the new boundaries:


Mat D (A, Rect(10, 10, 100, 100) );  // using a rectangle
Mat E = A(Range::all(), Range(1,3)); // using row and column boundaries (note the scope operator in Range::all())
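
To make the "new header, same data" idea concrete, here is a minimal sketch in plain C++. It does not use OpenCV: ToyView, makeImage and roi are hypothetical names invented for this illustration, and the layout (one byte per pixel, rows stored back to back) is an assumption chosen to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical toy type, not cv::Mat: a small header (rows, cols, step,
// offset) plus a shared pointer to one flat buffer of single-channel bytes.
struct ToyView {
    std::shared_ptr<std::vector<std::uint8_t>> buffer; // shared pixel data
    int rows = 0;
    int cols = 0;
    std::size_t step = 0;   // elements between two consecutive row starts
    std::size_t offset = 0; // index of this view's top-left element

    // Access element (r, c) of this view inside the shared buffer.
    std::uint8_t& at(int r, int c) {
        assert(r >= 0 && r < rows && c >= 0 && c < cols);
        return (*buffer)[offset + static_cast<std::size_t>(r) * step
                         + static_cast<std::size_t>(c)];
    }
};

// Allocate a rows x cols image with every element set to `value`.
ToyView makeImage(int rows, int cols, std::uint8_t value) {
    assert(rows >= 0 && cols >= 0);
    ToyView v;
    v.buffer = std::make_shared<std::vector<std::uint8_t>>(
        static_cast<std::size_t>(rows) * cols, value);
    v.rows = rows;
    v.cols = cols;
    v.step = static_cast<std::size_t>(cols);
    return v;
}

// Build a new header for the rectangle whose top-left corner is (x, y).
// No pixel is copied: only the offset and the dimensions change.
ToyView roi(const ToyView& parent, int x, int y, int width, int height) {
    assert(x >= 0 && y >= 0 && width >= 0 && height >= 0);
    assert(x + width <= parent.cols && y + height <= parent.rows);
    ToyView v = parent; // copies the header and the shared pointer only
    v.rows = height;
    v.cols = width;
    v.offset = parent.offset + static_cast<std::size_t>(y) * parent.step
               + static_cast<std::size_t>(x);
    return v;
}
```

Writing through a view returned by roi() changes the parent image as well, because both headers index into the same buffer. That is the behavior to keep in mind when working with D and E above.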
 



Now you may ask: if the matrix itself may belong to multiple Mat objects, who takes responsibility for cleaning it up when it’s no longer needed? The short answer is: the last object that used it. This is handled by a reference counting mechanism. Whenever somebody copies the header of a Mat object, a counter is increased for the matrix. Whenever a header is cleaned up, this counter is decreased. When the counter reaches zero, the matrix is freed too.
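
The counting rule can be mimicked in a few lines of standard C++. This is only a sketch of the idea, not how OpenCV implements it: ToyMat and refCount are hypothetical names, and std::shared_ptr stands in for the library's own counter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for a Mat-like object: small header fields plus a
// reference-counted pointer to the (potentially large) pixel buffer.
struct ToyMat {
    int rows = 0;
    int cols = 0;
    std::shared_ptr<std::vector<std::uint8_t>> data;

    ToyMat() = default; // header only, nothing allocated yet

    ToyMat(int r, int c, std::uint8_t value) : rows(r), cols(c) {
        assert(r >= 0 && c >= 0);
        data = std::make_shared<std::vector<std::uint8_t>>(
            static_cast<std::size_t>(r) * c, value);
    }

    // The implicitly generated copy constructor and assignment operator copy
    // rows, cols and the shared pointer, which bumps the counter by one.

    // Number of headers currently sharing the buffer (0 if none is allocated).
    long refCount() const { return data.use_count(); }
};
```

Copying a ToyMat raises refCount(), destroying a copy lowers it, and the buffer is released when the last owner goes away, which mirrors the "last object that used it" rule described above.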


Sometimes you will want to copy the matrix itself too, so OpenCV provides the clone() and copyTo() functions.


Mat F = A.clone();
Mat G;
A.copyTo(G);


To summarize:

  • Output image allocation for OpenCV functions is automatic (unless specified otherwise).
  • You do not need to think about memory management with OpenCV's C++ interface.
  • The assignment operator and the copy constructor only copy the header.
  • The underlying matrix of an image may be copied using the clone() and copyTo() functions.


Storing methods


How the pixel values are stored comes down to two choices: the color space and the data type. The color space refers to how we combine color components in order to code a given color. The simplest one is grayscale, where the colors at our disposal are black and white. Combining these allows us to create many shades of gray.


For color images we have many more methods to choose from. Each of them breaks a color down into three or four basic components, and we can combine these to create all the others. The most popular one is RGB. To code the transparency of a color, a fourth element, alpha (A), is sometimes added.


There are, however, many other color systems each with their own advantages:

  • RGB is the most common, as our eyes use something similar and our display systems also compose colors this way.
  • HSV and HLS decompose colors into their hue, saturation and value/luminance components, which is a more natural way for us to describe colors. You might, for example, discard the last component, making your algorithm less sensitive to the light conditions of the input image.
  • YCrCb is used by the popular JPEG image format.
  • CIE L*a*b* is a perceptually uniform color space, which comes in handy if you need to measure the distance of a given color to another color.

Each of the building components has its own valid domain.

 

The smallest data type possible is char, which means one byte or 8 bits. This may be unsigned (so it can store values from 0 to 255) or signed (values from -128 to +127).

Although with three components this already gives about 16.7 million possible colors to represent (as in the case of RGB), we may acquire even finer control by using the float (4 byte = 32 bit) or double (8 byte = 64 bit) data types for each component.
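
The numbers in this section are easy to check with the standard library. The helper names below (colorCount, Range8 and the two range functions) are made up for this sketch and have nothing to do with OpenCV.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Number of distinct colors when each of `channels` components is stored in
// `bitsPerChannel` bits, for example 3 channels of 8 bits for RGB.
std::uint64_t colorCount(int channels, int bitsPerChannel) {
    assert(channels > 0 && bitsPerChannel > 0);
    assert(channels * bitsPerChannel < 64); // keep the shift well defined
    return std::uint64_t{1} << (channels * bitsPerChannel);
}

// Smallest and largest value of an 8-bit component.
struct Range8 {
    int min;
    int max;
};

// Range of an unsigned 8-bit component.
Range8 unsignedCharRange() {
    return {std::numeric_limits<std::uint8_t>::min(),
            std::numeric_limits<std::uint8_t>::max()};
}

// Range of a signed 8-bit component.
Range8 signedCharRange() {
    return {std::numeric_limits<std::int8_t>::min(),
            std::numeric_limits<std::int8_t>::max()};
}
```

For three 8-bit channels, colorCount(3, 8) evaluates to 2^24 = 16,777,216, which is where the "16 million colors" figure comes from.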


Creating a Mat object explicitly


For debugging purposes it is much more convenient to see the actual values. You can do this using the << operator of Mat. Be aware that this only works for two-dimensional matrices.


Although Mat works really well as an image container, it is also a general matrix class. Therefore, it is possible to create and manipulate multidimensional matrices. You can create a Mat object in multiple ways:

  • Mat() Constructor

        Mat M(2,2, CV_8UC3, Scalar(0,0,255));
        cout << "M = " << endl << " " << M << endl << endl;
    




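Treat each (0, 0, 255) triple as a single element. The 2x2 matrix is then

a, b

c, d

and its dimensionality is 2. Thinking of it this way makes it easier to picture the three-dimensional matrix created further below.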


For two-dimensional and multichannel images we first define their size by row and column count, here (2,2).

Then we need to specify the data type to use for storing the elements and the number of channels per matrix point. To do this we have multiple definitions constructed according to the following convention:

CV_[The number of bits per item][Signed or Unsigned][Type Prefix]C[The channel number]

For instance, CV_8UC3 means we use unsigned char types that are 8 bits long, and each pixel has three of these to form the three channels. These are predefined for up to four channels. The Scalar is a short vector of four elements. Specify it and you can initialize all matrix points with a custom value.

For example, with Scalar(0,0,255), as the output shows, every point gets the three channel values B = 0, G = 0, R = 255.
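
The naming convention also has a simple numeric side: a type code packs the depth and the channel count into one integer. The sketch below mirrors that packing in plain C++. The depth numbers and the shift by 3 bits follow my reading of OpenCV's CV_MAKETYPE macro, so treat them as an assumption and check the headers of your version; ToyDepth, makeType, depthOf, channelsOf and typeName are hypothetical names.

```cpp
#include <cassert>
#include <string>

// Depth codes, assumed to match OpenCV's CV_8U ... CV_64F constants.
enum ToyDepth {
    kDepth8U = 0, kDepth8S = 1, kDepth16U = 2, kDepth16S = 3,
    kDepth32S = 4, kDepth32F = 5, kDepth64F = 6
};

// Pack a depth and a channel count into one type code:
// the low 3 bits hold the depth, the bits above hold (channels - 1).
int makeType(ToyDepth depth, int channels) {
    assert(channels >= 1);
    return static_cast<int>(depth) + ((channels - 1) << 3);
}

// Recover the two parts from a type code.
int depthOf(int type) { return type & 7; }
int channelsOf(int type) { return (type >> 3) + 1; }

// Rebuild the readable name, e.g. "CV_8UC3".
std::string typeName(int type) {
    static const char* const names[] = {"8U", "8S", "16U", "16S",
                                        "32S", "32F", "64F"};
    const int d = depthOf(type);
    assert(d >= 0 && d <= 6); // depth code 7 is not covered by this sketch
    return std::string("CV_") + names[d] + "C"
           + std::to_string(channelsOf(type));
}
```

Under these assumptions makeType(kDepth8U, 3) gives 16, and typeName(16) turns it back into "CV_8UC3".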


    Mat image(3,4,CV_8UC3,Scalar(0,0,255));        // rows, columns
    cout << "image = " << endl << " " << image << endl << endl;
    imwrite(filepathSave,image);                   // filepathSave is the output path, defined elsewhere
    cout << (image.depth()==CV_8U) << endl;        // prints 1: the depth is CV_8U
    cout << image.channels() << endl;              // prints 3


image =
 [0, 0, 255, 0, 0, 255, 0, 0, 255, 0, 0, 255;
  0, 0, 255, 0, 0, 255, 0, 0, 255, 0, 0, 255;
  0, 0, 255, 0, 0, 255, 0, 0, 255, 0, 0, 255]

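Treat each [0,0,255] as a single unit, one point. The 3x4 Mat then has 12 points. Because the channel order is BGR, the image written to the file is solid red.

image.data points to the pixel values as a one-dimensional array of bytes.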
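
Since the data is one flat array, locating a channel of a pixel is just index arithmetic. The sketch below assumes a row-major, channel-interleaved buffer with no padding between rows (a real Mat may have padding, which is why it carries a separate step value). flatIndex and makeSolidBgr are hypothetical helpers written for this note.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Index of channel `ch` of the pixel at (row, col) in a flat, row-major,
// channel-interleaved buffer without padding between rows.
std::size_t flatIndex(int row, int col, int ch, int cols, int channels) {
    assert(row >= 0 && col >= 0 && col < cols);
    assert(ch >= 0 && ch < channels);
    return (static_cast<std::size_t>(row) * cols + col) * channels + ch;
}

// Build a rows x cols buffer in which every pixel holds the same B, G, R
// triple, mimicking Mat(rows, cols, CV_8UC3, Scalar(b, g, r)).
std::vector<std::uint8_t> makeSolidBgr(int rows, int cols, std::uint8_t b,
                                       std::uint8_t g, std::uint8_t r) {
    assert(rows >= 0 && cols >= 0);
    std::vector<std::uint8_t> buf(static_cast<std::size_t>(rows) * cols * 3);
    for (int y = 0; y < rows; ++y) {
        for (int x = 0; x < cols; ++x) {
            buf[flatIndex(y, x, 0, cols, 3)] = b; // blue comes first
            buf[flatIndex(y, x, 1, cols, 3)] = g;
            buf[flatIndex(y, x, 2, cols, 3)] = r; // red comes last
        }
    }
    return buf;
}
```

For the 3x4 image above this yields 36 bytes, and every third byte, starting at index 2, is the red value 255.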

C++: Mat::Mat(int rows, int cols, int type, const Scalar& s) creates a two-dimensional matrix.

Although the Mat is 3x4, it is still two-dimensional: it has only the row and column dimensions, so we can picture it as a single 2D plane. See below for how to create a three-dimensional matrix.


If you need more channels, you can create the type with the macro form, setting the channel number in parentheses as you can see below.


C++: Mat::Mat(int ndims, const int* sizes, int type, const Scalar& s) creates an N-dimensional matrix.

  • Use C/C++ arrays and initialize via the constructor

    int sz[3] = {2,2,2};
    Mat L(3,sz, CV_8UC(1), Scalar::all(0));


The example above shows how to create a matrix with more than two dimensions. Specify its number of dimensions, then pass a pointer containing the size of each dimension; the rest remains the same. The remaining arguments describe the color information of a single pixel point, while the leading ones serve to locate a pixel point.

For example, in three dimensions we need three coordinates to locate a pixel. Once the pixel is located, the remaining arguments say which components make up its color, the data type, the default value, and so on.
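
The "coordinates locate an element" idea can be written down as index arithmetic. This is a sketch for a densely packed, row-major N-dimensional array with one value per element, which is an assumption made for illustration and not a statement about Mat's internal layout. elementCount and flatIndexNd are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Total number of elements in an array with the given size per dimension.
std::size_t elementCount(const std::vector<int>& sizes) {
    std::size_t count = 1;
    for (int s : sizes) {
        assert(s > 0);
        count *= static_cast<std::size_t>(s);
    }
    return count;
}

// Flat position of the element addressed by one coordinate per dimension,
// with the last dimension varying fastest (row-major order).
std::size_t flatIndexNd(const std::vector<int>& sizes,
                        const std::vector<int>& idx) {
    assert(sizes.size() == idx.size());
    std::size_t offset = 0;
    for (std::size_t i = 0; i < sizes.size(); ++i) {
        assert(idx[i] >= 0 && idx[i] < sizes[i]);
        offset = offset * static_cast<std::size_t>(sizes[i])
                 + static_cast<std::size_t>(idx[i]);
    }
    return offset;
}
```

With the sizes {2,2,2} from the example, there are 8 elements, and the coordinate (1,0,1) maps to position 1*4 + 0*2 + 1 = 5.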


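Calling imshow or imwrite on a matrix with three or more dimensions raises an error. It is not yet clear to me what the higher dimensions are useful for; one guess is video capture, with time as the third dimension.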




  • Create a header for an already existing IplImage pointer:

    IplImage* img = cvLoadImage("greatwave.png", 1);
    Mat mtx(img); // convert IplImage* -> Mat 
    


mtx's data pointer is the same as img's data pointer, so this only creates a new header; the pixel data is not copied.



  • create() function:

        M.create(4,4, CV_8UC(2));
        cout << "M = "<< endl << " "  << M << endl << endl;
    

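You cannot initialize the matrix values with this construction. In the run below every element came out as 205, with no way to control it: the values are simply uninitialized memory (205 is 0xCD, the pattern that debug builds of the Microsoft C runtime write into fresh heap memory, so other builds will likely show different values). It will only reallocate its matrix data memory if the new size will not fit into the old one.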


    Mat image(3,4,CV_8UC(3),Scalar(0,200,0));
    cout << "image = " << endl << " " << image << endl << endl;
    image.create(4,4,CV_8UC3); // create() is a member function, not a static one; it reallocates the image's memory
    cout << "image = " << endl << " " << image << endl << endl;



image =
 [0, 200, 0, 0, 200, 0, 0, 200, 0, 0, 200, 0;
  0, 200, 0, 0, 200, 0, 0, 200, 0, 0, 200, 0;
  0, 200, 0, 0, 200, 0, 0, 200, 0, 0, 200, 0]

image =
 [205, 205, 205, 205, 205, 205, 205, 205, 205, 205, 205, 205;
  205, 205, 205, 205, 205, 205, 205, 205, 205, 205, 205, 205;
  205, 205, 205, 205, 205, 205, 205, 205, 205, 205, 205, 205;
  205, 205, 205, 205, 205, 205, 205, 205, 205, 205, 205, 205]




  • MATLAB style initializer: zeros(), ones(), eye(). Specify size and data type to use:

        Mat E = Mat::eye(4, 4, CV_64F);
        cout << "E = " << endl << " " << E << endl << endl;
    
        Mat O = Mat::ones(2, 2, CV_32F);
        cout << "O = " << endl << " " << O << endl << endl;
    
        Mat Z = Mat::zeros(3,3, CV_8UC1);
        cout << "Z = " << endl << " " << Z << endl << endl;
    



  • For small matrices you may use comma separated initializers:

        Mat C = (Mat_<double>(3,3) << 0, -1, 0, -1, 5, -1, 0, -1, 0);
        cout << "C = " << endl << " " << C << endl << endl;
    


class Mat_

Template matrix class derived from Mat.

template<typename _Tp> class Mat_ : public Mat
{
public:
    // ... some specific methods
    //         and
    // no new extra fields
};

    Mat c=(Mat_<float>(3,3)<<0,-1,1,1,0,2,1,3,4);  // compared with the Mat constructor, the template argument simply specifies the element type for us
    Mat b(2,2,CV_32FC(1));




  • Create a new header for an existing Mat object and clone() or copyTo() it.

        Mat RowClone = C.row(1).clone();
        cout << "RowClone = " << endl << " " << RowClone << endl << endl;
    

Note

You can fill out a matrix with random values using the randu() function. You need to give the lower and upper value for the random values:

    Mat R = Mat(3, 2, CV_8UC3);
    randu(R, Scalar::all(0), Scalar::all(255));



Output formatting

  • Default

        cout << "R (default) = " << endl <<        R           << endl << endl;
    


  • Python

        cout << "R (python)  = " << endl << format(R,"python") << endl << endl;
    


  • Comma separated values (CSV)

        cout << "R (csv)     = " << endl << format(R,"csv"   ) << endl << endl;
    

  • Numpy

        cout << "R (numpy)   = " << endl << format(R,"numpy" ) << endl << endl;
    


  • C

        cout << "R (c)       = " << endl << format(R,"C"     ) << endl << endl;
    


Output of other common items

OpenCV offers support for output of other common OpenCV data structures too via the << operator:

  • 2D Point

        Point2f P(5, 1);
        cout << "Point (2D) = " << P << endl << endl;
    


  • 3D Point

        Point3f P3f(2, 6, 7);
        cout << "Point (3D) = " << P3f << endl << endl;
    


  • std::vector via cv::Mat

        vector<float> v;
        v.push_back( (float)CV_PI);   v.push_back(2);    v.push_back(3.01f);
    
        cout << "Vector of floats via Mat = " << Mat(v) << endl << endl;
    



  • std::vector of points

        vector<Point2f> vPoints(20);
        for (size_t i = 0; i < vPoints.size(); ++i)
            vPoints[i] = Point2f((float)(i * 5), (float)(i % 7));
    
        cout << "A vector of 2D Points = " << vPoints << endl << endl;
    

Most of the samples here have been included in a small console application. You can download it from here or in the core section of the cpp samples.





















