Image Storage Formats and Basic Image Processing with OpenCV

Article link: https://blog.csdn.net/sinat_40658206/article/details/116165833

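This article describes the YUV image storage formats in detail, including the planar and packed layouts and the 4:4:4, 4:2:2, and 4:2:0 sampling schemes. It also covers the RGB color model and its common storage formats, and gives the conversion formulas between YUV and RGB. Finally, it walks through basic image operations in OpenCV (matrices, reading and saving images, resizing, rotating, and cropping) with the relevant functions and sample code.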

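1.Purpose

A brief overview of image storage formats and of basic image processing with OpenCV.

2.Image storage formats

2.1.YUV

2.1.1.Introduction

YUV is a family of color spaces used to encode true-color images. The terms Y'UV, YUV, YCbCr, and YPbPr are all loosely referred to as YUV, and their meanings overlap. "Y" is the luminance (Luminance or Luma), that is, the grayscale value; "U" and "V" are the chrominance (Chrominance or Chroma), which describe hue and saturation and so determine the color of a pixel.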

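2.1.2.High-level storage layouts

At a high level, YUV data is stored in one of two layouts: planar or packed.

planar: each of the three components is stored in its own contiguous plane. Taking I420 as an example, the layout is YYYYYYYYUUVV: all Y samples first, then all U, then all V. This layout is easy to parse.
packed: the three components are interleaved. Taking YUY2 as an example, the layout is Y0U0Y1V0 Y2U1Y3V1. This layout takes more work to parse.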
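
To make the difference between the two layouts concrete, here is a minimal sketch that splits a packed YUY2 buffer into three separate planes. The names `Planar422` and `unpackYuy2` are made up for this illustration, and the code assumes an even width and rows with no padding.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Planar 4:2:2 result: one plane per component (hypothetical helper type).
struct Planar422 {
    std::vector<uint8_t> y, u, v;
};

// Splits a packed YUY2 buffer (Y0 U0 Y1 V0 Y2 U1 Y3 V1 ...) into planes.
// Assumption: width is even and rows are tightly packed.
Planar422 unpackYuy2(const std::vector<uint8_t>& packed, size_t width, size_t height)
{
    Planar422 out;
    const size_t pixels = width * height;
    out.y.reserve(pixels);
    out.u.reserve(pixels / 2);
    out.v.reserve(pixels / 2);
    // Every 4 bytes describe 2 pixels that share one U and one V sample.
    for (size_t i = 0; i + 3 < packed.size() && i < pixels * 2; i += 4) {
        out.y.push_back(packed[i]);      // Y0
        out.u.push_back(packed[i + 1]);  // U0
        out.y.push_back(packed[i + 2]);  // Y1
        out.v.push_back(packed[i + 3]);  // V0
    }
    return out;
}
```

After the split, each plane can be walked with a simple linear index, which is why the planar layout is the easier one to parse.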

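2.1.3.Sampling schemes

The sampling scheme determines how many samples are stored per pixel. Three schemes are in common use: 4:4:4, 4:2:2, and 4:2:0, where the ratio describes how the Y, U, and V components are sampled. Since a pixel is displayed from its Y, U, and V components, a full representation stores one Y, one U, and one V for every pixel; this is 4:4:4. The human eye, however, is less sensitive to chrominance than to luminance, so dropping part of the UV information barely affects how we perceive color. To save space, some UV samples are deliberately discarded: when two Y samples share one UV pair the scheme is 4:2:2, and when four Y samples share one UV pair it is 4:2:0.

With one byte per sample, a 256×256 image occupies:
4:4:4: 256×256×3 = 196608 bytes
4:2:2: 256×256×2 = 131072 bytes
4:2:0: 256×256×3/2 = 98304 bytes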
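
The frame-size arithmetic can be written as a small helper. This is only a sketch: the `Subsampling` enum and `yuvFrameBytes` are names invented here, and the code assumes 8-bit samples, even dimensions, and no row padding.

```cpp
#include <cstddef>

// Hypothetical enum naming the three sampling schemes discussed above.
enum class Subsampling { S444, S422, S420 };

// Bytes needed for one frame with 8-bit samples.
// Assumption: even width/height and no row padding.
size_t yuvFrameBytes(size_t width, size_t height, Subsampling s)
{
    const size_t luma = width * height;  // one Y sample per pixel
    switch (s) {
        case Subsampling::S444: return luma * 3;      // full-size U and V planes
        case Subsampling::S422: return luma * 2;      // U and V halved horizontally
        case Subsampling::S420: return luma * 3 / 2;  // U and V halved in both directions
    }
    return 0;
}
```

For a 256×256 image this gives 196608, 131072, and 98304 bytes respectively.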

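2.1.4.YUV storage formats

YU12/I420 and YV12 both use 4:2:0 sampling and are stored in planar layout, so they are referred to as 420p.
NV12 and NV21 are also 4:2:0, but they are not fully planar (the U and V samples are interleaved in a single plane), so they are called 420sp to distinguish them from 420p.

2.1.4.1.YU12/I420

All Y samples are stored first, then the U samples, then the V samples.

YYYYYYYYUUVV

2.1.4.2.YV12

This format is the same as YU12 except that the V samples are stored before the U samples.

YYYYYYYYVVUU

2.1.4.3.NV12

All Y samples are stored first, followed by interleaved U and V samples.

YYYYYYYYUVUV

2.1.4.4.NV21

NV21 relates to NV12 the way YV12 relates to YU12: the only difference is the order of the interleaved chroma samples. NV12 stores U first, NV21 stores V first.

YYYYYYYYVUVU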
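
As an illustration of how 420sp relates to 420p, the sketch below converts an NV12 buffer into I420 by de-interleaving the chroma plane. The function name `nv12ToI420` is made up for this example, and the code assumes even dimensions and tightly packed rows.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Converts NV12 (Y plane + interleaved UV plane) to I420 (Y, U, V planes).
// Assumption: even width/height, no row padding, input holds a full frame.
std::vector<uint8_t> nv12ToI420(const std::vector<uint8_t>& nv12, size_t width, size_t height)
{
    const size_t luma = width * height;
    const size_t chroma = luma / 4;  // samples in each of the U and V planes
    std::vector<uint8_t> i420(luma + 2 * chroma);

    // The Y plane is identical in both formats.
    for (size_t i = 0; i < luma; ++i)
        i420[i] = nv12[i];

    // De-interleave UVUV... into UU... followed by VV...
    for (size_t i = 0; i < chroma; ++i) {
        i420[luma + i] = nv12[luma + 2 * i];               // U
        i420[luma + chroma + i] = nv12[luma + 2 * i + 1];  // V
    }
    return i420;
}
```

Swapping the two assignments in the second loop would handle NV21 input instead, since only the order of the interleaved samples differs.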

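2.1.4.5.YUV422P

The P in the name indicates planar storage. The layout is the same as I420; the only difference is the number of UV samples. In I420 four Y samples share one UV pair, whereas here two Y samples share one UV pair, so there are twice as many UV samples as in I420.

YYYYYYYYUUUUVVVV

2.1.4.6.YUYV/YUY2

This is a 4:2:2 format stored in packed layout.

Y0U0Y1V0 Y2U1Y3V1

2.1.4.7.YVYU

This format is like YUYV except that the U and V samples are swapped, giving the order YVYU.

Y0V0Y1U0 Y2V1Y3U1

2.1.4.8.UYVY

This is also a 4:2:2 packed format; it differs from the two above only in byte order, with each chroma sample placed before the luma sample.

U0Y0V0Y1 U1Y2V1Y3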

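2.2.RGB

2.2.1.Introduction

The RGB color model is an industry color standard. Colors are produced by varying the red (Red), green (Green), and blue (Blue) channels and adding them together; RGB stands for these three channels. The model can represent a wide range of the colors that human vision perceives and is one of the most widely used color systems today.

2.2.2.RGB storage formats

2.2.2.1.RGB16

RGB565 uses 16 bits (2 bytes) per pixel, with 5, 6, and 5 bits for the R, G, and B components respectively.
RGB555 uses 16 bits (2 bytes) per pixel, with 5 bits for each of R, G, and B (the most significant bit is unused).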
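
The 5/6/5 bit split can be sketched as a pair of pack/unpack helpers. The function names are invented for this example, and the code assumes R occupies the most significant bits of the 16-bit value; some pipelines use the opposite order, so check your source before relying on it.

```cpp
#include <cstdint>

// Packs 8-bit R, G, B into one 16-bit RGB565 value.
// Assumption: bit layout RRRRRGGG GGGBBBBB (R in the high bits).
uint16_t packRgb565(uint8_t r, uint8_t g, uint8_t b)
{
    return static_cast<uint16_t>(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

// Expands RGB565 back to 8 bits per channel by replicating the top bits
// into the low bits, so that full scale maps back to 255.
void unpackRgb565(uint16_t p, uint8_t& r, uint8_t& g, uint8_t& b)
{
    const unsigned r5 = (p >> 11) & 0x1F;
    const unsigned g6 = (p >> 5) & 0x3F;
    const unsigned b5 = p & 0x1F;
    r = static_cast<uint8_t>((r5 << 3) | (r5 >> 2));
    g = static_cast<uint8_t>((g6 << 2) | (g6 >> 4));
    b = static_cast<uint8_t>((b5 << 3) | (b5 >> 2));
}
```

Packing discards the low 3 (or 2) bits of each channel, so a pack/unpack round trip is lossy for most values.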

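2.2.2.2.RGB24

RGB24 uses 24 bits (3 bytes) per pixel. Note that in memory the components are laid out in the order BGR BGR BGR ….

2.2.2.3.RGB32

RGB32 uses 32 bits (4 bytes) per pixel. R, G, and B take 8 bits each, stored in the order B, G, R, and the last 8 bits are reserved. Note that in memory the components are laid out in the order BGRA BGRA BGRA ….

ARGB32 is essentially RGB24 with an alpha channel. It differs from RGB32 in that the reserved 8 bits carry the transparency, that is, the alpha value.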

2.3.Converting between YUV and RGB

If both RGB and YUV are scaled to the range [0,255], the commonly used conversion formulas are:

RGB -> YUV:
Y = 0.299R + 0.587G + 0.114B
Cb = U = -0.169R - 0.331G + 0.500B + 128
Cr = V = 0.500R - 0.419G - 0.081B + 128

YUV -> RGB:
R = Y + 1.403 (V-128)
G = Y - 0.343 (U-128) - 0.714 (V-128)
B = Y + 1.770 (U-128)
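
The formulas translate directly into per-pixel code. The sketch below uses the coefficients exactly as listed; the `Rgb` and `Yuv` structs and the function names are made up for illustration, and rounding to nearest is a choice made here, not something the formulas prescribe.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

struct Rgb { uint8_t r, g, b; };  // hypothetical pixel types for this sketch
struct Yuv { uint8_t y, u, v; };

// Rounds to nearest and clamps into the valid 8-bit range.
static uint8_t clampToByte(double x)
{
    return static_cast<uint8_t>(std::lround(std::min(255.0, std::max(0.0, x))));
}

// RGB -> YUV with all components in [0, 255].
Yuv rgbToYuv(Rgb p)
{
    const double y = 0.299 * p.r + 0.587 * p.g + 0.114 * p.b;
    const double u = -0.169 * p.r - 0.331 * p.g + 0.500 * p.b + 128;
    const double v = 0.500 * p.r - 0.419 * p.g - 0.081 * p.b + 128;
    return { clampToByte(y), clampToByte(u), clampToByte(v) };
}

// YUV -> RGB with all components in [0, 255].
Rgb yuvToRgb(Yuv p)
{
    const double r = p.y + 1.403 * (p.v - 128);
    const double g = p.y - 0.343 * (p.u - 128) - 0.714 * (p.v - 128);
    const double b = p.y + 1.770 * (p.u - 128);
    return { clampToByte(r), clampToByte(g), clampToByte(b) };
}
```

Because both directions round and clamp, a round trip can be off by a small amount per channel; neutral grays (U = V = 128) pass through unchanged.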

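YUV420p to RGB24
The RGB24 output here uses the component order RGB RGB …, to suit the yuvplayer program.

A YUV420p frame takes width * height * 3 / 2 bytes and an RGB24 frame takes width * height * 3 bytes, so YUV420 storage halves the size.

C example (adapted from https://blog.csdn.net/bemy1008/article/details/88766647):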

#include <stddef.h>
#include <stdint.h>

/* Clamp an intermediate value to the valid 8-bit range. */
static uint8_t clamp_u8(int x)
{
    return (uint8_t)(x < 0 ? 0 : (x > 255 ? 255 : x));
}

/* Convert a planar YUV420 (I420) frame to packed RGB24 in R, G, B byte order.
   width and height are assumed to be even. */
void YUV4202RGB24(const uint8_t* yuv, uint8_t* rgb, size_t height, size_t width)
{
    size_t size = height * width;
    const uint8_t* yFrame = yuv;
    const uint8_t* uFrame = yuv + size;
    const uint8_t* vFrame = uFrame + size / 4;
    size_t offset = 0;
    for (size_t i = 0; i < height; i++)
    {
        for (size_t j = 0; j < width; j++)
        {
            size_t yIdx = i * width + j;
            /* One U/V sample covers a 2x2 block of pixels, and each chroma
               row holds width / 2 samples. */
            size_t uvIdx = (i / 2) * (width / 2) + j / 2;

            int y = yFrame[yIdx] - 16;
            int u = uFrame[uvIdx] - 128;
            int v = vFrame[uvIdx] - 128;

            rgb[offset++] = clamp_u8((int)(y + 1.370805 * v));               /* R */
            rgb[offset++] = clamp_u8((int)(y - 0.69825 * v - 0.33557 * u));  /* G */
            rgb[offset++] = clamp_u8((int)(y + 1.733221 * u));               /* B */
        }
    }
}

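Another implementation, using the [0,255] formulas given earlier in this section:

void YUV4202RGB24(const uint8_t* yuv, uint8_t* rgb, size_t height, size_t width)
{
    const uint8_t* yFrame = yuv;
    const uint8_t* uFrame = yuv + height * width;
    const uint8_t* vFrame = uFrame + height * width / 4;
    for (size_t h = 0; h < height; ++h) {
        for (size_t w = 0; w < width; ++w) {
            /* Chroma planes are width / 2 wide and height / 2 tall. */
            size_t uvIdx = (h / 2) * (width / 2) + w / 2;
            float y = yFrame[w + width * h];
            float u = uFrame[uvIdx];
            float v = vFrame[uvIdx];

            float b = y + 1.770f * (u - 128);
            rgb[(h * width + w) * 3 + 2] = (uint8_t)(b > 255 ? 255 : (b < 0 ? 0 : b));

            float g = y - 0.343f * (u - 128) - 0.714f * (v - 128);
            rgb[(h * width + w) * 3 + 1] = (uint8_t)(g > 255 ? 255 : (g < 0 ? 0 : g));

            float r = y + 1.403f * (v - 128);
            rgb[(h * width + w) * 3] = (uint8_t)(r > 255 ? 255 : (r < 0 ? 0 : r));
        }
    }
}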

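3.Image processing with OpenCV

Only the commonly used operations are covered.

3.1.Matrix

Some basic matrix knowledge is needed before working with OpenCV.
Reference (Baidu Baike): https://baike.baidu.com/item/%E7%9F%A9%E9%98%B5/18069

3.1.1.Introduction

In mathematics, a matrix is a rectangular array of complex or real numbers.
An array of m × n numbers a_ij arranged in m rows and n columns is called an m-by-n matrix, or m × n matrix for short, written:

$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}$$

These m × n numbers are called the elements (or entries) of A. The number a_ij in row i and column j is called the (i,j) entry of A. A matrix whose (i,j) entry is a_ij can be written (a_ij) or (a_ij)m × n, and an m × n matrix A is also written Amn.
A matrix whose elements are real is called a real matrix; one whose elements are complex is called a complex matrix. A matrix with n rows and n columns is called a matrix of order n, or a square matrix of order n.

3.1.2.Matrix transpose

The matrix obtained by swapping the rows and columns of A is called the transpose of A, written A^T; the operation is called transposition.

The transpose satisfies the following rules:
(A^T)^T = A
(λA)^T = λA^T
(AB)^T = B^T A^T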
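
The transpose rules are easy to check numerically. The following sketch uses a plain vector-of-vectors matrix; `Matrix`, `transposeOf`, and `multiply` are names chosen for this example and have nothing to do with OpenCV's `cv::Mat`.

```cpp
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<int>>;  // row-major, hypothetical helper type

// Returns the transpose: element (i, j) of the result is element (j, i) of a.
Matrix transposeOf(const Matrix& a)
{
    if (a.empty()) return {};
    Matrix t(a[0].size(), std::vector<int>(a.size()));
    for (size_t i = 0; i < a.size(); ++i)
        for (size_t j = 0; j < a[i].size(); ++j)
            t[j][i] = a[i][j];
    return t;
}

// Plain matrix product; assumes a has as many columns as b has rows.
Matrix multiply(const Matrix& a, const Matrix& b)
{
    if (a.empty() || b.empty()) return {};
    Matrix c(a.size(), std::vector<int>(b[0].size(), 0));
    for (size_t i = 0; i < a.size(); ++i)
        for (size_t k = 0; k < b.size(); ++k)
            for (size_t j = 0; j < b[0].size(); ++j)
                c[i][j] += a[i][k] * b[k][j];
    return c;
}
```

With a 2 × 3 matrix A and a 3 × 2 matrix B, `transposeOf(multiply(A, B))` and `multiply(transposeOf(B), transposeOf(A))` produce the same 2 × 2 result, matching the third rule.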

3.2.Reading and saving images

https://docs.opencv.org/2.4/modules/highgui/doc/reading_and_writing_images_and_video.html?highlight=imread#cv2.imread

3.2.1.cv::imread

Reads image data from a file.

Mat imread(const string& filename, int flags = IMREAD_COLOR);
Describe:
Loads an image from a file.
Parameters:
	filename - Name of file to be loaded.
	flags - 
		Flags specifying the color type of a loaded image
		1. >0 Return a 3-channel color image.
		2. =0 Return a grayscale image.
		3. <0 Return the loaded image as is (with alpha channel).
Supported format:
	Windows bitmaps - *.bmp, *.dib (always supported)
	JPEG files - *.jpeg, *.jpg, *.jpe (see the Notes section)
	JPEG 2000 files - *.jp2 (see the Notes section)
	Portable Network Graphics - *.png (see the Notes section)
	Portable image format - *.pbm, *.pgm, *.ppm (always supported)
	Sun rasters - *.sr, *.ras (always supported)
	TIFF files - *.tiff, *.tif (see the Notes section)
Returns:
	The function imread loads an image from the specified file and returns it. If the image cannot be read (because of missing file, improper permissions, unsupported or invalid format), the function returns an empty matrix ( Mat::data==NULL )

3.2.2.cv::imdecode

Reads image data from a memory buffer; useful for decoding binary image data received over the network.

Mat imdecode(InputArray buf, int flags);
Mat imdecode(InputArray buf, int flags, Mat* dst);
Describe:
	Reads an image from a buffer in memory.
Parameters:
	buf - Input array or vector of bytes.
	flags - The same flags as in imread().
	dst - The optional output placeholder for the decoded matrix. It can save the image reallocations when the function is called repeatedly for images of the same size.
Returns:
	If the buffer is too short or contains invalid data, the empty matrix/image is returned.

3.2.3.cv::imwrite

Saves image data to a file.

bool imwrite(const String& filename, InputArray img,const std::vector<int>& params = std::vector<int>());
Describe:
Saves an image to a specified file.
Parameters:
	filename  - Name of the file.
	img - Image to be saved.
	params - 
		For JPEG, it can be a quality ( CV_IMWRITE_JPEG_QUALITY ) from 0 to 100 (the higher is the better). Default value is 95.
		For PNG, it can be the compression level ( CV_IMWRITE_PNG_COMPRESSION ) from 0 to 9. A higher value means a smaller size and longer compression time. Default value is 3.
		For PPM, PGM, or PBM, it can be a binary format flag ( CV_IMWRITE_PXM_BINARY ), 0 or 1. Default value is 1.

3.2.4.cv::imencode

Encodes an image and stores the result in a memory buffer.

bool imencode(const string& ext, InputArray img, std::vector<uchar>& buf,const std::vector<int>& params = std::vector<int>());
Describe:
	Encodes an image into a memory buffer.
Parameters:
	ext - File extension that defines the output format.
	img - Image to be written.
	buf - Output buffer resized to fit the compressed image.
	params - Format-specific parameters. See imwrite().

3.2.5.Example

#include <opencv2/opencv.hpp>
#include <string>
#include <vector>

int main()
{
    // imread and imwrite
    cv::Mat image = cv::imread("./frame1.jpeg");
    if (image.empty())
        return 1;  // imread returns an empty Mat when the file cannot be read
    cv::imwrite("C:\\Users\\lenovo\\Desktop\\1.jpeg", image);

    // imencode and imdecode
    std::vector<uchar> vecBuf;
    bool result = cv::imencode(".jpg", image, vecBuf);
    cv::Mat deImg = cv::imdecode(vecBuf, 1);

    // std::vector<uchar> convert to std::string
    std::string strBuf(vecBuf.begin(), vecBuf.end());
    // std::string convert to std::vector<uchar>
    vecBuf = std::vector<uchar>(strBuf.begin(), strBuf.end());

    return (result && !deImg.empty()) ? 0 : 1;
}

3.3.Resizing images

3.3.1.cv::resize

void resize(InputArray src, OutputArray dst, Size dsize, double fx = 0,double fy = 0, int interpolation = INTER_LINEAR);
Describe:
	Resizes an image.
Parameters:
	src - input image.
	dst - output image; it has the size dsize (when it is non-zero) or the size computed from src.size(), fx, and fy; the type of dst is the same as of src.
	dsize - output image size; if it equals zero, it is computed as:
		dsize = Size(round(fx*src.cols), round(fy*src.rows))
		Either dsize or both fx and fy must be non-zero.
	fx - scale factor along the horizontal axis; when it equals 0,it is computed as:
		(double)dsize.width/src.cols
	fy - scale factor along the vertical axis; when it equals 0, it is computed as:
		(double)dsize.height/src.rows
	interpolation - interpolation method:
	INTER_NEAREST - a nearest-neighbor interpolation
	INTER_LINEAR - a bilinear interpolation (used by default)
	INTER_AREA - resampling using pixel area relation. It may be a preferred method for image decimation, as it gives moiré-free results. But when the image is zoomed, it is similar to the INTER_NEAREST method.
	INTER_CUBIC - a bicubic interpolation over 4x4 pixel neighborhood
	INTER_LANCZOS4 - a Lanczos interpolation over 8x8 pixel neighborhood

3.3.2.Example

cv::Mat image = cv::imread("./frame1.jpeg");
cv::Mat resizeImg;
// dsize is left empty, so the output size is computed from fx and fy (half size here).
cv::resize(image, resizeImg, cv::Size(), 0.5, 0.5);
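
The interplay between dsize, fx, and fy can be sketched without OpenCV. The `Extent` struct and `resolveOutputSize` below are hypothetical names for this illustration; the function only mirrors the rule quoted in the parameter list, and the way it rounds exact .5 ties may differ from what OpenCV does internally.

```cpp
#include <cmath>

struct Extent { int width; int height; };  // stand-in for cv::Size in this sketch

// Uses dsize when it is non-zero; otherwise derives the size from the
// scale factors as round(fx * cols), round(fy * rows).
Extent resolveOutputSize(Extent src, Extent dsize, double fx, double fy)
{
    if (dsize.width > 0 && dsize.height > 0)
        return dsize;
    return { static_cast<int>(std::lround(fx * src.width)),
             static_cast<int>(std::lround(fy * src.height)) };
}
```

For a 640×480 source, an empty dsize with fx = fy = 0.5 resolves to 320×240, while an explicit dsize of 100×50 is used as-is and the scale factors are ignored.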

3.4.Rotating images

3.4.1.Rotating by fixed angles

The fixed angles are 90°, 180°, and 270°.

3.4.1.1.cv::flip

Flips an image (around the x-axis or the y-axis).

void flip(InputArray src, OutputArray dst, int flipCode);
Describe:
	Flips a 2D array around vertical, horizontal, or both axes.
Parameters:
	src - input array.
	dst - output array of the same size and type as src.
	flipCode - a flag to specify how to flip the array; 0 means flipping around the x-axis and positive value (for example, 1) means flipping around y-axis. Negative value (for example, -1) means flipping around both axes (see the discussion below for the formulas).

3.4.1.2.cv::transpose

Transposes a matrix.

void transpose(InputArray src, OutputArray dst);
Describe:
	Transposes a matrix.
Parameters:
	src - input array.
	dst - output array of the same type as src.

3.4.1.3.Example

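cv::Mat image = cv::imread("./frame1.jpeg");
cv::Mat rot90, rot180, rot270;

// 90° clockwise: transpose, then flip around the y-axis
cv::transpose(image, rot90);
cv::flip(rot90, rot90, 1);
// 180°: flip around both axes
cv::flip(image, rot180, -1);
// 90° counter-clockwise (270° clockwise): transpose, then flip around the x-axis
cv::transpose(image, rot270);
cv::flip(rot270, rot270, 0);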
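
Why transpose followed by a flip yields a 90° rotation is easier to see on a tiny grid. The sketch below reimplements both steps on a vector-of-vectors; `Grid`, `transposeGrid`, `flipAroundY`, and `rotate90Clockwise` are names invented here and are not OpenCV functions.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

using Grid = std::vector<std::vector<int>>;  // row-major toy image

// Swaps rows and columns, like cv::transpose.
Grid transposeGrid(const Grid& g)
{
    if (g.empty()) return {};
    Grid t(g[0].size(), std::vector<int>(g.size()));
    for (size_t i = 0; i < g.size(); ++i)
        for (size_t j = 0; j < g[i].size(); ++j)
            t[j][i] = g[i][j];
    return t;
}

// Mirrors each row left-to-right, like cv::flip with a positive flipCode.
Grid flipAroundY(Grid g)
{
    for (auto& row : g)
        std::reverse(row.begin(), row.end());
    return g;
}

// 90° clockwise rotation expressed as transpose + flip around the y-axis.
Grid rotate90Clockwise(const Grid& g)
{
    return flipAroundY(transposeGrid(g));
}
```

Applied to the 2×3 grid {{1,2,3},{4,5,6}}, the transpose gives {{1,4},{2,5},{3,6}} and the flip turns it into {{4,1},{5,2},{6,3}}, which is the original rotated a quarter turn clockwise.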

3.4.2.Rotating by an arbitrary angle

3.4.2.1.cv::copyMakeBorder

Adds a border around an image.

void copyMakeBorder(InputArray src, OutputArray dst, int top, int bottom, int left, int right, int borderType, const Scalar& value = Scalar());
Describe:
	Forms a border around an image.
Parameters:
	src - Source image.
	dst - Destination image of the same type as src and the size Size(src.cols+left+right, src.rows+top+bottom).
	top, bottom, left, right - Parameters specifying how many pixels in each direction from the source image rectangle to extrapolate. For example, top=1, bottom=1, left=1, right=1 mean that 1 pixel-wide border needs to be built.
	borderType - Border type, one of the BORDER_* , except for BORDER_TRANSPARENT and BORDER_ISOLATED .
	value - Border value if borderType==BORDER_CONSTANT.

3.4.2.2.cv::getRotationMatrix2D

Computes a rotation matrix.

Mat getRotationMatrix2D(Point2f center, double angle, double scale);
Describe:
	Calculates an affine matrix of 2D rotation.
Parameters:
	center - Center of the rotation in the source image.
	angle - Rotation angle in degrees. Positive values mean counter-clockwise rotation (the coordinate origin is assumed to be the top-left corner).
	scale - Isotropic scale factor.
Returns:
	The output affine transformation, a 2x3 floating-point matrix (Mat).

3.4.2.3.cv::warpAffine

Applies an affine transformation to an image.

void warpAffine(InputArray src, OutputArray dst, InputArray M, Size dsize, int flags = INTER_LINEAR, int borderMode = BORDER_CONSTANT, const Scalar& borderValue = Scalar());
Describe:
	Applies an affine transformation to an image.
Parameters:
	src - input image.
	dst - output image that has the size dsize and the same type as src.
	M - 2 x 3 transformation matrix.
	dsize - size of the output image.
	flags - combination of interpolation methods (see resize()) and the optional flag WARP_INVERSE_MAP that means that M is the inverse transformation (dst->src).
	borderMode - pixel extrapolation method (see borderInterpolate()); when borderMode=BORDER_TRANSPARENT , it means that the pixels in the destination image corresponding to the “outliers” in the source image are not modified by the function.
	borderValue - value used in case of a constant border; by default, it is 0.

3.4.2.4.Example

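#include <algorithm>
#include <cmath>
#include <opencv2/opencv.hpp>

// Rotates inputImg by angle degrees (counter-clockwise) about its center,
// padding with white first so that the corners are not cut off.
void RotateImage(const cv::Mat& inputImg, cv::Mat& outputImg, float angle)
{
    CV_Assert(!inputImg.empty());

    float radian = (float)(angle / 180.0 * CV_PI);
    float sinVal = std::fabs(std::sin(radian));
    float cosVal = std::fabs(std::cos(radian));

    // Bounding box of the rotated image.
    cv::Size targetSize((int)(inputImg.cols * cosVal + inputImg.rows * sinVal), (int)(inputImg.cols * sinVal + inputImg.rows * cosVal));

    // Padding per side, clamped at 0 because copyMakeBorder rejects negative borders
    // (the bounding box can be narrower or shorter than the source image).
    int dx = std::max(0, (targetSize.width - inputImg.cols) / 2);
    int dy = std::max(0, (targetSize.height - inputImg.rows) / 2);

    cv::Mat padded;
    cv::copyMakeBorder(inputImg, padded, dy, dy, dx, dx, cv::BORDER_CONSTANT, cv::Scalar(255, 255, 255));

    cv::Point2f center(padded.cols / 2.0f, padded.rows / 2.0f);
    cv::Mat affine_matrix = cv::getRotationMatrix2D(center, angle, 1.0);
    // Fill the uncovered corners with the same white as the padding.
    cv::warpAffine(padded, outputImg, affine_matrix, padded.size(), cv::INTER_LINEAR, cv::BORDER_CONSTANT, cv::Scalar(255, 255, 255));
}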
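
The target-size computation in the example is the bounding box of the rotated rectangle. As a standalone sketch (the `Bounds` struct and `rotatedBounds` are hypothetical names, and rounding to nearest is used here instead of the truncation in the example):

```cpp
#include <cmath>

struct Bounds { int width; int height; };  // stand-in for cv::Size in this sketch

// Size of the axis-aligned box that encloses a cols x rows image
// after rotating it by angleDeg degrees about its center.
Bounds rotatedBounds(int cols, int rows, double angleDeg)
{
    const double pi = 3.14159265358979323846;
    const double rad = angleDeg * pi / 180.0;
    const double s = std::fabs(std::sin(rad));
    const double c = std::fabs(std::cos(rad));
    return { static_cast<int>(std::lround(cols * c + rows * s)),
             static_cast<int>(std::lround(cols * s + rows * c)) };
}
```

A 200×100 image keeps its size at 0°, becomes 100×200 at 90°, and needs a 212×212 box at 45°. At 90° the box is narrower than the source, which is the case where a padding amount derived from it goes negative.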

3.5.Cropping images

3.5.1.cv::Mat::operator()

Extracts a submatrix.

cv::Mat image = cv::imread("./frame1.jpeg");
cv::Mat cropImage = image(cv::Rect(0, 0, image.cols / 2, image.rows));
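
To show what extracting a rectangle means at the buffer level, here is a sketch that copies a region out of a single-channel, row-major image. `cropGray` is a made-up helper, not an OpenCV call, and it assumes the rectangle lies entirely inside the image.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Copies the rectangle with top-left corner (x, y) and size w x h out of a
// single-channel, row-major image that is imgWidth pixels wide.
// Assumption: the rectangle is fully inside the image.
std::vector<uint8_t> cropGray(const std::vector<uint8_t>& img, size_t imgWidth,
                              size_t x, size_t y, size_t w, size_t h)
{
    std::vector<uint8_t> out;
    out.reserve(w * h);
    for (size_t row = 0; row < h; ++row) {
        // Start of the requested span within source row (y + row).
        const uint8_t* p = img.data() + (y + row) * imgWidth + x;
        out.insert(out.end(), p, p + w);
    }
    return out;
}
```

This version always copies the pixels. OpenCV's `cv::Mat::operator()` instead returns a header that refers to the same pixel data as the source, so `clone()` is needed when an independent copy is wanted.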