The YUV image format explained, and capturing a YUV image from a USB camera with FFmpeg


z_muyangren


Category: Audio/Video and Image Processing

Article link: https://blog.csdn.net/z_muyangren/article/details/89684693


Environment: Ubuntu 16.04
FFmpeg: 4.1

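I. The YUV format
The luminance signal is usually called Y, and the chrominance signal consists of two independent components. Depending on the color system and format, the two chroma components are called U and V, Pb and Pr, or Cb and Cr. The names come from different encoding schemes, but the underlying concept is essentially the same. On DVD, chroma is stored as Cb and Cr (C for chroma, b for blue, r for red).
1. What 4:4:4, 4:2:2 and 4:2:0 mean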
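Video engineers found that the human eye is less sensitive to chroma than to luma. Video storage therefore does not need to keep the full chroma signal, which saves space.
In MPEG-2 (the compression format used on DVD), the Y, Cb and Cr signals are stored separately (which is also why component video needs three cables). Y is the black-and-white signal and is stored at full resolution. Because the eye is less sensitive to color information, the chroma signals are not stored at full resolution.
The format with the highest chroma resolution is 4:4:4: for every 4 Y samples there are 4 Cb samples and 4 Cr samples, so chroma has the same resolution as luma. This format is used mainly inside video processing equipment, to avoid losing picture quality during processing. When the image is written to a master tape, chroma is usually reduced to 4:2:2, where every 4 Y samples share 2 Cb and 2 Cr samples (half the horizontal chroma resolution). In 4:2:0, chroma is halved both horizontally and vertically, so each 2x2 block of Y samples shares one Cb and one Cr sample.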
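The introduction above and the figure below come from https://www.cnblogs.com/ALittleDust/p/5935983.html. The figure is intuitive, but when images are processed in code, the buffer is not laid out the way the figure draws it.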
[Figure: chroma subsampling diagram from the article cited above; image not included]
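To make the storage saving concrete, here is a minimal sketch that computes how many bytes one frame occupies under each scheme. It assumes a planar layout with 8 bits per sample (a full Y plane followed by a U plane and a V plane); the names `Subsampling` and `yuv_frame_size` are made up for this illustration and are not part of FFmpeg.

```c
#include <assert.h>
#include <stddef.h>

/* Chroma subsampling schemes discussed above (planar layout, 8 bits per sample). */
typedef enum { SUB_444, SUB_422, SUB_420 } Subsampling;

/* Bytes needed for one planar YUV frame: a full-resolution Y plane
 * followed by the U (Cb) and V (Cr) planes. Odd dimensions round up. */
static size_t yuv_frame_size(int width, int height, Subsampling s)
{
    assert(width > 0 && height > 0);
    size_t cw = (size_t)width;   /* chroma plane width  */
    size_t ch = (size_t)height;  /* chroma plane height */
    if (s == SUB_422 || s == SUB_420)
        cw = ((size_t)width + 1) / 2;   /* chroma halved horizontally */
    if (s == SUB_420)
        ch = ((size_t)height + 1) / 2;  /* chroma also halved vertically */
    return (size_t)width * (size_t)height + 2 * cw * ch;
}
```

For a 640x480 frame this gives 921600 bytes for 4:4:4, 614400 bytes for 4:2:2 and 460800 bytes for 4:2:0, that is 3, 2 and 1.5 bytes per pixel.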
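II. The sws_scale function in FFmpeg
libswscale can do three things in a single call: 1. color space (pixel format) conversion; 2. resolution scaling; 3. filtering of the image before and after scaling. It has three core functions, described below: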

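/*
  int srcW, int srcH, enum AVPixelFormat srcFormat: describe the input image (width, height, pixel format).
  int dstW, int dstH, enum AVPixelFormat dstFormat: describe the output image (width, height, pixel format).
  int flags: selects the scaling algorithm, e.g. SWS_BILINEAR (mainly matters when the input and output sizes differ).
  SwsFilter *srcFilter, SwsFilter *dstFilter: filters applied to the input / output image; pass NULL for no filtering.
  const double *param: extra parameters for certain scaling algorithms; pass NULL for the defaults.
  Returns a SwsContext describing the conversion, or NULL on failure.
*/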
struct SwsContext *sws_getContext(int srcW, int srcH, enum AVPixelFormat srcFormat,
    int dstW, int dstH, enum AVPixelFormat dstFormat,
    int flags, SwsFilter *srcFilter, SwsFilter *dstFilter, const double *param);

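/*
  struct SwsContext *c: the context returned by sws_getContext above.
  const uint8_t *const srcSlice[], const int srcStride[]: the input image (one data pointer per plane for the
      region being processed, and the number of bytes per row of each plane).
      The stride is the distance in bytes from the start of one row to the start of the next.
      Stride and width are not necessarily equal, because:
      1. Rows may be padded for alignment, so stride = width + N.
      2. In packed formats the channels of each pixel are interleaved. In RGB24, for example, each pixel
         takes 3 consecutive bytes, so the next row starts at least 3*width bytes later.
      srcSlice and srcStride have the same number of entries, which is determined by srcFormat.
      (w, h: image width and height; s: row stride of one full-width 8-bit plane)
          csp       planes   width           stride          height
          YUV420    3        w, w/2, w/2     s, s/2, s/2     h, h/2, h/2
          YUYV      1        w, w/2, w/2     2s, 0, 0        h, h, h
          NV12      2        w, w/2, w/2     s, s, 0         h, h/2
          RGB24     1        w, w,   w       3s, 0, 0        h, 0, 0
  int srcSliceY, int srcSliceH: the region of the input image to process. srcSliceY is the first row and
      srcSliceH is the number of rows. With srcSliceY=0 and srcSliceH=height the whole image is processed in one call.
      The slice interface lets an image be processed in pieces, for example rows [0, h/2-1] and then rows [h/2, h-1].
      This is often described as a way to split work across threads. Note, however, that the FFmpeg documentation
      requires the slices given to one context to arrive in sequential order (top to bottom or bottom to top),
      so one context should not be fed different slices from several threads at the same time.
  uint8_t *const dst[], const int dstStride[]: the output image (one data pointer per plane, and the number of
      bytes per row of each plane).
*/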
int sws_scale(struct SwsContext *c, const uint8_t *const srcSlice[], const int srcStride[],
    int srcSliceY, int srcSliceH, uint8_t *const dst[], const int dstStride[]);
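The difference between stride and width described in the comment above is easiest to see when a padded plane has to be copied into a tightly packed buffer. The following is a small sketch in plain C with no FFmpeg calls; `copy_plane_tight` is a hypothetical helper name chosen for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy one image plane row by row from a padded source (stride >= row_bytes)
 * into a tightly packed destination (destination stride == row_bytes).
 * row_bytes is the number of meaningful bytes per row, e.g. width for an
 * 8-bit Y plane or 3 * width for packed RGB24. */
static void copy_plane_tight(uint8_t *dst, const uint8_t *src,
                             int src_stride, int row_bytes, int rows)
{
    assert(row_bytes > 0 && rows > 0 && src_stride >= row_bytes);
    for (int y = 0; y < rows; y++) {
        /* Row y starts at y * stride in the source, not at y * row_bytes. */
        memcpy(dst + (size_t)y * row_bytes,
               src + (size_t)y * src_stride,
               (size_t)row_bytes);
    }
}
```

A single `memcpy` of `row_bytes * rows` bytes would only be correct when the stride equals the row size; otherwise the padding bytes would end up inside the image.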

/* Free the context and the resources it holds */
void sws_freeContext(struct SwsContext *swsContext);
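The plane counts in the table above (3 planes for YUV420, 2 for NV12) translate into different byte offsets for the same chroma sample. As a rough illustration, assuming tightly packed frames with even width and height and 8 bits per sample, the offsets can be computed as follows. The function names are hypothetical and are not FFmpeg APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Byte offset of the U (Cb) sample covering pixel (x, y) in a tightly packed
 * planar 4:2:0 frame stored as Y plane, then U plane, then V plane. */
static size_t i420_u_offset(int w, int h, int x, int y)
{
    assert(w > 0 && h > 0 && w % 2 == 0 && h % 2 == 0);
    assert(x >= 0 && x < w && y >= 0 && y < h);
    size_t y_size = (size_t)w * (size_t)h;
    size_t cw = (size_t)w / 2;              /* chroma row length in bytes */
    return y_size + (size_t)(y / 2) * cw + (size_t)(x / 2);
}

/* Matching V (Cr) sample: the V plane follows the (w/2 x h/2) U plane. */
static size_t i420_v_offset(int w, int h, int x, int y)
{
    return i420_u_offset(w, h, x, y) + ((size_t)w / 2) * ((size_t)h / 2);
}

/* Same U sample in NV12: Y plane, then one plane of interleaved U,V pairs,
 * so each chroma row is w bytes long and V sits right after U. */
static size_t nv12_u_offset(int w, int h, int x, int y)
{
    assert(w > 0 && h > 0 && w % 2 == 0 && h % 2 == 0);
    assert(x >= 0 && x < w && y >= 0 && y < h);
    size_t y_size = (size_t)w * (size_t)h;
    return y_size + (size_t)(y / 2) * (size_t)w + 2 * (size_t)(x / 2);
}

static size_t nv12_v_offset(int w, int h, int x, int y)
{
    return nv12_u_offset(w, h, x, y) + 1;
}
```

Both layouts use the same total number of bytes; only the position of the chroma samples differs, which is why the plane count and stride columns in the table are not the same for the two formats.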

III. Capturing a YUV image from the device

#include <stdlib.h>
#include <stdio.h>
#include <string.h>   /* memset, memcpy */
#include <errno.h>

#include "libavdevice/avdevice.h"
#include "libavutil/pixfmt.h"
#include "libavutil/imgutils.h"   /* av_image_alloc */
#include "libswscale/swscale.h"
#include "libavcodec/avcodec.h"
#include "libavformat/avformat.h"
#include "libavutil/time.h"
#include "libswresample/swresample.h"

int main()
{
    AVFormatContext *     pFormatCtx = NULL;
    AVCodecContext *      pVCodecCtx = NULL;
    AVCodec *             pVCodec = NULL;
    AVCodecContext *      pACodecCtx = NULL;
    AVCodec *             pACodec = NULL;
    AVFrame  *            pFrame = NULL;
    AVFrame  *            pFrameRGB = NULL;
    AVPacket              packet;
    enum AVPixelFormat    pixFormat = 0;
    struct SwsContext *   pSwsCtx = NULL;

    int videoindex = -1;
    int audioindex = 0;
    int sampleRate = 0;
    int sampleSize = 0;
    int channel = 0;
    int gotPicture = 0;
    int i = 0;
    int ret = 0;

    enum AVPixelFormat dst_pix_fmt = AV_PIX_FMT_YUV422P;
    const char *dst_size = NULL;  
    const char *src_size = NULL;  
    uint8_t *src_data[4];   
    uint8_t *dst_data[4];  
    int src_linesize[4];  
    int dst_linesize[4];  
    int src_bufsize = 0;  
    int dst_bufsize = 0;  
    int src_w = 0;  
    int src_h = 0;  
    int dst_w = 0;
    int dst_h = 0;
    
    const char* out_path = "/home/mo/project_sf/ffmpeg_devicegetyuv/out.yuv";

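    // av_register_all() is deprecated since FFmpeg 4.0 and is no longer needed;
    // avformat_network_init() is not needed for a local capture device.
    pFormatCtx = avformat_alloc_context();

    //(1) register capture devices (needed for the video4linux2 input format)
    avdevice_register_all();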

    //(2) look up the video4linux2 input format
    AVInputFormat *ifmt = av_find_input_format("video4linux2");
    if(ifmt != NULL)
    {
        printf("device name is %s.\n", ifmt->name);
    }
    else
    {
        printf("video4linux2 input format not found!\n");
        avformat_free_context(pFormatCtx);
        return -1;
    }

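    //(3) open device
    ret = avformat_open_input(&pFormatCtx, "/dev/video0", ifmt, NULL);
    if(0 == ret)
    {
        printf("open input device success!\n");
    }
    else
    {
        // on failure avformat_open_input frees the context itself
        printf("open input device failed!\n");
        return -1;
    }
    //av_dump_format(pFormatCtx, 0, "/dev/video0", 0);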

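    //(4) find the video stream
    // Note: AVStream.codec is deprecated in FFmpeg 4.1 (and removed in 5.0);
    // newer code should read pFormatCtx->streams[i]->codecpar instead.
    for(unsigned int i = 0; i < pFormatCtx->nb_streams; i++)
    {
        if(pFormatCtx->streams[i]->codec->codec_type == AVMEDIA_TYPE_VIDEO)
        {
            videoindex = i;
            printf("video index is %d!\n", videoindex);

            //(5) find the decoder
            pVCodecCtx = pFormatCtx->streams[videoindex]->codec;
            pVCodec = avcodec_find_decoder(pVCodecCtx->codec_id);
            break;
        }
    }
    if(videoindex < 0)
    {
        // without this check pVCodecCtx would be dereferenced while still NULL
        printf("no video stream found!\n");
        avformat_close_input(&pFormatCtx);
        return -1;
    }
    printf("picture width   =  %d \n", pVCodecCtx->width);
    printf("picture height  =  %d \n", pVCodecCtx->height);
    printf("Pixel   Format  =  %d \n", pVCodecCtx->pix_fmt);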

    //(6) allocate a frame (not used by the raw-capture path below)
    AVFrame *frame_yuv = av_frame_alloc();
    if(!frame_yuv)
    {
        printf("av_frame_alloc error!\n");
        avformat_close_input(&pFormatCtx);
        return -1;
    }

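    //(7) read one packet (for a raw-video device, one packet holds one frame)
    memset(&packet, 0, sizeof(AVPacket));
    ret = av_read_frame(pFormatCtx, &packet);
    if(ret < 0)
    {
        printf("av_read_frame failed!\n");
        av_frame_free(&frame_yuv);
        avformat_close_input(&pFormatCtx);
        return -1;
    }
    printf("packet size is %d.\n", packet.size);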

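    //(8) convert the frame and write it to a file
    // This path assumes the camera delivers raw video (for example YUYV422);
    // a compressed stream such as MJPEG would have to be decoded first.
    FILE *fp = fopen("out.yuv", "wb");
    if(NULL == fp)
    {
        printf("open out.yuv failed!\n");
        av_frame_free(&frame_yuv);
        av_packet_unref(&packet);
        avformat_close_input(&pFormatCtx);
        return -1;
    }

    dst_w = pVCodecCtx->width;
    dst_h = pVCodecCtx->height;
    //dst_pix_fmt = pVCodecCtx->pix_fmt;
    pSwsCtx = sws_getContext(pVCodecCtx->width, pVCodecCtx->height, pVCodecCtx->pix_fmt, dst_w, dst_h, dst_pix_fmt, SWS_BILINEAR, NULL, NULL, NULL);
    // align = 1 keeps the source rows tightly packed, matching the layout of the raw packet
    src_bufsize = av_image_alloc(src_data, src_linesize, pVCodecCtx->width, pVCodecCtx->height, pVCodecCtx->pix_fmt, 1);
    dst_bufsize = av_image_alloc(dst_data, dst_linesize, dst_w, dst_h, dst_pix_fmt, 1);

    if(NULL == pSwsCtx || src_bufsize < 0 || dst_bufsize < 0 || packet.size > src_bufsize)
    {
        printf("conversion setup failed, or the packet is larger than one raw frame!\n");
        ret = -1;
    }
    else
    {
        memcpy(src_data[0], packet.data, packet.size);
        printf("src_bufsize is %d.\n", src_bufsize);
        printf("dst_bufsize is %d.\n", dst_bufsize);
        sws_scale(pSwsCtx, (const uint8_t *const *)src_data, src_linesize, 0, pVCodecCtx->height, dst_data, dst_linesize);
        fwrite(dst_data[0], 1, dst_bufsize, fp);
    }

    //(9) release resources
    fclose(fp);
    if(src_bufsize >= 0) av_freep(&src_data[0]);
    if(dst_bufsize >= 0) av_freep(&dst_data[0]);
    sws_freeContext(pSwsCtx);
    av_frame_free(&frame_yuv);
    av_packet_unref(&packet);   // av_free_packet is deprecated
    avformat_close_input(&pFormatCtx);

    return ret;
}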
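
Many USB cameras deliver packed YUYV (YUYV422), and the program above asks sws_scale for planar YUV422P at the same size. In that case the conversion is in principle just a rearrangement of bytes, which the following hand-written sketch illustrates. It assumes a tightly packed input with an even width; `yuyv_to_yuv422p` is a hypothetical name for this example and is not how libswscale implements the conversion.

```c
#include <assert.h>
#include <stdint.h>

/* Unpack a tightly packed YUYV image (byte order Y0 U Y1 V) into three planes:
 * Y is width x height, U and V are (width / 2) x height. */
static void yuyv_to_yuv422p(const uint8_t *yuyv, int width, int height,
                            uint8_t *y_plane, uint8_t *u_plane, uint8_t *v_plane)
{
    assert(width > 0 && height > 0 && width % 2 == 0);
    int pairs = (width / 2) * height;   /* each 4-byte group holds two pixels */
    for (int i = 0; i < pairs; i++) {
        y_plane[2 * i]     = yuyv[4 * i];       /* Y of the left pixel  */
        u_plane[i]         = yuyv[4 * i + 1];   /* U shared by the pair */
        y_plane[2 * i + 1] = yuyv[4 * i + 2];   /* Y of the right pixel */
        v_plane[i]         = yuyv[4 * i + 3];   /* V shared by the pair */
    }
}
```

Writing the Y, U and V planes to a file one after another gives the same plane order as the out.yuv produced above. Such a file can be inspected with a raw-video player once the pixel format and frame size are supplied by hand, since a .yuv file carries no header.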