FFmpeg Framework Structure and the Encoding/Decoding Workflow

Most recent recommended article published on 2023-06-26 17:05:16

钱小驴

Category: Video Coding. Tags: ffmpeg, encoding, framework structure

Reposted from: http://blog.csdn.NET/allen_young_yang/article/details/6576303

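1、 FFmpeg Structure Overview
1.1》Introduction
FFmpeg (the name combines "Fast Forward" with MPEG, the Moving Picture Experts Group) is a complete open-source solution for recording, converting, demultiplexing, encoding, decoding, and streaming audio and video. Its most important component is the libavcodec library. FFmpeg is developed on Linux but can be built and used on most operating systems. It supports more than 40 codecs, including MPEG, DivX, MPEG4, AC3, DV, and FLV, and more than 90 container formats, including AVI, MPEG, OGG, Matroska, and ASF. Open-source players such as TCPMP, VLC, and MPlayer all use FFmpeg.
The ffmpeg source tree contains libavcodec, libavformat, libavutil, and other subdirectories:
 libavcodec holds the individual encoder/decoder modules for the various audio and video codecs; CODEC is short for Coder/Decoder.
 libavformat holds the muxer/demuxer modules. It generates and parses the various audio/video container formats, which includes gathering the information needed to build the decoding context and reading audio/video frames.
Together, libavcodec and libavformat do the actual processing of media files, such as format conversion.
 libavutil is a small general-purpose utility library holding helpers such as memory operations. It implements CRC generation, 128-bit integer math, greatest common divisor, integer square root, integer logarithm, memory allocation, and big-endian/little-endian conversion.
 libavdevice: support for input and output devices.
 libpostproc: post-processing effects.
 libswscale: video image scaling and color space/pixel format conversion.
 ffmpeg: a command-line tool provided by the project for format conversion, decoding, or real-time encoding from a TV capture card.
 ffserver: an HTTP server for live multimedia broadcast streaming.
 ffplay: a simple player that uses the ffmpeg libraries for parsing and decoding and SDL for display.
Building the ffmpeg package produces three executables: ffmpeg, ffserver, and ffplay. ffmpeg processes media files, ffserver is an HTTP streaming media server, and ffplay is a simple SDL-based player.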

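Notes:
Differences between muxer/demuxer and encoder/decoder:
 The biggest difference is that a muxer and a demuxer use two different structures, AVOutputFormat and AVInputFormat,
 while an encoder and a decoder both use the AVCodec structure.
 Muxers and demuxers are stored in the global variables AVOutputFormat *first_oformat and AVInputFormat *first_iformat respectively. Encoders and decoders are both stored in the global variable AVCodec *first_avcodec.
What muxer/demuxer and encoder/decoder have in common:
 Both are initialized in av_register_all(), called at the start of main()
 Both are kept as linked lists in global variables
 Both expose their public interface through function pointers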
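
The registration pattern described in these notes (a global linked list whose entries expose their behavior through function pointers) can be sketched in plain C. This is only an illustration under stated assumptions: DemoCodec, first_democodec, demo_register(), demo_find(), and the two toy decoders are hypothetical names invented for this sketch and are not part of the FFmpeg API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for AVCodec: a name, a function-pointer
 * interface, and a link to the next registered entry. */
typedef struct DemoCodec {
    const char *name;
    int (*decode)(const unsigned char *in, int in_size); /* public interface */
    struct DemoCodec *next;
} DemoCodec;

/* Global list head, playing the role the notes give to first_avcodec. */
static DemoCodec *first_democodec = NULL;

/* Append one codec to the end of the global list. */
static void demo_register(DemoCodec *codec)
{
    DemoCodec **p = &first_democodec;
    assert(codec != NULL);
    while (*p != NULL)
        p = &(*p)->next;
    *p = codec;
    codec->next = NULL;
}

/* Walk the list and return the entry with the given name, or NULL. */
static DemoCodec *demo_find(const char *name)
{
    DemoCodec *c;
    for (c = first_democodec; c != NULL; c = c->next)
        if (strcmp(c->name, name) == 0)
            return c;
    return NULL;
}

/* Two toy "decoders" that only report how many bytes they would consume. */
static int demo_decode_all(const unsigned char *in, int in_size)
{
    (void)in;
    return in_size;
}

static int demo_decode_half(const unsigned char *in, int in_size)
{
    (void)in;
    return in_size / 2;
}

static DemoCodec demo_codec_a = { "demo_a", demo_decode_all,  NULL };
static DemoCodec demo_codec_b = { "demo_b", demo_decode_half, NULL };

/* Counterpart of the one-time registration call made at the start of
 * main(); a second call is a no-op. */
static void demo_register_all(void)
{
    static int initialized = 0;
    if (initialized)
        return;
    initialized = 1;
    demo_register(&demo_codec_a);
    demo_register(&demo_codec_b);
}
```

A caller would invoke demo_register_all() once, look an entry up with demo_find("demo_b"), and then call through its decode pointer without knowing which concrete function is behind it.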

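1.2》Download and Build
Official download page: http://ffmpeg.org/download.html
Build:
#./configure
#make
#make install

The install step places files in /usr/local/bin, /usr/local/include (the header files), and /usr/local/lib (the generated .a files). After the build completes:
A》Running ./ffmpeg prints: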
FFmpeg version SVN-r17579, Copyright (c) 2000-2009 Fabrice Bellard, et al.
configuration:
libavutil 49.15. 0 / 49.15. 0
libavcodec 52.19. 0 / 52.19. 0
libavformat 52.30. 0 / 52.30. 0
libavdevice 52. 1. 0 / 52. 1. 0
built on Mar 25 2011 17:30:17, gcc: 4.3.4
At least one output file must be specified
B》Running ./ffplay prints:
FFplay version SVN-r17579, Copyright (c) 2003-2009 Fabrice Bellard, et al.
configuration:
libavutil 49.15. 0 / 49.15. 0
libavcodec 52.19. 0 / 52.19. 0
libavformat 52.30. 0 / 52.30. 0
libavdevice 52. 1. 0 / 52. 1. 0
built on Mar 25 2011 17:30:17, gcc: 4.3.4
An input file must be specified
C》Running ./ffserver prints:
FFserver version SVN-r17579, Copyright (c) 2000-2009 Fabrice Bellard, et al.
configuration:
libavutil 49.15. 0 / 49.15. 0
libavcodec 52.19. 0 / 52.19. 0
libavformat 52.30. 0 / 52.30. 0
libavdevice 52. 1. 0 / 52. 1. 0
built on Mar 25 2011 17:30:17, gcc: 4.3.4
/etc/ffserver.conf: No such file or directory
Incorrect config file - exiting.
Note: if ffserver.conf is missing, create an ffserver.conf file in /etc/.

2、 FFmpeg Encoding and Decoding
2.1》The main flow is as follows:
 Initialize the input streams
 Initialize the output streams
 Initialize the encoders and decoders
 If required, set metadata from the input file
 Write the output file headers
 Loop over every frame (a frame here is one data unit in a stream):
   Read a frame from the input file
   Decode the frame data
   Encode the new frame data
   Write the new frame to the output file
 Write the output file trailers
 Close each encoder and decoder
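
The order of these steps can be sketched with a toy transcoder in plain C. This is a minimal sketch that assumes each data unit is a single byte; DemoOutput, demo_write(), demo_decode(), demo_encode(), and demo_transcode() are hypothetical names for illustration, and the real work in ffmpeg.c is far more involved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy "output file": a byte buffer plus its current size. */
typedef struct DemoOutput {
    unsigned char data[64];
    size_t size;
} DemoOutput;

/* Append one byte to the output. */
static void demo_write(DemoOutput *out, unsigned char b)
{
    assert(out->size < sizeof(out->data));
    out->data[out->size++] = b;
}

/* Toy decoder: the digit character '3' becomes the value 3. */
static int demo_decode(unsigned char unit)
{
    return unit - '0';
}

/* Toy encoder: the value 3 becomes the letter 'd'. */
static unsigned char demo_encode(int value)
{
    return (unsigned char)('a' + value);
}

/* Follows the order of the steps in the list above and returns the
 * number of data units processed. */
static int demo_transcode(const unsigned char *in, size_t in_size,
                          DemoOutput *out)
{
    size_t i;
    int count = 0;

    out->size = 0;                      /* initialize the output stream */
    demo_write(out, 'H');               /* write the output header      */
    for (i = 0; i < in_size; i++) {     /* loop over every data unit    */
        unsigned char unit = in[i];             /* read from the input  */
        int raw = demo_decode(unit);            /* decode               */
        unsigned char coded = demo_encode(raw); /* encode               */
        demo_write(out, coded);                 /* write to the output  */
        count++;
    }
    demo_write(out, 'T');               /* write the output trailer     */
    return count;
}
```

Feeding the three input units "012" through demo_transcode() yields the five output bytes H, a, b, c, T: one header, three re-encoded units, one trailer.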
Note:
av_encode() is the central function of the ffmpeg tool; most of the encoding/decoding and output work is done inside it.
It is declared in ffmpeg.c as:

static int av_encode(AVFormatContext **output_files,
                     int nb_output_files,
                     AVFormatContext **input_files,
                     int nb_input_files,
                     AVStreamMap *stream_maps, int nb_stream_maps)

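AVFormatContext is the main structure through which FFmpeg implements input and output during format conversion and stores the related data. Every input and output file has a corresponding entry in the global pointer arrays defined below.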
static AVFormatContext *output_files[MAX_FILES];
static AVFormatContext *input_files[MAX_FILES];
Because input and output share the same structure, either the iformat or the oformat member below must be set, depending on the direction.
struct AVInputFormat *iformat;
struct AVOutputFormat *oformat;
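For any one AVFormatContext, these two members cannot both be set; that is, an AVFormatContext cannot contain a demuxer and a muxer at the same time. The parse_options() call at the start of main() finds the matching muxer and demuxer, then initializes an AVFormatContext for each input and output according to the argv arguments and stores it in the output_files or input_files pointer array. Once output_files and input_files have been passed into av_encode() as arguments, they are not used anywhere else.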

AVCodecContext holds the AVCodec pointer and codec-related data, such as the width and height of video or the sample rate of audio. Its codec_type and codec_id members matter most when matching an encoder or decoder.
enum CodecType codec_type; /* see CODEC_TYPE_xxx */
enum CodecID codec_id; /* see CODEC_ID_xxx */
codec_type holds the media type, such as CODEC_TYPE_VIDEO or CODEC_TYPE_AUDIO; codec_id holds the coding format, such as CODEC_ID_FLV1 or CODEC_ID_VP6F.
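
How a codec ID can drive the encoder/decoder lookup is sketched below in plain C. The sketch assumes, as an illustration only, that encoders and decoders share one list and are told apart by which function pointer is set; SketchCodec, the SKETCH_* enums, and the sketch_find_* functions are hypothetical and do not reproduce the libavcodec implementation.

```c
#include <stddef.h>

/* Hypothetical counterparts of the media type and codec ID enums. */
enum SketchCodecType { SKETCH_TYPE_VIDEO, SKETCH_TYPE_AUDIO };
enum SketchCodecID   { SKETCH_ID_NONE, SKETCH_ID_FLV1, SKETCH_ID_MP3 };

/* One list entry; an encoder sets encode, a decoder sets decode. */
typedef struct SketchCodec {
    const char *name;
    enum SketchCodecType type;
    enum SketchCodecID id;
    int (*encode)(void);
    int (*decode)(void);
    struct SketchCodec *next;
} SketchCodec;

/* Placeholder for real encode/decode work. */
static int sketch_noop(void)
{
    return 0;
}

/* A small statically linked list: FLV1 encoder -> FLV1 decoder -> MP3 decoder. */
static SketchCodec sketch_mp3_dec = { "mp3", SKETCH_TYPE_AUDIO, SKETCH_ID_MP3,
                                      NULL, sketch_noop, NULL };
static SketchCodec sketch_flv_dec = { "flv", SKETCH_TYPE_VIDEO, SKETCH_ID_FLV1,
                                      NULL, sketch_noop, &sketch_mp3_dec };
static SketchCodec sketch_flv_enc = { "flv", SKETCH_TYPE_VIDEO, SKETCH_ID_FLV1,
                                      sketch_noop, NULL, &sketch_flv_dec };
static SketchCodec *first_sketch_codec = &sketch_flv_enc;

/* Return the first entry with a decode function and a matching ID. */
static SketchCodec *sketch_find_decoder(enum SketchCodecID id)
{
    SketchCodec *p;
    for (p = first_sketch_codec; p != NULL; p = p->next)
        if (p->decode != NULL && p->id == id)
            return p;
    return NULL;
}

/* Return the first entry with an encode function and a matching ID. */
static SketchCodec *sketch_find_encoder(enum SketchCodecID id)
{
    SketchCodec *p;
    for (p = first_sketch_codec; p != NULL; p = p->next)
        if (p->encode != NULL && p->id == id)
            return p;
    return NULL;
}
```

In this toy list, looking up SKETCH_ID_FLV1 returns different entries for encoding and decoding, while SKETCH_ID_MP3 has a decoder but no encoder, so the encoder lookup returns NULL.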

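The AVStream structure holds the codec, data segments, and other information related to one stream. Two members matter most:
AVCodecContext *codec; /**< codec context */
void *priv_data;
The codec pointer refers to the context of the stream's encoder or decoder; priv_data points to data specific to the particular stream being processed.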

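AVInputStream/AVOutputStream: depending on whether a stream is an input or an output, the AVStream structure described above is wrapped in an AVInputStream or an AVOutputStream, both of which are used inside av_encode(). AVInputStream additionally stores timing information, and AVOutputStream additionally stores information such as audio/video synchronization.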
2.2》Video File Decoding Flow
A》Initialize libavcodec and register all container formats, codecs, parsers, and bitstream filters, so that the matching file format and codec are selected automatically when a file is opened:
avcodec_register_all();
avdevice_register_all();
av_register_all();
avformat_alloc_context(); /* allocates the AVFormatContext used for playback */

B》Open the file: av_open_input_file()

 int av_open_input_file(AVFormatContext **ic_ptr, const char *filename,
                       AVInputFormat *fmt,
                       int buf_size,
                       AVFormatParameters *ap)
    {
           ......
        if (!fmt) {
            /* guess format if no file can be opened */
            fmt = av_probe_input_format(pd, 0);
        }
           ......
        err = av_open_input_stream(ic_ptr, pb, filename, fmt, ap);
           ......
    }

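It does two main things:

 Detects the container file format (the result is recorded in the AVFormatContext);
 Gets the stream information from the container file, that is, calls the file format's demuxer to separate the streams, as follows:
av_open_input_file
 av_probe_input_format2() walks all registered demuxers starting from first_iformat and calls each one's probe function
 av_open_input_stream() calls the selected demuxer's read_header function, ic->iformat->read_header, to get the stream information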
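
The probing step can be pictured as walking the demuxer list and keeping the entry whose probe function returns the highest score. The sketch below is plain C under stated assumptions: DemoProbeData, DemoInputFormat, first_demo_iformat, and demo_probe_input_format() are hypothetical names, and the two probe functions only check the well-known "RIFF" and "OggS" signatures rather than doing what the real demuxers do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical probe input: the first bytes of the file. */
typedef struct DemoProbeData {
    const unsigned char *buf;
    int buf_size;
} DemoProbeData;

/* Hypothetical demuxer entry with a probe callback returning 0..100. */
typedef struct DemoInputFormat {
    const char *name;
    int (*read_probe)(const DemoProbeData *pd);
    struct DemoInputFormat *next;
} DemoInputFormat;

/* Score 100 if the data starts with the RIFF signature, else 0. */
static int demo_probe_riff(const DemoProbeData *pd)
{
    return (pd->buf_size >= 4 && memcmp(pd->buf, "RIFF", 4) == 0) ? 100 : 0;
}

/* Score 100 if the data starts with the Ogg page signature, else 0. */
static int demo_probe_ogg(const DemoProbeData *pd)
{
    return (pd->buf_size >= 4 && memcmp(pd->buf, "OggS", 4) == 0) ? 100 : 0;
}

/* Statically linked list of registered demuxers: riff -> ogg. */
static DemoInputFormat demo_ogg  = { "demo_ogg",  demo_probe_ogg,  NULL };
static DemoInputFormat demo_riff = { "demo_riff", demo_probe_riff, &demo_ogg };
static DemoInputFormat *first_demo_iformat = &demo_riff;

/* Call every demuxer's probe function and return the best match, or
 * NULL if no demuxer scored above zero. */
static DemoInputFormat *demo_probe_input_format(const DemoProbeData *pd,
                                                int *score_max)
{
    DemoInputFormat *fmt, *best = NULL;

    assert(pd != NULL && score_max != NULL);
    *score_max = 0;
    for (fmt = first_demo_iformat; fmt != NULL; fmt = fmt->next) {
        int score = fmt->read_probe ? fmt->read_probe(pd) : 0;
        if (score > *score_max) {
            *score_max = score;
            best = fmt;
        }
    }
    return best;
}
```

Once a format has been chosen this way, the caller would go on to invoke that format's header-reading callback, which corresponds to the read_header step in the text.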

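C》Extract the stream information from the file: av_find_stream_info() fills the streams field of the AVFormatContext with valid information. For audio and video, each packet contains one complete frame or several whole frames. Packets are read from the file, and the corresponding frames are decoded from the packets.

av_find_stream_info(AVFormatContext *ic) involves two parts:
 Demuxing, which relies on the demuxer set up by av_open_input_file()
 Decoding, using av_read_frame(AVFormatContext *s, AVPacket *pkt) and avcodec_decode_video()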

D》Iterate over all streams and find the first one whose type is CODEC_TYPE_VIDEO, as follows:

int i, videoStream;
AVCodecContext *pCodecCtx;

// Find the first video stream
videoStream=-1;
for(i=0; i<pFormatCtx->nb_streams; i++)
  if(pFormatCtx->streams[i]->codec->codec_type==CODEC_TYPE_VIDEO) {
    videoStream=i;
    break;
  }
if(videoStream==-1)
  return -1; // Didn't find a video stream

// Get a pointer to the codec context for the video stream
pCodecCtx=pFormatCtx->streams[videoStream]->codec;

E》Find the matching decoder with avcodec_find_decoder(); if that succeeds, open it with avcodec_open(), which initializes the AVCodecContext with the given AVCodec, as follows:

AVCodec *pCodec;

// Find the decoder for the video stream
pCodec=avcodec_find_decoder(pCodecCtx->codec_id);
if(pCodec==NULL) {
  return -1; // Codec not found
}
// Open codec
if(avcodec_open(pCodecCtx, pCodec)<0)
  return -1; // Could not open codec

F》Allocate memory for the decoded frame: avcodec_alloc_frame(); the frame is used to store the frame data

G》Keep reading frame data from the stream: av_read_frame()

int frameFinished;
AVPacket packet;

i=0;
while(av_read_frame(pFormatCtx, &packet)>=0) {
  // Is this a packet from the video stream?
  if(packet.stream_index==videoStream) {
    // Decode video frame
    avcodec_decode_video(pCodecCtx, pFrame, &frameFinished,
                         packet.data, packet.size);

    // Did we get a video frame?
    if(frameFinished) {
    // Convert the image from its native format to RGB32
        img_convert((AVPicture *)pFrameRGB, PIX_FMT_RGB32, 
            (AVPicture*)pFrame, pCodecCtx->pix_fmt, 
            pCodecCtx->width, pCodecCtx->height);

        // Save the frame to disk
           ......
    }
  }

  // Free the packet that was allocated by av_read_frame
  av_free_packet(&packet);
}

H》Check the packet type; for video packets, call the codec's decode function: avcodec_decode_video()
I》After decoding, release the decoder: avcodec_close()
J》Close the input file: av_close_input_file()
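
The loop in step G checks frameFinished after each decode call because one call does not necessarily produce one picture. The plain C sketch below models that behavior with a toy decoder that, by assumption, needs two packets per frame; ToyDecoder, toy_decode_video(), and toy_count_frames() are hypothetical names and the "frame" is just the sum of the bytes received.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PACKETS_PER_FRAME 2 /* assumption made only for this sketch */

/* Hypothetical decoder state carried between calls. */
typedef struct ToyDecoder {
    int packets_buffered;
    int partial_sum;
} ToyDecoder;

/* Consume one packet. Sets *frame_finished to 1 and stores the toy
 * "picture" in *frame only when a whole frame has been assembled.
 * Returns the number of bytes consumed. */
static int toy_decode_video(ToyDecoder *dec, int *frame, int *frame_finished,
                            const unsigned char *data, int size)
{
    int i;

    assert(dec != NULL && frame != NULL && frame_finished != NULL);
    *frame_finished = 0;
    for (i = 0; i < size; i++)
        dec->partial_sum += data[i];
    if (++dec->packets_buffered == TOY_PACKETS_PER_FRAME) {
        *frame = dec->partial_sum;
        *frame_finished = 1;
        dec->packets_buffered = 0;
        dec->partial_sum = 0;
    }
    return size;
}

/* Same shape as the read/decode loop in step G: feed every packet to
 * the decoder and count only the calls that completed a frame. */
static int toy_count_frames(const unsigned char packets[][2], int nb_packets)
{
    ToyDecoder dec = { 0, 0 };
    int i, frame = 0, finished = 0, frames = 0;

    for (i = 0; i < nb_packets; i++) {
        toy_decode_video(&dec, &frame, &finished, packets[i], 2);
        if (finished)
            frames++;
    }
    return frames;
}
```

With five two-byte packets, the toy decoder completes a frame on the second and fourth calls, and the fifth packet stays buffered, so only two frames are counted.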

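3、 Tracing the Code with Log Markers
Following the video decoding flow described in 2.2》, log markers (printed with printf()) are added to trace the decoding process. The starting point is ffplay, the player that ships with ffmpeg: the trace follows the functions called from main() in ffplay.c.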

/* Called from the main */
int main(int argc, char **argv)
{
    /* register all codecs, demux and protocols */
    avcodec_register_all();
    avdevice_register_all();
    av_register_all();
    ......
    avformat_opts = avformat_alloc_context();
    sws_opts = sws_getContext(16,16,0, 16,16,0, sws_flags, NULL,NULL,NULL);
    show_banner();
    parse_options(argc, argv, options, opt_input_file);
    ......
    cur_stream = stream_open(input_filename, file_iformat);
    event_loop();

    /* never returns */
    return 0;
}

The trace output is as follows:
root@localhost /work/ffmpeg>ffplay /work/test/avi/output.avi
beginning avcodec_register_all… _by jay remarked
beginning avdevice_register_all… _by jay remarked
beginning av_register_all… _by jay remarked
registering MuxDemux MP3… _by jay remarked
returning av_register_all’s initialized

avctx_opts[0] avctx_opts[1] avctx_opts[2] avctx_opts[3] avctx_opts[4]
returning avformat_alloc_context value…_by jay remarked
returning sws_getContex value…_by jay remarked

showing version banner…_by jay remarked
FFplay version SVN-r17579 _by Jay remarked, Copyright (c) 2003-2009 Fabrice Bellard, et al.
configuration:
libavutil 49.15. 0 / 49.15. 0
libavcodec 52.19. 0 / 52.19. 0
libavformat 52.30. 0 / 52.30. 0
libavdevice 52. 1. 0 / 52. 1. 0
built on Apr 1 2011 09:29:06, gcc: 4.3.4

beginning parse_options… _by jay remarked
returning optindex=[2]
beginning av_init_packet… _by jay remarked
beginning cur_stream… _by jay remarked
returning av_open_input_file’s pd->filename=[T]
[mp3 @ 0x9b26d20]mdb:432, lastbuf:0 skipping granule 0
Last message repeated 1 times
[mp3 @ 0x9b26d20]mdb:432, lastbuf:0 skipping granule 1
Last message repeated 1 times
[mp3 @ 0x9b26d20]mdb:460, lastbuf:216 skipping granule 0
Last message repeated 1 times
[mp3 @ 0x9b26d20]mdb:460, lastbuf:216 skipping granule 1
returning av_close_input_file successful