FFmpeg and SDL Tutorial 01: Making Screencaps

Original article: http://dranger.com/ffmpeg/tutorial01.html

Overview

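Movie files have a few basic components. First, the file itself is called a container, and the type of container determines how the information inside the file is laid out. AVI and Quicktime are examples of containers.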

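Second, the file holds a number of streams; typically there is an audio stream and a video stream. (A "stream" is just a figurative way of describing a succession of data elements made available over time.) The data elements in a stream are called frames, and each stream is encoded by a different kind of codec.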

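The codec defines how the actual data is COded and DECoded, hence the name CODEC. DivX and MP3 are examples of codecs. Packets are then read from the stream. A packet is a piece of data holding part of the bitstream, which is decoded into raw frames that our application can finally manipulate. For our purposes, each packet contains complete frames, or multiple frames in the case of audio.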

At a very basic level, dealing with video and audio streams is quite easy:

10 OPEN video_stream FROM video.avi
20 READ packet FROM video_stream INTO frame
30 IF frame NOT COMPLETE GOTO 20
40 DO SOMETHING WITH frame
50 GOTO 20

Handling multimedia with ffmpeg is pretty much as simple as the program above, although some programs might have a very complex "DO SOMETHING" step. In this tutorial, we will:

(1) Open the file

(2) Read from the video stream inside it

(3) Write the frame data to a PPM file
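
The read loop from the pseudocode above can be sketched in plain C without any ffmpeg calls. This is only a toy model: ToyPacket and count_frames are hypothetical names invented for this illustration, and the "completes_frame" flag stands in for whatever a real decoder reports when a frame is ready.

```c
#include <stddef.h>

/* Hypothetical toy type, not part of ffmpeg: a packet carries some bytes
 * of a frame and a flag saying whether the frame is complete afterwards. */
typedef struct {
    int bytes;           /* payload size of this packet */
    int completes_frame; /* nonzero if the frame is complete after this packet */
} ToyPacket;

/* Walk the packets, mirroring steps 20-50 of the pseudocode.
 * Returns the number of complete frames and stores the size of the
 * largest one in *largest_frame_bytes. */
int count_frames(const ToyPacket *packets, size_t n, int *largest_frame_bytes)
{
    int frames = 0;
    int pending = 0;   /* bytes collected for the frame in progress */

    *largest_frame_bytes = 0;
    for (size_t i = 0; i < n; i++) {
        pending += packets[i].bytes;        /* 20 READ packet INTO frame */
        if (!packets[i].completes_frame)
            continue;                       /* 30 IF frame NOT COMPLETE GOTO 20 */
        if (pending > *largest_frame_bytes) /* 40 DO SOMETHING WITH frame */
            *largest_frame_bytes = pending;
        frames++;
        pending = 0;                        /* 50 GOTO 20 */
    }
    return frames;
}
```

Packets left over at the end that never complete a frame are simply dropped here, which is one possible policy rather than a rule.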

Opening the File

First, to use ffmpeg you must initialize the library. (Note that some systems might need to use <ffmpeg/avcodec.h> and <ffmpeg/avformat.h> instead.)

#include <avcodec.h>
#include <avformat.h>
...
int main(int argc, char *argv[]) {
av_register_all();

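This registers all available file formats and codecs with the library, so that they are used automatically when a file with the corresponding format or codec is opened. Note that you only need to call av_register_all() once, so we do it here in main(). If you like, you can register only individual file formats and codecs, but there is usually no reason to do so.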

Now we can actually open the file:

AVFormatContext *pFormatCtx;

// Open video file
if(av_open_input_file(&pFormatCtx, argv[1], NULL, 0, NULL)!=0)
  return -1; // Couldn't open file

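We get the filename from the first command-line argument. av_open_input_file() reads the file header and stores information about the file format in the AVFormatContext structure. The last three arguments specify the file format, the buffer size, and the format options; when they are set to NULL or 0, libavformat detects them automatically.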

That function only looks at the header, so next we check the stream information in the file:

// Retrieve stream information
if(av_find_stream_info(pFormatCtx)<0)
  return -1; // Couldn't find stream information

This function populates pFormatCtx->streams with the proper information. We can use the following debugging function to show what is inside:

// Dump information about file onto standard error
dump_format(pFormatCtx, 0, argv[1], 0);

Now pFormatCtx->streams is just an array of pointers of size pFormatCtx->nb_streams, so we walk through it until we find a video stream:

int i, videoStream;
AVCodecContext *pCodecCtx;

// Find the first video stream
videoStream=-1;
for(i=0; i<pFormatCtx->nb_streams; i++)
  if(pFormatCtx->streams[i]->codec->codec_type==CODEC_TYPE_VIDEO) {
    videoStream=i;
    break;
  }
if(videoStream==-1)
  return -1; // Didn't find a video stream

// Get a pointer to the codec context for the video stream
pCodecCtx=pFormatCtx->streams[videoStream]->codec;

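The stream's information about its codec is kept in what we call the "codec context". It contains all the information about the codec the stream is using, and we now have a pointer to it. But we still have to find the actual codec and open it: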

AVCodec *pCodec;

// Find the decoder for the video stream
pCodec=avcodec_find_decoder(pCodecCtx->codec_id);
if(pCodec==NULL) {
  fprintf(stderr, "Unsupported codec!\n");
  return -1; // Codec not found
}
// Open codec
if(avcodec_open(pCodecCtx, pCodec)<0)
  return -1; // Could not open codec

Storing the Data

Now we need a place to actually store the frame:

AVFrame *pFrame, *pFrameRGB;

// Allocate video frame
pFrame=avcodec_alloc_frame();

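Since we plan to output PPM files, which store 24-bit RGB, we have to convert each frame from its native format to RGB. ffmpeg can do this conversion for us. We allocate a second frame to hold the converted image: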

// Allocate an AVFrame structure
pFrameRGB=avcodec_alloc_frame();
if(pFrameRGB==NULL)
  return -1;

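Even though we have allocated the frame, we still need a place to put the raw data when we convert it. We use avpicture_get_size() to get the size we need, and allocate the space manually: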

uint8_t *buffer;
int numBytes;
// Determine required buffer size and allocate buffer
numBytes=avpicture_get_size(PIX_FMT_RGB24, pCodecCtx->width,
                            pCodecCtx->height);
buffer=(uint8_t *)av_malloc(numBytes*sizeof(uint8_t));

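av_malloc() is ffmpeg's own allocation function. It is a simple wrapper around malloc() that makes sure the memory addresses are aligned. It will not protect you from memory leaks, double freeing, or other malloc problems.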
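
To make "aligned" concrete, here is a minimal sketch of an aligned allocator written with the C11 function aligned_alloc(). It is not how av_malloc() is implemented; alloc_aligned and is_aligned are hypothetical helper names, and the 32-byte alignment in the usage note is an arbitrary choice for illustration.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, not ffmpeg code: allocate at least `size` bytes at
 * an address that is a multiple of `alignment` (assumed to be a power of
 * two). Free the result with free(). */
void *alloc_aligned(size_t size, size_t alignment)
{
    /* C11 aligned_alloc() wants the size to be a multiple of the
     * alignment, so round the request up. */
    size_t rounded = (size + alignment - 1) / alignment * alignment;
    if (rounded == 0)
        rounded = alignment;
    return aligned_alloc(alignment, rounded);
}

/* Return nonzero if the address of p is a multiple of alignment. */
int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}
```

For example, alloc_aligned(100, 32) returns a block of at least 100 bytes for which is_aligned(p, 32) holds. As with av_malloc(), nothing here guards against leaks or double frees.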

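Now we use avpicture_fill() to associate the frame with our newly allocated buffer. About the AVPicture cast: the AVPicture struct is a subset of the AVFrame struct, and the beginning of the AVFrame struct is identical to the AVPicture struct: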

// Assign appropriate parts of buffer to image planes in pFrameRGB
// Note that pFrameRGB is an AVFrame, but AVFrame is a superset
// of AVPicture
avpicture_fill((AVPicture *)pFrameRGB, buffer, PIX_FMT_RGB24,
                pCodecCtx->width, pCodecCtx->height);
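
What "associating a frame with a buffer" means can be sketched with a toy structure. ToyPicture, rgb24_size, and fill_rgb24 are hypothetical names for this illustration only, and the sketch assumes packed RGB24 with no padding at the end of each row; the real ffmpeg functions handle many pixel formats.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a picture: a set of plane pointers and the
 * number of bytes per row for each plane. */
typedef struct {
    uint8_t *data[4];
    int linesize[4];
} ToyPicture;

/* Packed RGB24 uses 3 bytes per pixel (assumes no row padding). */
int rgb24_size(int width, int height)
{
    return width * height * 3;
}

/* Point plane 0 at the caller's buffer and record the row stride.
 * RGB24 has a single plane, so the remaining planes stay empty. */
void fill_rgb24(ToyPicture *pic, uint8_t *buffer, int width)
{
    pic->data[0] = buffer;
    pic->linesize[0] = width * 3;
    for (int i = 1; i < 4; i++) {
        pic->data[i] = NULL;
        pic->linesize[i] = 0;
    }
}
```

No pixel data is copied: the picture only borrows the buffer, which is why the buffer has to be freed separately from the frame.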

Finally, we are ready to read from the stream.

Reading the Data

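Next we read through the entire video stream by reading in packets and decoding them into frames. Once a frame is complete, we convert and save it: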

int frameFinished;
AVPacket packet;

i=0;
while(av_read_frame(pFormatCtx, &packet)>=0) {
  // Is this a packet from the video stream?
  if(packet.stream_index==videoStream) {
    // Decode video frame
    avcodec_decode_video(pCodecCtx, pFrame, &frameFinished,
                         packet.data, packet.size);

    // Did we get a video frame?
    if(frameFinished) {
        // Convert the image from its native format to RGB
        img_convert((AVPicture *)pFrameRGB, PIX_FMT_RGB24,
                    (AVPicture *)pFrame, pCodecCtx->pix_fmt,
                    pCodecCtx->width, pCodecCtx->height);

        // Save the frame to disk
        if(++i<=5)
          SaveFrame(pFrameRGB, pCodecCtx->width, 
                    pCodecCtx->height, i);
    }
  }
    
  // Free the packet that was allocated by av_read_frame
  av_free_packet(&packet);
}

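av_read_frame() reads in a packet and stores it in the AVPacket struct. Note that we have only allocated the packet structure itself; ffmpeg allocates the internal data for us, which packet.data points to, and av_free_packet() frees it later. avcodec_decode_video() converts the packet to a frame. However, we might not have all the information we need for a frame after decoding a single packet, so avcodec_decode_video() sets frameFinished for us once a complete frame is available.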

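Finally, we use img_convert() to convert from the native format (pCodecCtx->pix_fmt) to RGB. Remember that you can cast an AVFrame pointer to an AVPicture pointer. We then pass the frame, together with its width and height, to our SaveFrame function.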
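
As an illustration of what such a conversion involves, here is a per-pixel sketch that turns one YUV sample into RGB. It rests on assumptions the tutorial does not state: that the native format is a YUV format, and that full-range BT.601 (JFIF-style) coefficients apply. yuv_to_rgb and clamp_u8 are hypothetical names, and this is not ffmpeg's conversion code, which supports many formats and value ranges.

```c
#include <stdint.h>

/* Round to nearest and clamp into the 0..255 range of one colour byte. */
static uint8_t clamp_u8(double v)
{
    if (v < 0.0)
        return 0;
    if (v > 255.0)
        return 255;
    return (uint8_t)(v + 0.5);
}

/* Convert one full-range BT.601 YUV sample to RGB (assumed coefficients).
 * rgb[0], rgb[1], rgb[2] receive red, green, and blue. */
void yuv_to_rgb(uint8_t y, uint8_t u, uint8_t v, uint8_t rgb[3])
{
    double cb = (double)u - 128.0;   /* blue-difference chroma, centred on 0 */
    double cr = (double)v - 128.0;   /* red-difference chroma, centred on 0 */

    rgb[0] = clamp_u8(y + 1.402 * cr);
    rgb[1] = clamp_u8(y - 0.344136 * cb - 0.714136 * cr);
    rgb[2] = clamp_u8(y + 1.772 * cb);
}
```

With both chroma values at 128 the pixel is a neutral grey, so all three output channels equal the luma value.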

All that is left is to write the SaveFrame function, which writes the RGB information to a file in PPM format:

void SaveFrame(AVFrame *pFrame, int width, int height, int iFrame) {
  FILE *pFile;
  char szFilename[32];
  int  y;
  
  // Open file
  sprintf(szFilename, "frame%d.ppm", iFrame);
  pFile=fopen(szFilename, "wb");
  if(pFile==NULL)
    return;
  
  // Write header
  fprintf(pFile, "P6\n%d %d\n255\n", width, height);
  
  // Write pixel data
  for(y=0; y<height; y++)
    fwrite(pFrame->data[0]+y*pFrame->linesize[0], 1, width*3, pFile);
  
  // Close file
  fclose(pFile);
}

We write the file one row at a time. A PPM file is simply a short header followed by the RGB information laid out as one long string of bytes.
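
The reason for writing row by row is that a row in memory may be longer than width*3 bytes. The sketch below shows the same idea on a plain byte buffer, independent of ffmpeg. write_ppm is a hypothetical helper, and the `stride` parameter plays the role of linesize[0].

```c
#include <stdint.h>
#include <stdio.h>

/* Write a binary PPM (P6) image from packed RGB24 rows that start
 * `stride` bytes apart in memory. Returns the number of bytes written,
 * or -1 on error. */
long write_ppm(FILE *out, const uint8_t *rgb, int stride, int width, int height)
{
    /* Header: magic number, dimensions, maximum colour value. */
    int header = fprintf(out, "P6\n%d %d\n255\n", width, height);
    if (header < 0)
        return -1;

    long total = header;
    size_t row = (size_t)width * 3;   /* visible bytes per row */
    for (int y = 0; y < height; y++) {
        /* Skip any padding by jumping a full stride for each row. */
        if (fwrite(rgb + (size_t)y * (size_t)stride, 1, row, out) != row)
            return -1;
        total += (long)row;
    }
    return total;
}
```

For a 2x2 image the header "P6\n2 2\n255\n" is 11 bytes and the pixel data is 12 bytes, so the file is 23 bytes long regardless of the stride.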

Once we are done reading from the video stream, we just have to clean everything up:

// Free the RGB image
av_free(buffer);
av_free(pFrameRGB);

// Free the YUV frame
av_free(pFrame);

// Close the codec
avcodec_close(pCodecCtx);

// Close the video file
av_close_input_file(pFormatCtx);

return 0;

We use av_free() to release the memory we allocated with avcodec_alloc_frame() and av_malloc().

That's it for the code! On Linux or a similar platform, compile it with:

gcc -o tutorial01 tutorial01.c -lavutil -lavformat -lavcodec -lz -lavutil -lm

If you have an older version of ffmpeg, you may need to drop -lavutil:

gcc -o tutorial01 tutorial01.c -lavformat -lavcodec -lz -lm

Most image programs should be able to open PPM files. Test it on some movie files!