I recently started working on audio/video codecs and have been learning FFmpeg by reading the AVPacket source. I find it more effective to take notes while reading: learning is a process of input, but in the end you have to output what you've learned to confirm you actually understand it, which builds a positive feedback loop.
The doc/examples directory of the FFmpeg source tree contains sample programs; they are a good way to get familiar with the basic flow, data structures, and functions of video encoding and decoding.
This post covers the AVPacket data structure in FFmpeg 4.2.2, starting from the simple parts.
/**
* This structure stores compressed data. It is typically exported by demuxers
* and then passed as input to decoders, or received as output from encoders and
* then passed to muxers.
*
* For video, it should typically contain one compressed frame. For audio it may
* contain several compressed frames. Encoders are allowed to output empty
* packets, with no compressed data, containing only side data
* (e.g. to update some stream parameters at the end of encoding).
*
* AVPacket is one of the few structs in FFmpeg, whose size is a part of public
* ABI. Thus it may be allocated on stack and no new fields can be added to it
* without libavcodec and libavformat major bump.
*
* The semantics of data ownership depends on the buf field.
* If it is set, the packet data is dynamically allocated and is
* valid indefinitely until a call to av_packet_unref() reduces the
* reference count to 0.
*
* If the buf field is not set av_packet_ref() would make a copy instead
* of increasing the reference count.
*
* The side data is always allocated with av_malloc(), copied by
* av_packet_ref() and freed by av_packet_unref().
*
* @see av_packet_ref
* @see av_packet_unref
*/
typedef struct AVPacket {
/**
* A reference to the reference-counted buffer where the packet data is
* stored.
* May be NULL, then the packet data is not reference-counted.
*/
AVBufferRef *buf;
/**
* Presentation timestamp in AVStream->time_base units; the time at which
* the decompressed packet will be presented to the user.
* Can be AV_NOPTS_VALUE if it is not stored in the file.
* pts MUST be larger or equal to dts as presentation cannot happen before
* decompression, unless one wants to view hex dumps. Some formats misuse
* the terms dts and pts/cts to mean something different. Such timestamps
* must be converted to true pts/dts before they are stored in AVPacket.
*/
int64_t pts;
/**
* Decompression timestamp in AVStream->time_base units; the time at which
* the packet is decompressed.
* Can be AV_NOPTS_VALUE if it is not stored in the file.
*/
int64_t dts;
uint8_t *data;
int size;
int stream_index;
/**
* A combination of AV_PKT_FLAG values
*/
int flags;
/**
* Additional packet data that can be provided by the container.
* Packet can contain several types of side information.
*/
AVPacketSideData *side_data;
int side_data_elems;
/**
* Duration of this packet in AVStream->time_base units, 0 if unknown.
* Equals next_pts - this_pts in presentation order.
*/
int64_t duration;
int64_t pos; ///< byte position in stream, -1 if unknown
#if FF_API_CONVERGENCE_DURATION
/**
* @deprecated Same as the duration field, but as int64_t. This was required
* for Matroska subtitles, whose duration values could overflow when the
* duration field was still an int.
*/
attribute_deprecated
int64_t convergence_duration;
#endif
} AVPacket;
After the demuxer splits the container into audio and video streams, the resulting AVPackets are fed into a decoder; conversely, packets produced by an encoder are fed into a muxer to be combined. For video, a packet typically holds one compressed frame; for audio, a packet may hold several compressed frames. Stripping the comments, its fields are: (PS: time_base is the basic unit of time, often set to the inverse of the frame rate; at 60 fps the time_base is 1/60, so 30 time_base ticks are 0.5 s)
AVBufferRef *buf; // set when the packet data is reference-counted; NULL otherwise
int64_t pts; // presentation timestamp, in AVStream->time_base units
int64_t dts; // decompression timestamp, in AVStream->time_base units
uint8_t *data; // the packet data
int size; // size of the packet data in bytes
int stream_index; // which stream this packet belongs to
int flags; // a combination of AV_PKT_FLAG values
AVPacketSideData *side_data; // extra container-provided data; a packet with empty data may carry only side data (e.g. updated stream parameters)
int side_data_elems; // number of side_data elements
int64_t duration; // presentation duration of this packet, i.e. next_pts - this_pts in presentation order, 0 if unknown
int64_t pos; // byte position of the packet in the stream, -1 if unknown
PS: my guess is that when buf is set, the data inside buf and the packet's data point at the same block of memory; an experiment is left as a TODO.