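I have been studying ffmpeg recently and found that source-level analyses of its demuxing code are scarce and incomplete, so this post summarizes my own reading of how ffmpeg demuxes the mov/mp4 format, which mostly means reading mov.c. The goal is to show how an MP4 file's AVStream and AVPacket get filled in. These two structures are the decoder's input, so understanding the demuxing process, for example where pts and dts come from, helps with secondary development on top of ffmpeg. The MP4 format itself is described in detail elsewhere, so I will not repeat it here and will go straight to the annotated code.
All the functions and data structures are covered in detail. It took quite a while to work through and was a pain to write up, so if you find it useful, a follow would be appreciated, haha.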
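Key tips:
1. Many ffmpeg structs (AVStream, URLContext, AVFormatContext) carry a void *priv_data member.
This member holds the struct's format- or protocol-specific "sub-struct": the priv_data of an AVStream holds the mov demuxer's MOVStreamContext, and the priv_data of a URLContext holds the file protocol's FileContext. This keeps protocol- and format-specific functions and data separate from the core interfaces, which is what makes the library extensible. That is why most format-specific functions begin by assigning the generic priv_data to a pointer of their own struct type, as in mov_read_stsd:
MOVStreamContext *sc = st->priv_data; This is also a convenience: the rest of the function works with the typed pointer sc and never refers to priv_data by name.
Changes to the outer structs, such as a rename, therefore have little effect on the inner functions. Most of ffmpeg follows this pattern, especially around pluggable components
such as the rtmp protocol, the file protocol and the mov format.
2. For names ending in Context (URLContext, FileContext, AVFormatContext and so on), my understanding is that a context bundles the data and the methods (interfaces) needed to do a job. A URLContext for the file protocol, for example, points to the protocol's callbacks (open, read, close, ...) and keeps the protocol's own state in a FileContext through priv_data. The data is layered level by level so that the code stays extensible, which matters in a library written by so many people. I hope that explanation is clear, haha.
3. Names ending in Internal, such as AVStreamInternal, generally hold data that is passed between the library's internal functions.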
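4. Multimedia files are byte streams, so byte order matters. AV_RB32 and avio_rb32 read 4 bytes as a big-endian value; AV_RL32 and avio_rl32 read them as a little-endian value. MP4 stores its numeric fields big-endian, while mov.c reads four-character codes with avio_rl32 so that they compare equal to MKTAG constants.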
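To make tip 4 concrete, here is a minimal sketch of what the big-endian and little-endian 32-bit reads compute on a raw byte buffer. The helper names read_be32, read_le32 and make_tag are my own stand-ins for illustration, not ffmpeg functions; make_tag assumes the same packing as MKTAG (first character in the lowest byte).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: interprets 4 bytes as a big-endian value,
 * the byte order MP4 uses for sizes, counts and offsets. */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] <<  8) |  (uint32_t)p[3];
}

/* Hypothetical helper: interprets 4 bytes as a little-endian value. */
static uint32_t read_le32(const uint8_t *p)
{
    return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
           ((uint32_t)p[1] <<  8) |  (uint32_t)p[0];
}

/* Hypothetical helper: packs a four-character code with the first
 * character in the lowest byte, so a little-endian read of the
 * bytes "stsd" yields the same number. */
static uint32_t make_tag(char a, char b, char c, char d)
{
    return  (uint32_t)(uint8_t)a        | ((uint32_t)(uint8_t)b <<  8) |
           ((uint32_t)(uint8_t)c << 16) | ((uint32_t)(uint8_t)d << 24);
}
```

For a box header of 00 00 00 10 's' 't' 's' 'd', the big-endian read of the first four bytes gives the size 16, and the little-endian read of the last four gives a value equal to the packed tag.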
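This post covers the most important part of the MP4 format, the trak box (atom), and analyzes the functions mov_read_stsd, mov_read_stts, mov_read_stss, mov_read_ctts, mov_read_stsc, mov_read_stsz and mov_read_stco. From the stsd, stts, stss, ctts, stsc, stsz and stco boxes these functions derive the information about every sample (one audio or video frame). That information locates the encoded audio and video data within the file, so it can be read through an AVIOContext and stored in an AVPacket.
In the code, an atom is what the MP4 specification calls a box, and a sample, which both the code and the specification mention constantly, is one audio or video frame. Keep this in mind for the key functions below.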
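// The per-stream struct of the mov demuxer. It stores the sample information of an audio or video track (sizes, indices, keyframes, ...).
// Most files do not contain every kind of atom. I focus on movie files here, so the commented fields are the ones used when parsing a movie MP4; the rest are rarely needed.
// The struct often uses a pointer to an array to represent a sequence, e.g. int *keyframes for the keyframe list.
// "Entry" is a concept from the MP4 specification; think of it as a small record, just as the box (atom) is MP4's unit of storage.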
typedef struct MOVStreamContext {
AVIOContext *pb;
int pb_is_copied;
int ffindex; ///< AVStream index (typically 0 or 1 in a file with one video and one audio track)
int next_chunk;
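unsigned int chunk_count; // total number of chunks (each chunk holds one or more samples)
int64_t *chunk_offsets; // stco: absolute offset of each chunk, measured from the start of the file,
// so a chunk can be located without any other parameter
unsigned int stts_count; // number of stts entries (per-sample dts information)
MOVStts *stts_data; // stts entries (per-sample dts information)
/* For reference (this struct is declared separately, outside MOVStreamContext):
typedef struct MOVStts {
unsigned int count; // number of consecutive samples sharing this duration
int duration; // dts delta between one sample and the next
} MOVStts;
*/
unsigned int ctts_count; // number of ctts entries (per-sample offset between dts and pts)
// While the ctts atom is being read, ctts_count is the number of ctts entries.
// One ctts entry can cover several samples, so that number may be smaller than the sample count; once mov_build_index has rebuilt ctts_data, ctts_count equals the sample count.
unsigned int ctts_allocated_size; // number of bytes currently allocated for ctts_data
MOVStts *ctts_data; // ctts entries (per-sample offset between dts and pts)
unsigned int stsc_count; // number of stsc entries (sample-to-chunk mapping). Note: this is not the chunk count; each entry only names the chunk it starts at
MOVStsc *stsc_data; // stsc entries (how many samples each chunk holds)
/* For reference (this struct is declared separately, outside MOVStreamContext):
typedef struct MOVStsc {
int first; // 1-based number of the first chunk this entry applies to
int count; // number of samples in each of those chunks
int id; // sample description index, usually 1
} MOVStsc;
*/
unsigned int stsc_index; // index into stsc_data
int stsc_sample;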
unsigned int stps_count;
unsigned *stps_data; ///< partial sync sample for mpeg-2 open gop
MOVElst *elst_data; // edit list data; determines the dts of the first sample
/* For reference (this struct is declared separately, outside MOVStreamContext):
typedef struct MOVElst {
int64_t duration; // segment duration: how long this edit lasts
int64_t time; // media time at which the edit starts (-1 marks an empty edit); the first dts is derived from its negative
float rate; // media rate (playback speed), usually 1.0
} MOVElst;
*/
unsigned int elst_count; // number of elst entries (usually 1)
int ctts_index; // index into ctts_data
int ctts_sample;
unsigned int sample_size; ///< may contain value calculated from stsd or value from stsz atom; 0 when sample sizes differ
unsigned int stsz_sample_size; ///< always contains sample size from stsz atom; 0 when sample sizes differ
unsigned int sample_count; // total number of samples (frames)
int *sample_sizes; // size of each sample
int keyframe_absent; // set when the stss atom is present but lists no keyframes
unsigned int keyframe_count; // number of keyframes
int *keyframes; // keyframe table: an int array of keyframe sample numbers
int time_scale; // time scale from the mdhd box (ticks per second)
int64_t time_offset; // start offset of the track; the first dts is the negative of this value
int64_t min_corrected_pts; ///< minimum Composition time shown by the edits excluding empty edits.
int current_sample; // index of the current sample
int64_t current_index; // running index of the current sample, advanced together with current_sample
MOVIndexRange* index_ranges;
MOVIndexRange* current_index_range;
unsigned int bytes_per_frame; // audio only (normally unused for AAC)
unsigned int samples_per_frame; // audio only (normally unused for AAC)
int dv_audio_container;
int pseudo_stream_id; ///< index of the stsd entry being demuxed; -1 means demux all ids
int16_t audio_cid; ///< stsd audio compression id
unsigned drefs_count;
MOVDref *drefs; // data references from the dref atom, used when media data lives outside this file
int dref_id;
int timecode_track;
int width; ///< tkhd width
int height; ///< tkhd height
int dts_shift; ///< dts shift when ctts is negative (usually 0)
uint32_t palette[256];
int has_palette;
int64_t data_size; // total size of all samples
uint32_t tmcd_flags; ///< tmcd track flags
int64_t track_end; ///< end position of the track, i.e. its total duration; used for dts generation in fragmented movie files
int start_pad; ///< amount of samples to skip due to enc-dec delay
unsigned int rap_group_count;
MOVSbgp *rap_group;
int nb_frames_for_fps; // number of frames counted for the frame-rate estimate (the total frame count)
int64_t duration_for_fps; // summed duration of those frames
/** extradata array (and size) for multiple stsd */
uint8_t **extradata;
int *extradata_size;
int last_stsd_index;
int stsd_count; // number of stsd entries
int stsd_version; // stsd version
int32_t *display_matrix; // display (transformation) matrix of the video
AVStereo3D *stereo3d;
AVSphericalMapping *spherical;
size_t spherical_size;
AVMasteringDisplayMetadata *mastering;
AVContentLightMetadata *coll;
size_t coll_size;
uint32_t format; // codec format (the fourcc of the sample entry)
int has_sidx; // If there is an sidx entry for this stream.
struct {
struct AVAESCTR* aes_ctr;
unsigned int per_sample_iv_size; // Either 0, 8, or 16.
AVEncryptionInfo *default_encrypted_sample;
MOVEncryptionIndex *encryption_index;
} cenc;
} MOVStreamContext;
The functions that parse the stsd, stts, stss, ctts, stsc, stsz and stco boxes
Key tips:
Every one of these functions starts with:
AVStream *st;
MOVStreamContext *sc;
int ret, entries;
if (c->fc->nb_streams < 1)
return 0;
st = c->fc->streams[c->fc->nb_streams - 1];
sc = st->priv_data;
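The benefit of writing it this way is extensibility. Every one of these functions takes the same parameters, (MOVContext *c, AVIOContext *pb, MOVAtom atom). As mentioned above, the priv_data of the generic AVStream is assigned to a pointer to the mov module's own MOVStreamContext, which is convenient to write and easy to extend. In the same way, the outermost AVFormatContext is stored in MOVContext (as c->fc) and travels with it into every function. Changes to the outer structs, such as a rename, therefore have little effect on the inner functions. Most of ffmpeg follows this pattern, especially around pluggable components such as the rtmp protocol, the file protocol and the mov format.
Most of these metadata atoms share the same layout: version + flags + entry count, followed by the entries themselves (stsz additionally carries a sample_size field before the count).
A word on the entry concept: as I understand it, an entry is a record type, similar in spirit to the atom or box, that stores the actual metadata. Each atom has one or more entries, and the MOVStsc and MOVStts arrays (like the sample_sizes array for stsz) are indexed by entry number.
stsd atom: reading the sample description metadata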
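The priv_data idiom is easier to see in a stripped-down form. The sketch below is not ffmpeg code: DemoStream, DemoMovStream and the demo_* functions are hypothetical stand-ins that only mimic the shape of AVStream and MOVStreamContext.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical generic stream: it knows nothing about mov. */
typedef struct DemoStream {
    int   index;
    void *priv_data;   /* owned by whichever demuxer created the stream */
} DemoStream;

/* Hypothetical format-specific state, in the role of MOVStreamContext. */
typedef struct DemoMovStream {
    unsigned sample_count;
    int      time_scale;
} DemoMovStream;

/* The generic layer only needs the size of the private block. */
static int demo_stream_init(DemoStream *st, size_t priv_size)
{
    st->priv_data = calloc(1, priv_size);
    return st->priv_data ? 0 : -1;
}

/* A format-specific parser starts by recovering its typed view,
 * the same idiom as "sc = st->priv_data" in mov.c. */
static void demo_read_stsz(DemoStream *st, unsigned entries)
{
    DemoMovStream *sc = st->priv_data;
    sc->sample_count = entries;
}

static void demo_stream_free(DemoStream *st)
{
    free(st->priv_data);
    st->priv_data = NULL;
}
```

The generic struct never mentions the mov type, so another format could attach a completely different private struct without touching DemoStream.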
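As a small illustration of that common layout, the sketch below decodes version (1 byte), flags (3 bytes) and the entry count (4 bytes) from the payload of an atom held in memory. FullBoxHeader and parse_full_box_header are names I made up for this example; mov.c itself reads the same fields straight from the AVIOContext with avio_r8, avio_rb24 and avio_rb32.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the fields shared by stts, stss, ctts, stsc, stco. */
typedef struct FullBoxHeader {
    uint8_t  version;
    uint32_t flags;     /* only the low 24 bits are used */
    uint32_t entries;
} FullBoxHeader;

/* Decodes version + flags + entry count, all big-endian.
 * Returns the number of bytes consumed, or -1 if the buffer is too short. */
static int parse_full_box_header(const uint8_t *buf, size_t size,
                                 FullBoxHeader *h)
{
    if (size < 8)
        return -1;
    h->version = buf[0];
    h->flags   = ((uint32_t)buf[1] << 16) | ((uint32_t)buf[2] << 8) |
                  (uint32_t)buf[3];
    h->entries = ((uint32_t)buf[4] << 24) | ((uint32_t)buf[5] << 16) |
                 ((uint32_t)buf[6] <<  8) |  (uint32_t)buf[7];
    return 8;
}
```

The entries follow immediately after these 8 bytes, and their size depends on the atom (8 bytes per stts entry, 12 per stsc entry, and so on).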
//stsd atom
static int mov_read_stsd(MOVContext *c, AVIOContext *pb, MOVAtom atom)
{
AVStream *st;
MOVStreamContext *sc;
int ret, entries;
// fetch the current stream and its private context (the pattern described above)
if (c->fc->nb_streams < 1)
return 0;
st = c->fc->streams[c->fc->nb_streams - 1];
sc = st->priv_data;
sc->stsd_version = avio_r8(pb); //版本号
avio_rb24(pb); /* flags */
entries = avio_rb32(pb); // number of entries (usually 1 per track, whether audio, video or subtitle)
/* Each entry contains a size (4 bytes) and format (4 bytes). */
if (entries <= 0 || entries > atom.size / 8) {
av_log(c->fc, AV_LOG_ERROR, "invalid STSD entries %d\n", entries);
return AVERROR_INVALIDDATA;
}
if (sc->extradata) {
av_log(c->fc, AV_LOG_ERROR,
"Duplicate stsd found in this track.\n");
return AVERROR_INVALIDDATA;
}
/* Prepare space for hosting multiple extradata. */
sc->extradata = av_mallocz_array(entries, sizeof(*sc->extradata));
if (!sc->extradata)
return AVERROR(ENOMEM);
sc->extradata_size = av_mallocz_array(entries, sizeof(*sc->extradata_size));
if (!sc->extradata_size) {
ret = AVERROR(ENOMEM);
goto fail;
}
// parse the stsd entries
ret = ff_mov_read_stsd_entries(c, pb, entries);
if (ret < 0)
goto fail;
/* Restore back the primary extradata. */
av_freep(&st->codecpar->extradata);
st->codecpar->extradata_size = sc->extradata_size[0];
if (sc->extradata_size[0]) {
st->codecpar->extradata = av_mallocz(sc->extradata_size[0] + AV_INPUT_BUFFER_PADDING_SIZE);
if (!st->codecpar->extradata)
return AVERROR(ENOMEM);
memcpy(st->codecpar->extradata, sc->extradata[0], sc->extradata_size[0]);
}
return mov_finalize_stsd_codec(c, pb, st, sc); // final codec fix-ups (mostly audio-related)
fail:
...
return ret;
}
// parse the stsd entries
int ff_mov_read_stsd_entries(MOVContext *c, AVIOContext *pb, int entries)
{
AVStream *st;
MOVStreamContext *sc;
int pseudo_stream_id;
av_assert0 (c->fc->nb_streams >= 1);
st = c->fc->streams[c->fc->nb_streams-1];
sc = st->priv_data;
for (pseudo_stream_id = 0;
pseudo_stream_id < entries && !pb->eof_reached;
pseudo_stream_id++) {
//Parsing Sample description table
enum AVCodecID id;
int ret, dref_id = 1;
MOVAtom a = { AV_RL32("stsd") };
int64_t start_pos = avio_tell(pb);
int64_t size = avio_rb32(pb); /* size of this entry */
uint32_t format = avio_rl32(pb); /* codec fourcc, e.g. for AAC or H.264 */
if (size >= 16) {
avio_rb32(pb); /* reserved, carries no information */
avio_rb16(pb); /* reserved, carries no information */
dref_id = avio_rb16(pb);
} else if (size <= 7) {
av_log(c->fc, AV_LOG_ERROR,
"invalid size %"PRId64" in stsd\n", size);
return AVERROR_INVALIDDATA;
}
if (mov_skip_multiple_stsd(c, pb, st->codecpar->codec_tag, format,
size - (avio_tell(pb) - start_pos))) {
sc->stsd_count++;
continue;
}
// sc->pseudo_stream_id: index of the stsd entry being used (there is usually only one entry)
sc->pseudo_stream_id = st->codecpar->codec_tag ? -1 : pseudo_stream_id;
sc->dref_id= dref_id;
sc->format = format;
// look up the codec id for this format (the id is what selects the matching decoder)
id = mov_codec_id(st, format);
av_log(c->fc, AV_LOG_TRACE,
"size=%"PRId64" 4CC=%s codec_type=%d\n", size,
av_fourcc2str(format), st->codecpar->codec_type);
// set codecpar->codec_id (codecpar->codec_type was already set earlier, from the hdlr atom)
st->codecpar->codec_id = id;
// video
if (st->codecpar->codec_type==AVMEDIA_TYPE_VIDEO) {
mov_parse_stsd_video(c, pb, st, sc);
}
// audio
else if (st->codecpar->codec_type==AVMEDIA_TYPE_AUDIO) {
mov_parse_stsd_audio(c, pb, st, sc);
if (st->codecpar->sample_rate < 0) {
av_log(c->fc, AV_LOG_ERROR, "Invalid sample rate %d\n", st->codecpar->sample_rate);
return AVERROR_INVALIDDATA;
}
}
// subtitle
else if (st->codecpar->codec_type==AVMEDIA_TYPE_SUBTITLE){
mov_parse_stsd_subtitle(c, pb, st, sc,
size - (avio_tell(pb) - start_pos));
} else {
ret = mov_parse_stsd_data(c, pb, st, sc,
size - (avio_tell(pb) - start_pos));
if (ret < 0)
return ret;
}
/* this will read extra atoms at the end (wave, alac, damr, avcC, hvcC, SMI ...) */
a.size = size - (avio_tell(pb) - start_pos);
if (a.size > 8) {
if ((ret = mov_read_default(c, pb, a)) < 0) // the entry still has data left, so keep reading its child atoms (extra data such as avcC)
return ret;
} else if (a.size > 0)
avio_skip(pb, a.size);
if (sc->extradata && st->codecpar->extradata) {
int extra_size = st->codecpar->extradata_size;
/* Move the current stream extradata to the stream context one. */
sc->extradata_size[pseudo_stream_id] = extra_size;
sc->extradata[pseudo_stream_id] = st->codecpar->extradata;
st->codecpar->extradata = NULL;
st->codecpar->extradata_size = 0;
}
sc->stsd_count++;
}
if (pb->eof_reached) { // hit end of file before the atom was fully read
av_log(c->fc, AV_LOG_WARNING, "reached eof, corrupted STSD atom\n");
return AVERROR_EOF;
}
return 0;
}
// Parse a video entry; the values read are written straight into the caller's AVStream
static void mov_parse_stsd_video(MOVContext *c, AVIOContext *pb, AVStream *st, MOVStreamContext *sc)
{
uint8_t codec_name[32] = { 0 };
int64_t stsd_start;
unsigned int len;
/* The first 16 bytes of the video sample description are already
* read in ff_mov_read_stsd_entries() */
stsd_start = avio_tell(pb) - 16;
avio_rb16(pb); /* version */
avio_rb16(pb); /* revision level */
avio_rb32(pb); /* vendor */
avio_rb32(pb); /* temporal quality */
avio_rb32(pb); /* spatial quality */
st->codecpar->width = avio_rb16(pb); /* width */
st->codecpar->height = avio_rb16(pb); /* height */
avio_rb32(pb); /* horiz resolution */
avio_rb32(pb); /* vert resolution */
avio_rb32(pb); /* data size, always 0 */
avio_rb16(pb); /* frames per samples */
len = avio_r8(pb); /* codec name, pascal string */
if (len > 31)
len = 31;
mov_read_mac_string(c, pb, len, codec_name, sizeof(codec_name));
if (len < 31)
avio_skip(pb, 31 - len);
if (codec_name[0])
av_dict_set(&st->metadata, "encoder", codec_name, 0);
/* codec_tag YV12 triggers an UV swap in rawdec.c */
if (!strncmp(codec_name, "Planar Y'CbCr 8-bit 4:2:0", 25)) {
st->codecpar->codec_tag = MKTAG('I', '4', '2', '0');
st->codecpar->width &= ~1;
st->codecpar->height &= ~1;
}
/* Flash Media Server uses tag H.263 with Sorenson Spark */
if (st->codecpar->codec_tag == MKTAG('H','2','6','3') &&
!strncmp(codec_name, "Sorenson H263", 13))
st->codecpar->codec_id = AV_CODEC_ID_FLV1;
st->codecpar->bits_per_coded_sample = avio_rb16(pb); /* depth */
avio_seek(pb, stsd_start, SEEK_SET);
// Some QuickTime codecs need a palette; H.264 does not (its depth field is normally 24)
if (ff_get_qtpalette(st->codecpar->codec_id, pb, sc->palette)) {
st->codecpar->bits_per_coded_sample &= 0x1F;
sc->has_palette = 1;
}
}
// Parse an audio entry; the values read are written straight into the caller's AVStream
static void mov_parse_stsd_audio(MOVContext *c, AVIOContext *pb, AVStream *st, MOVStreamContext *sc)
{
int bits_per_sample, flags;
uint16_t version = avio_rb16(pb);
AVDictionaryEntry *compatible_brands = av_dict_get(c->fc->metadata, "compatible_brands", NULL, AV_DICT_MATCH_CASE);
avio_rb16(pb); /* revision level */
avio_rb32(pb); /* vendor */
st->codecpar->channels = avio_rb16(pb); /* channel count */
st->codecpar->bits_per_coded_sample = avio_rb16(pb); /* sample size */
av_log(c->fc, AV_LOG_TRACE, "audio channels %d\n", st->codecpar->channels);
sc->audio_cid = avio_rb16(pb);
avio_rb16(pb); /* packet size = 0 */
st->codecpar->sample_rate = ((avio_rb32(pb) >> 16));
// read the QuickTime-specific sound description fields
if (!c->isom ||
(compatible_brands && strstr(compatible_brands->value, "qt ")) ||
(sc->stsd_version == 0 && version > 0)) {
if (version == 1) {
sc->samples_per_frame = avio_rb32(pb);
avio_rb32(pb); /* bytes per packet */
sc->bytes_per_frame = avio_rb32(pb);
avio_rb32(pb); /* bytes per sample */
} else if (version == 2) {
avio_rb32(pb); /* sizeof struct only */
st->codecpar->sample_rate = av_int2double(avio_rb64(pb));
st->codecpar->channels = avio_rb32(pb);
avio_rb32(pb); /* always 0x7F000000 */
st->codecpar->bits_per_coded_sample = avio_rb32(pb);
flags = avio_rb32(pb); /* lpcm format specific flag */
sc->bytes_per_frame = avio_rb32(pb);
sc->samples_per_frame = avio_rb32(pb);
if (st->codecpar->codec_tag == MKTAG('l','p','c','m'))
st->codecpar->codec_id =
ff_mov_get_lpcm_codec_id(st->codecpar->bits_per_coded_sample,
flags);
}
if (version == 0 || (version == 1 && sc->audio_cid != -2)) {
/* can't correctly handle variable sized packet as audio unit */
switch (st->codecpar->codec_id) {
case AV_CODEC_ID_MP2:
case AV_CODEC_ID_MP3:
st->need_parsing = AVSTREAM_PARSE_FULL;
break;
}
}
}
...
switch (st->codecpar->codec_id) {
... // codec-specific cases omitted here; they are rarely needed
default:
break;
}
bits_per_sample = av_get_bits_per_sample(st->codecpar->codec_id);
if (bits_per_sample) {
st->codecpar->bits_per_coded_sample = bits_per_sample;
sc->sample_size = (bits_per_sample >> 3) * st->codecpar->channels;
}
}
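stts atom: reading the sample dts information
Tip: an important point here: pts = dts + ctts_data[i].duration. The dts of each sample comes from the stts deltas,
and the ctts duration field is the offset added to it (mov_read_packet does this calculation and then stores the result in the AVPacket).
If there is no ctts atom, dts == pts.
typedef struct MOVStts {
unsigned int count; // number of consecutive samples sharing this duration
int duration; // dts delta between one sample and the next
} MOVStts;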
static int mov_read_stts(MOVContext *c, AVIOContext *pb, MOVAtom atom)
{
AVStream *st;
MOVStreamContext *sc;
unsigned int i, entries, alloc_size = 0;
int64_t duration = 0; // total duration of the track
int64_t total_sample_count = 0; // total number of samples (frames)
if (c->fc->nb_streams < 1)
return 0;
st = c->fc->streams[c->fc->nb_streams-1];
sc = st->priv_data;
avio_r8(pb); /* version */
avio_rb24(pb); /* flags */
entries = avio_rb32(pb); // number of entries (often 1 when every sample has the same duration)
av_log(c->fc, AV_LOG_TRACE, "track[%u].stts.entries = %u\n",
c->fc->nb_streams-1, entries);
if (sc->stts_data)
av_log(c->fc, AV_LOG_WARNING, "Duplicated STTS atom\n");
av_freep(&sc->stts_data);
sc->stts_count = 0; // reset the entry count before reading
if (entries >= INT_MAX / sizeof(*sc->stts_data))
return AVERROR(ENOMEM);
for (i = 0; i < entries && !pb->eof_reached; i++) {
int sample_duration;
unsigned int sample_count;
unsigned int min_entries = FFMIN(FFMAX(i + 1, 1024 * 1024), entries);
// grow the entry array as needed
MOVStts *stts_data = av_fast_realloc(sc->stts_data, &alloc_size,
min_entries * sizeof(*sc->stts_data));
if (!stts_data) {
av_freep(&sc->stts_data);
sc->stts_count = 0;
return AVERROR(ENOMEM);
}
sc->stts_count = min_entries;
sc->stts_data = stts_data;
sample_count = avio_rb32(pb);
sample_duration = avio_rb32(pb);
// i is the entry index
sc->stts_data[i].count= sample_count; // number of consecutive samples sharing this duration
sc->stts_data[i].duration= sample_duration; // dts delta between one sample and the next
av_log(c->fc, AV_LOG_TRACE, "sample_count=%d, sample_duration=%d\n",
sample_count, sample_duration);
duration+=(int64_t)sample_duration*(uint64_t)sample_count; // accumulate the total duration
total_sample_count+=sample_count; // accumulate the total sample count
}
sc->stts_count = i; // number of stts entries actually read
if (duration > 0 &&
duration <= INT64_MAX - sc->duration_for_fps &&
total_sample_count <= INT_MAX - sc->nb_frames_for_fps) {
sc->duration_for_fps += duration;
sc->nb_frames_for_fps += total_sample_count;
}
if (pb->eof_reached) {
av_log(c->fc, AV_LOG_WARNING, "reached eof, corrupted STTS atom\n");
return AVERROR_EOF;
}
st->nb_frames= total_sample_count;
if (duration)
st->duration= FFMIN(st->duration, duration); // total duration
sc->track_end = duration; // end position of the track, i.e. its total duration
return 0;
}
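To show how the run-length stts table turns into per-sample timestamps, here is a minimal sketch that walks the entries and returns the dts of one sample. SttsEntry and dts_of_sample are hypothetical names for this example; ffmpeg itself performs the accumulation inside mov_build_index rather than through a helper like this, and first_dts stands for whatever start value the edit list yields.

```c
#include <assert.h>
#include <stdint.h>

/* Same shape as MOVStts. */
typedef struct SttsEntry {
    unsigned int count;     /* consecutive samples sharing the duration */
    int          duration;  /* dts delta between neighbouring samples   */
} SttsEntry;

/* Returns the dts of 0-based sample n, counting up from first_dts.
 * Returns INT64_MIN when n lies beyond the samples the table describes. */
static int64_t dts_of_sample(const SttsEntry *stts, unsigned stts_count,
                             int64_t first_dts, uint64_t n)
{
    int64_t dts = first_dts;
    for (unsigned i = 0; i < stts_count; i++) {
        if (n < stts[i].count)
            return dts + (int64_t)n * stts[i].duration;
        /* skip the whole run and continue with the next entry */
        dts += (int64_t)stts[i].count * stts[i].duration;
        n   -= stts[i].count;
    }
    return INT64_MIN;
}
```

With the two entries {3, 1000} and {2, 500}, the five samples get dts 0, 1000, 2000, 3000 and 3500 when first_dts is 0.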
stss atom: reading the keyframe (sync sample) table
static int mov_read_stss(MOVContext *c, AVIOContext *pb, MOVAtom atom)
{
AVStream *st;
MOVStreamContext *sc;
unsigned int i, entries;
if (c->fc->nb_streams < 1)
return 0;
st = c->fc->streams[c->fc->nb_streams-1];
sc = st->priv_data;
avio_r8(pb); /* version */
avio_rb24(pb); /* flags */
entries = avio_rb32(pb); // number of entries, i.e. the number of keyframes
av_log(c->fc, AV_LOG_TRACE, "keyframe_count = %u\n", entries);
if (!entries) {
sc->keyframe_absent = 1;
if (!st->need_parsing && st->codecpar->codec_type == AVMEDIA_TYPE_VIDEO)
st->need_parsing = AVSTREAM_PARSE_HEADERS;
return 0;
}
if (sc->keyframes)
av_log(c->fc, AV_LOG_WARNING, "Duplicated STSS atom\n");
if (entries >= UINT_MAX / sizeof(int))
return AVERROR_INVALIDDATA;
av_freep(&sc->keyframes);
sc->keyframe_count = 0; // reset the keyframe count
sc->keyframes = av_malloc_array(entries, sizeof(*sc->keyframes)); // allocate the keyframe table, stored as an int array
if (!sc->keyframes)
return AVERROR(ENOMEM);
for (i = 0; i < entries && !pb->eof_reached; i++) {
sc->keyframes[i] = avio_rb32(pb); // each element is the sample number of one keyframe
}
sc->keyframe_count = i; // number of keyframes actually read
if (pb->eof_reached) {
av_log(c->fc, AV_LOG_WARNING, "reached eof, corrupted STSS atom\n");
return AVERROR_EOF;
}
return 0;
}
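Once the keyframes array is filled, deciding whether a given sample is a keyframe is a lookup in that table. The sketch below uses a binary search and assumes, as the MP4 specification requires, that the stss sample numbers are 1-based and ascending. is_keyframe is a hypothetical helper for illustration, not the way mov_build_index walks the table.

```c
#include <assert.h>

/* Returns 1 if the 0-based sample index is listed in the sync-sample
 * table, whose entries are 1-based sample numbers in ascending order. */
static int is_keyframe(const int *keyframes, unsigned keyframe_count,
                       unsigned sample)
{
    unsigned lo = 0, hi = keyframe_count;
    int wanted = (int)sample + 1;   /* stss numbers start at 1 */

    while (lo < hi) {
        unsigned mid = lo + (hi - lo) / 2;
        if (keyframes[mid] == wanted)
            return 1;
        if (keyframes[mid] < wanted)
            lo = mid + 1;
        else
            hi = mid;
    }
    return 0;
}
```

Note that this only applies when an stss atom exists; according to the specification, a track without one treats every sample as a sync sample.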
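ctts atom: reading the per-sample offset between dts and pts
Tip: as noted for stts, pts = dts + ctts_data[i].duration. The dts of each sample comes from the stts deltas,
and the ctts duration field is the offset added to it (mov_read_packet does this calculation and then stores the result in the AVPacket).
If there is no ctts atom, dts == pts.
typedef struct MOVStts { // the same struct is reused for ctts entries
unsigned int count; // number of consecutive samples sharing this offset
int duration; // composition offset of those samples, i.e. pts - dts
} MOVStts;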
static int mov_read_ctts(MOVContext *c, AVIOContext *pb, MOVAtom atom)
{
AVStream *st;
MOVStreamContext *sc;
unsigned int i, entries, ctts_count = 0;
if (c->fc->nb_streams < 1)
return 0;
st = c->fc->streams[c->fc->nb_streams-1];
sc = st->priv_data;
avio_r8(pb); /* version */
avio_rb24(pb); /* flags */
entries = avio_rb32(pb); // number of entries
av_log(c->fc, AV_LOG_TRACE, "track[%u].ctts.entries = %u\n", c->fc->nb_streams - 1, entries);
if (!entries)
return 0;
if (entries >= UINT_MAX / sizeof(*sc->ctts_data))
return AVERROR_INVALIDDATA;
av_freep(&sc->ctts_data);
// allocate the entry array
sc->ctts_data = av_fast_realloc(NULL, &sc->ctts_allocated_size, entries * sizeof(*sc->ctts_data));
if (!sc->ctts_data)
return AVERROR(ENOMEM);
for (i = 0; i < entries && !pb->eof_reached; i++) {
int count = avio_rb32(pb); // number of consecutive samples sharing this offset
int duration = avio_rb32(pb); // composition offset (pts - dts) of those samples
if (count <= 0) {
av_log(c->fc, AV_LOG_TRACE,
"ignoring CTTS entry with count=%d duration=%d\n",
count, duration);
continue;
}
// append the entry to ctts_data
add_ctts_entry(&sc->ctts_data, &ctts_count, &sc->ctts_allocated_size,
count, duration);
av_log(c->fc, AV_LOG_TRACE, "count=%d, duration=%d\n",
count, duration);
if (FFNABS(duration) < -(1<<28) && i+2<entries) {
av_log(c->fc, AV_LOG_WARNING, "CTTS invalid\n");
av_freep(&sc->ctts_data);
sc->ctts_count = 0;
return 0;
}
if (i+2<entries)
// sc->dts_shift usually stays 0
mov_update_dts_shift(sc, duration, c->fc);
}
sc->ctts_count = ctts_count; // number of ctts entries kept
if (pb->eof_reached) {
av_log(c->fc, AV_LOG_WARNING, "reached eof, corrupted CTTS atom\n");
return AVERROR_EOF;
}
av_log(c->fc, AV_LOG_TRACE, "dts shift %d\n", sc->dts_shift);
return 0;
}
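The pts rule stated in the tip can be sketched in a few lines: find the ctts run that covers a sample and add its offset to the sample's dts. CttsEntry, ctts_offset_of_sample and pts_of_sample are hypothetical names for this example, and the dts is assumed to have been computed already from stts.

```c
#include <assert.h>
#include <stdint.h>

/* Same shape as MOVStts, which mov.c reuses for ctts entries. */
typedef struct CttsEntry {
    unsigned int count;     /* consecutive samples sharing the offset */
    int          duration;  /* composition offset: pts - dts          */
} CttsEntry;

/* Returns the composition offset of 0-based sample n, or 0 when the
 * table is absent or does not reach that sample (then pts == dts). */
static int ctts_offset_of_sample(const CttsEntry *ctts, unsigned ctts_count,
                                 uint64_t n)
{
    for (unsigned i = 0; i < ctts_count; i++) {
        if (n < ctts[i].count)
            return ctts[i].duration;
        n -= ctts[i].count;
    }
    return 0;
}

/* pts = dts + composition offset of the sample. */
static int64_t pts_of_sample(int64_t dts, const CttsEntry *ctts,
                             unsigned ctts_count, uint64_t n)
{
    return dts + ctts_offset_of_sample(ctts, ctts_count, n);
}
```

Passing a count of 0 models a track without a ctts atom, in which case the function returns the dts unchanged.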
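stsc atom: reading the sample-to-chunk mapping
Tip: mov_build_index combines the size of each sample (stsz), the number of samples in each chunk (stsc) and the absolute file offset of each chunk (stco) to compute the absolute file offset of every sample.
typedef struct MOVStsc {
int first; // 1-based number of the first chunk this entry applies to
int count; // number of samples in each of those chunks
int id; // sample description index, usually 1
} MOVStsc;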
static int mov_read_stsc(MOVContext *c, AVIOContext *pb, MOVAtom atom)
{
AVStream *st;
MOVStreamContext *sc;
unsigned int i, entries;
if (c->fc->nb_streams < 1)
return 0;
st = c->fc->streams[c->fc->nb_streams-1];
sc = st->priv_data;
avio_r8(pb); /* version */
avio_rb24(pb); /* flags */
entries = avio_rb32(pb); // number of entries
if ((uint64_t)entries * 12 + 4 > atom.size)
return AVERROR_INVALIDDATA;
av_log(c->fc, AV_LOG_TRACE, "track[%u].stsc.entries = %u\n", c->fc->nb_streams - 1, entries);
if (!entries)
return 0;
if (sc->stsc_data)
av_log(c->fc, AV_LOG_WARNING, "Duplicated STSC atom\n");
av_free(sc->stsc_data);
sc->stsc_count = 0;
// allocate the entry array
sc->stsc_data = av_malloc_array(entries, sizeof(*sc->stsc_data));
if (!sc->stsc_data)
return AVERROR(ENOMEM);
// fill sc->stsc_data
for (i = 0; i < entries && !pb->eof_reached; i++) {
sc->stsc_data[i].first = avio_rb32(pb);
sc->stsc_data[i].count = avio_rb32(pb);
sc->stsc_data[i].id = avio_rb32(pb);
}
sc->stsc_count = i; // number of entries actually read
for (i = sc->stsc_count - 1; i < UINT_MAX; i--) {
int64_t first_min = i + 1;
if ((i+1 < sc->stsc_count && sc->stsc_data[i].first >= sc->stsc_data[i+1].first) ||
(i > 0 && sc->stsc_data[i].first <= sc->stsc_data[i-1].first) ||
sc->stsc_data[i].first < first_min ||
sc->stsc_data[i].count < 1 ||
sc->stsc_data[i].id < 1) {
av_log(c->fc, AV_LOG_WARNING, "STSC entry %d is invalid (first=%d count=%d id=%d)\n", i, sc->stsc_data[i].first, sc->stsc_data[i].count, sc->stsc_data[i].id);
if (i+1 >= sc->stsc_count) {
if (sc->stsc_data[i].count == 0 && i > 0) {
sc->stsc_count --;
continue;
}
sc->stsc_data[i].first = FFMAX(sc->stsc_data[i].first, first_min);
if (i > 0 && sc->stsc_data[i].first <= sc->stsc_data[i-1].first)
sc->stsc_data[i].first = FFMIN(sc->stsc_data[i-1].first + 1LL, INT_MAX);
sc->stsc_data[i].count = FFMAX(sc->stsc_data[i].count, 1);
sc->stsc_data[i].id = FFMAX(sc->stsc_data[i].id, 1);
continue;
}
av_assert0(sc->stsc_data[i+1].first >= 2);
// We replace this entry by the next valid
sc->stsc_data[i].first = sc->stsc_data[i+1].first - 1;
sc->stsc_data[i].count = sc->stsc_data[i+1].count;
sc->stsc_data[i].id = sc->stsc_data[i+1].id;
}
}
if (pb->eof_reached) {
av_log(c->fc, AV_LOG_WARNING, "reached eof, corrupted STSC atom\n");
return AVERROR_EOF;
}
return 0;
}
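Because an stsc entry applies to every chunk from its first field up to the chunk before the next entry's first, looking up the sample count of a chunk means finding the last entry that starts at or before it. The sketch below shows that rule; StscEntry and samples_in_chunk are hypothetical names for this example.

```c
#include <assert.h>

/* Same shape as MOVStsc. */
typedef struct StscEntry {
    int first;  /* 1-based number of the first chunk using this entry */
    int count;  /* samples per chunk                                   */
    int id;     /* sample description index                            */
} StscEntry;

/* Returns the number of samples in the 1-based chunk number, or 0 when
 * the table is empty or the chunk precedes the first entry. */
static int samples_in_chunk(const StscEntry *stsc, unsigned stsc_count,
                            int chunk)
{
    int count = 0;
    /* entries are ordered by first; keep the last one that still applies */
    for (unsigned i = 0; i < stsc_count && stsc[i].first <= chunk; i++)
        count = stsc[i].count;
    return count;
}
```

With the entries {1, 3, 1} and {5, 2, 1}, chunks 1 to 4 hold 3 samples each and every chunk from 5 onward holds 2, which is why stsc_count is usually much smaller than the chunk count.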
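stsz atom: reading the size of each sample
Tip: mov_build_index combines the size of each sample (stsz), the number of samples in each chunk (stsc) and the absolute file offset of each chunk (stco) to compute the absolute file offset of every sample.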
static int mov_read_stsz(MOVContext *c, AVIOContext *pb, MOVAtom atom)
{
AVStream *st;
MOVStreamContext *sc;
unsigned int i, entries, sample_size, field_size, num_bytes;
GetBitContext gb;
unsigned char* buf;
int ret;
if (c->fc->nb_streams < 1)
return 0;
st = c->fc->streams[c->fc->nb_streams-1];
sc = st->priv_data;
avio_r8(pb); /* version */
avio_rb24(pb); /* flags */
if (atom.type == MKTAG('s','t','s','z')) {
sample_size = avio_rb32(pb); // uniform sample size; 0 means each sample has its own size in the table that follows
if (!sc->sample_size) /* do not overwrite value computed in stsd */
sc->sample_size = sample_size;
sc->stsz_sample_size = sample_size;
field_size = 32;
} else {
sample_size = 0;
avio_rb24(pb); /* reserved */
field_size = avio_r8(pb);
}
entries = avio_rb32(pb); // number of entries, i.e. the total sample count
av_log(c->fc, AV_LOG_TRACE, "sample_size = %u sample_count = %u\n", sc->sample_size, entries);
sc->sample_count = entries; // number of entries, i.e. the total sample count
if (sample_size)
return 0;
if (field_size != 4 && field_size != 8 && field_size != 16 && field_size != 32) {
av_log(c->fc, AV_LOG_ERROR, "Invalid sample field size %u\n", field_size);
return AVERROR_INVALIDDATA;
}
if (!entries)
return 0;
if (entries >= (UINT_MAX - 4) / field_size)
return AVERROR_INVALIDDATA;
if (sc->sample_sizes)
av_log(c->fc, AV_LOG_WARNING, "Duplicated STSZ atom\n");
av_free(sc->sample_sizes);
sc->sample_count = 0;
// allocate the size table
sc->sample_sizes = av_malloc_array(entries, sizeof(*sc->sample_sizes));
if (!sc->sample_sizes)
return AVERROR(ENOMEM);
num_bytes = (entries*field_size+4)>>3;
buf = av_malloc(num_bytes+AV_INPUT_BUFFER_PADDING_SIZE);
if (!buf) {
av_freep(&sc->sample_sizes);
return AVERROR(ENOMEM);
}
ret = ffio_read_size(pb, buf, num_bytes);
if (ret < 0) {
av_freep(&sc->sample_sizes);
av_free(buf);
av_log(c->fc, AV_LOG_WARNING, "STSZ atom truncated\n");
return 0;
}
init_get_bits(&gb, buf, 8*num_bytes);
// fill sc->sample_sizes, an int array holding the size of each sample
for (i = 0; i < entries && !pb->eof_reached; i++) {
sc->sample_sizes[i] = get_bits_long(&gb, field_size);
if (sc->sample_sizes[i] < 0) {
av_free(buf);
av_log(c->fc, AV_LOG_ERROR, "Invalid sample size %d\n", sc->sample_sizes[i]);
return AVERROR_INVALIDDATA;
}
sc->data_size += sc->sample_sizes[i]; // running total of all sample sizes
}
sc->sample_count = i;
av_free(buf);
if (pb->eof_reached) {
av_log(c->fc, AV_LOG_WARNING, "reached eof, corrupted STSZ atom\n");
return AVERROR_EOF;
}
return 0;
}
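The field_size handling above exists because the compact variant of this atom may store each size in 4, 8, 16 or 32 bits. As a rough sketch of what the bit reader does, the function below unpacks such fields from a byte buffer, most significant bits first. unpack_sample_sizes is a hypothetical helper written for this example; mov.c uses init_get_bits and get_bits_long instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Unpacks count big-endian fields of field_size bits (4, 8, 16 or 32).
 * Returns 0 on success, -1 on an unsupported field size or short buffer. */
static int unpack_sample_sizes(const uint8_t *buf, size_t buf_size,
                               unsigned field_size, unsigned count,
                               uint32_t *out)
{
    if (field_size != 4 && field_size != 8 &&
        field_size != 16 && field_size != 32)
        return -1;
    if (((uint64_t)count * field_size + 7) / 8 > buf_size)
        return -1;

    for (unsigned i = 0; i < count; i++) {
        if (field_size == 4) {
            /* two sizes per byte, high nibble first */
            uint8_t byte = buf[i / 2];
            out[i] = (i % 2 == 0) ? (uint32_t)(byte >> 4)
                                  : (uint32_t)(byte & 0x0F);
        } else {
            unsigned bytes = field_size / 8;
            size_t   pos   = (size_t)i * bytes;
            uint32_t v     = 0;
            for (unsigned b = 0; b < bytes; b++)
                v = (v << 8) | buf[pos + b];
            out[i] = v;
        }
    }
    return 0;
}
```

For the plain stsz atom the field size is always 32, so each size is simply one big-endian 32-bit integer.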
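stco atom: reading the absolute file offset of each chunk (so that a sample's position can be found without relying on other parameters)
Tip: mov_build_index combines the size of each sample (stsz), the number of samples in each chunk (stsc) and the absolute file offset of each chunk (stco) to compute the absolute file offset of every sample.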
static int mov_read_stco(MOVContext *c, AVIOContext *pb, MOVAtom atom)
{
AVStream *st;
MOVStreamContext *sc;
unsigned int i, entries;
if (c->trak_index < 0) {
av_log(c->fc, AV_LOG_WARNING, "STCO outside TRAK\n");
return 0;
}
if (c->fc->nb_streams < 1)
return 0;
st = c->fc->streams[c->fc->nb_streams-1];
sc = st->priv_data;
avio_r8(pb); /* version */
avio_rb24(pb); /* flags */
entries = avio_rb32(pb); // number of entries, i.e. the total chunk count
if (!entries)
return 0;
if (sc->chunk_offsets)
av_log(c->fc, AV_LOG_WARNING, "Duplicated STCO atom\n");
av_free(sc->chunk_offsets);
sc->chunk_count = 0;
// allocate the offset table
sc->chunk_offsets = av_malloc_array(entries, sizeof(*sc->chunk_offsets));
if (!sc->chunk_offsets)
return AVERROR(ENOMEM);
sc->chunk_count = entries;
// fill chunk_offsets (an int64_t array): 32-bit values for stco, 64-bit values for co64
if (atom.type == MKTAG('s','t','c','o'))
for (i = 0; i < entries && !pb->eof_reached; i++)
sc->chunk_offsets[i] = avio_rb32(pb);
else if (atom.type == MKTAG('c','o','6','4'))
for (i = 0; i < entries && !pb->eof_reached; i++)
sc->chunk_offsets[i] = avio_rb64(pb);
else
return AVERROR_INVALIDDATA;
sc->chunk_count = i; // number of entries actually read, i.e. the total chunk count
if (pb->eof_reached) {
av_log(c->fc, AV_LOG_WARNING, "reached eof, corrupted STCO atom\n");
return AVERROR_EOF;
}
return 0;
}
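Putting stco, stsc and stsz together, the core of what the tip describes can be sketched as follows: walk the chunks in order, start at each chunk's offset, and advance by each sample's size. This is a heavily simplified illustration of the idea, not the real mov_build_index; StscRun and build_sample_offsets are hypothetical names, and the sketch assumes per-sample sizes are available.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subset of MOVStsc: which chunk a run starts at and
 * how many samples each chunk of the run holds. */
typedef struct StscRun {
    int first;  /* 1-based number of the first chunk of the run */
    int count;  /* samples per chunk                             */
} StscRun;

/* Writes the absolute file offset of every sample into offsets[].
 * Returns the number of samples that could be placed. */
static unsigned build_sample_offsets(const int64_t *chunk_offsets,
                                     unsigned chunk_count,
                                     const StscRun *stsc, unsigned stsc_count,
                                     const int *sample_sizes,
                                     unsigned sample_count,
                                     int64_t *offsets)
{
    unsigned sample = 0, run = 0;

    if (!stsc_count)
        return 0;
    for (unsigned chunk = 0; chunk < chunk_count && sample < sample_count; chunk++) {
        int64_t pos = chunk_offsets[chunk];
        /* move to the stsc run that covers this chunk (stsc is 1-based) */
        while (run + 1 < stsc_count &&
               (unsigned)stsc[run + 1].first <= chunk + 1)
            run++;
        /* samples inside a chunk are stored back to back */
        for (int j = 0; j < stsc[run].count && sample < sample_count; j++) {
            offsets[sample] = pos;
            pos += sample_sizes[sample];
            sample++;
        }
    }
    return sample;
}
```

For chunks at offsets 100 and 1000 with two samples each and sizes 10, 20, 30 and 40, the samples land at 100, 110, 1000 and 1030.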
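elst atom: reading the dts information of the first sample
Tip: in mov_build_index, the negative of the time field of MOVElst is used as the first dts value, which is why the first sample's dts is often negative. Each later dts is then obtained by adding the stts duration deltas, and each pts by adding the ctts duration offset on top of that.
typedef struct MOVElst {
int64_t duration; // segment duration: how long this edit lasts
int64_t time; // media time at which the edit starts (-1 marks an empty edit); the first dts is derived from its negative
float rate; // media rate (playback speed), usually 1.0
} MOVElst;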
static int mov_read_elst(MOVContext *c, AVIOContext *pb, MOVAtom atom)
{
MOVStreamContext *sc;
int i, edit_count, version;
int64_t elst_entry_size;
if (c->fc->nb_streams < 1 || c->ignore_editlist)
return 0;
sc = c->fc->streams[c->fc->nb_streams-1]->priv_data;
version = avio_r8(pb); /* version */
avio_rb24(pb); /* flags */
edit_count = avio_rb32(pb); /* number of entries, usually 1 */
atom.size -= 8;
elst_entry_size = version == 1 ? 20 : 12;
if (atom.size != edit_count * elst_entry_size) {
if (c->fc->strict_std_compliance >= FF_COMPLIANCE_STRICT) {
av_log(c->fc, AV_LOG_ERROR, "Invalid edit list entry_count: %d for elst atom of size: %"PRId64" bytes.\n",
edit_count, atom.size + 8);
return AVERROR_INVALIDDATA;
} else {
edit_count = atom.size / elst_entry_size;
if (edit_count * elst_entry_size != atom.size) {
av_log(c->fc, AV_LOG_WARNING, "ELST atom of %"PRId64" bytes, bigger than %d entries.\n", atom.size, edit_count);
}
}
}
if (!edit_count)
return 0;
if (sc->elst_data)
av_log(c->fc, AV_LOG_WARNING, "Duplicated ELST atom\n");
av_free(sc->elst_data);
sc->elst_count = 0;
// allocate the entry array
sc->elst_data = av_malloc_array(edit_count, sizeof(*sc->elst_data));
if (!sc->elst_data)
return AVERROR(ENOMEM);
// fill elst_data
for (i = 0; i < edit_count && atom.size > 0 && !pb->eof_reached; i++) {
MOVElst *e = &sc->elst_data[i];
if (version == 1) {
e->duration = avio_rb64(pb);
e->time = avio_rb64(pb);
atom.size -= 16;
} else {
e->duration = avio_rb32(pb); /* segment duration */
e->time = (int32_t)avio_rb32(pb); /* media time */
atom.size -= 8;
}
e->rate = avio_rb32(pb) / 65536.0;
atom.size -= 4;
av_log(c->fc, AV_LOG_TRACE, "duration=%"PRId64" time=%"PRId64" rate=%f\n",
e->duration, e->time, e->rate);
if (e->time < 0 && e->time != -1 &&
c->fc->strict_std_compliance >= FF_COMPLIANCE_STRICT) {
av_log(c->fc, AV_LOG_ERROR, "Track %d, edit %d: Invalid edit list media time=%"PRId64"\n",
c->fc->nb_streams-1, i, e->time);
return AVERROR_INVALIDDATA;
}
}
sc->elst_count = i;
return 0;
}
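The version 0 branch of the loop above reads 12 bytes per entry: a 32-bit duration, a signed 32-bit media time and a 16.16 fixed-point rate. The sketch below decodes one such entry from memory to show the sign handling and the rate conversion. EditEntry, be32 and parse_elst_v0_entry are hypothetical names for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Same fields as MOVElst. */
typedef struct EditEntry {
    int64_t duration;
    int64_t time;
    float   rate;
} EditEntry;

/* Hypothetical helper: big-endian 32-bit read from memory. */
static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] <<  8) |  (uint32_t)p[3];
}

/* Decodes one 12-byte version-0 elst entry. */
static void parse_elst_v0_entry(const uint8_t *p, EditEntry *e)
{
    e->duration = be32(p);               /* segment duration, unsigned   */
    e->time     = (int32_t)be32(p + 4);  /* 0xFFFFFFFF becomes -1        */
    e->rate     = be32(p + 8) / 65536.0f;/* 16.16 fixed point to float   */
}
```

The cast to int32_t mirrors the one in mov_read_elst: without it, an empty edit (media time 0xFFFFFFFF) would be stored as a huge positive number instead of -1.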
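That covers the main atom parsers. With this metadata, mov_build_index fills in the AVIndexEntry array, and mov_read_packet then uses each sample's absolute position and size to read the audio or video data from the file into an AVPacket, which is finally handed to the decoder.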