How Samples Are Read from an MP4 File
(2013-01-01 15:33:45)
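The previous two posts described the box structure of an MP4 file, what each box means, and what file information it carries. They also described how boxes such as stts, stco, stsz, stsc, and ctts build the tables that map a sampleIndex to each sample's position, size, and time. Through these tables all the samples of an MP4 file are tied together, so we can read the content of any particular sample or any particular moment in the file.
This post builds on those. Once an MP4 file has been parsed, how are these tables used to find the offset and size of the sample we need, so that it can be read and sent to the decoder for playback? We use MPEG4Extractor.cpp from Android's libstagefright library as the example.
In MPEG4Extractor.cpp, reading a specific sample from the file is done in the function read. Real scenarios vary: during normal playback, samples are read front to back in sampleIndex order, but the user may also drag the progress bar, which is a seek that has to jump straight to a particular point in time. Whatever the scenario, the flow ends up roughly the same: use the sampleIndex to obtain the sample's information (offset, size, cts, isSyncFrame). So we take one path through read and analyse it, to see how everything we need is found in an MP4 file.
The key code, excerpted, is as follows: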
status_t MPEG4Source::read(
MediaBuffer **out, const ReadOptions *options) {
Mutex::Autolock autoLock(mLock);
*out = NULL;
int64_t targetSampleTimeUs = -1;
off64_t offset;
size_t size;
uint32_t cts;
bool isSyncSample;
status_t err = mSampleTable->getMetaDataForSample(
mCurrentSampleIndex, &offset, &size, &cts, &isSyncSample);
err = mGroup->acquire_buffer(&mBuffer);
ssize_t num_bytes_read = mDataSource->readAt(offset, (uint8_t *)mBuffer->data(), size);
mBuffer->set_range(0, size);
mBuffer->meta_data()->clear();
mBuffer->meta_data()->setInt64(kKeyTime, ((int64_t)cts * 1000000) / mTimescale);
if (targetSampleTimeUs >= 0) {
mBuffer->meta_data()->setInt64(kKeyTargetTime, targetSampleTimeUs);
}
if (isSyncSample) {
mBuffer->meta_data()->setInt32(kKeyIsSyncFrame, 1);
}
++mCurrentSampleIndex;
*out = mBuffer;
mBuffer = NULL;
return OK;
}
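The key function is getMetaDataForSample. Given a sampleIndex, it returns the sample's offset, size, composition time, and sync-frame flag. With the offset and size the sample's data is fully located, so readAt can read it directly. After the read, mCurrentSampleIndex is incremented so that it holds the index of the next sample to read.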
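One detail in the read excerpt is worth isolating: cts is expressed in ticks of the track's timescale, while kKeyTime is stored in microseconds. Below is a minimal sketch of that conversion. The helper name toMicroseconds is hypothetical and is not part of libstagefright; it only mirrors the expression ((int64_t)cts * 1000000) / mTimescale.

```cpp
#include <cstdint>

// Hypothetical helper (not a libstagefright API): converts a timestamp given
// in timescale ticks into microseconds. Widening to int64_t before the
// multiplication avoids 32-bit overflow. Assumes timescale is non-zero.
int64_t toMicroseconds(uint32_t cts, uint32_t timescale) {
    return (static_cast<int64_t>(cts) * 1000000) / timescale;
}
```

For example, with a timescale of 90000, a cts of 90000 ticks corresponds to exactly one second. The division truncates, so results are rounded down to a whole microsecond.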
To see how these values are derived from sampleIndex, we follow getMetaDataForSample:
status_t SampleTable::getMetaDataForSample(
uint32_t sampleIndex,
off64_t *offset,
size_t *size,
uint32_t *compositionTime,
bool *isSyncSample) {
Mutex::Autolock autoLock(mLock);
status_t err;
if ((err = mSampleIterator->seekTo(sampleIndex)) != OK) {
return err;
}
if (offset) {
*offset = mSampleIterator->getSampleOffset();
}
if (size) {
*size = mSampleIterator->getSampleSize();
}
if (compositionTime) {
*compositionTime = mSampleIterator->getSampleTime();
}
if (isSyncSample) {
*isSyncSample = false;
if (mSyncSampleOffset < 0) {
// Every sample is a sync sample.
*isSyncSample = true;
} else {
size_t i = (mLastSyncSampleIndex < mNumSyncSamples)
&& (mSyncSamples[mLastSyncSampleIndex] <= sampleIndex)
? mLastSyncSampleIndex : 0;
while (i < mNumSyncSamples && mSyncSamples[i] < sampleIndex) {
++i;
}
if (i < mNumSyncSamples && mSyncSamples[i] == sampleIndex) {
*isSyncSample = true;
}
mLastSyncSampleIndex = i;
}
}
return OK;
}
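The last block of the function, the isSyncSample branch, queries the sync-sample table to decide whether the current sample is a key frame. The test mSyncSampleOffset < 0 checks whether the MP4 file provides an stss sync-sample table. If it does not, every frame is treated as a key frame. If it does, the else branch performs the lookup. From the earlier post on parsing the sync-sample box, we know that SampleTable::setSyncSampleParams records three things: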
mSyncSampleOffset = data_offset; // offset of the start of the stss box's data section
mNumSyncSamples = U32_AT(&header[4]); // bytes 5 to 8 hold the number of sync samples; each sync sample number takes four bytes
mSyncSamples = new uint32_t[mNumSyncSamples];
mDataSource->readAt(mSyncSampleOffset + 8, mSyncSamples, size);
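The array mSyncSamples records the sample number (the sampleIndex) of every key frame. The last two variables are what we use here to decide whether the current frame is a key frame. The algorithm is as follows:
mLastSyncSampleIndex is a cursor into mSyncSamples that remembers where the previous lookup stopped. The scan starts from that cursor (or from 0 if the cursor no longer sits at or before sampleIndex) and advances until it reaches an entry that is not smaller than sampleIndex. If that entry equals sampleIndex, the current frame is a key frame; if the entry is larger, or the array is exhausted first, it is not.
Throughout, i must of course stay below the total number of sync samples.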
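The cursor-based lookup just described can be sketched in isolation. This is a simplified stand-in under stated assumptions, not the libstagefright class: the struct SyncLookup and its member names are hypothetical, and the sync-sample list is assumed to be sorted in ascending order.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the sync-sample state kept by SampleTable.
struct SyncLookup {
    std::vector<uint32_t> syncSamples;  // ascending sample indices of key frames
    size_t lastIndex = 0;               // cursor left behind by the previous query

    bool isSync(uint32_t sampleIndex) {
        // Resume from the cursor only if it still points at or before the
        // target; otherwise (a backward seek) restart from the beginning.
        size_t i = (lastIndex < syncSamples.size() &&
                    syncSamples[lastIndex] <= sampleIndex) ? lastIndex : 0;
        // Skip every entry smaller than the target.
        while (i < syncSamples.size() && syncSamples[i] < sampleIndex) {
            ++i;
        }
        lastIndex = i;
        // A key frame only if the scan stopped exactly on the target.
        return i < syncSamples.size() && syncSamples[i] == sampleIndex;
    }
};
```

During sequential playback the cursor makes each query cheap, because the scan usually advances by at most one entry. A binary search would also work on a sorted list; the linear scan with a cursor is simply the approach the quoted code takes.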
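Now for the seekTo call and the three getters used in getMetaDataForSample. First, the implementation of the three getters: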
off64_t getSampleOffset() const { return mCurrentSampleOffset; }
size_t getSampleSize() const { return mCurrentSampleSize; }
uint32_t getSampleTime() const { return mCurrentSampleTime; }
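These three functions are trivial: they are declared and defined in the header and just return three member variables, with no logic at all. The variables must therefore be set somewhere else, namely in the seekTo call above. That function is the crucial one: given a sampleIndex, it sets each of these variables. Here is the implementation of seekTo: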
status_t SampleIterator::seekTo(uint32_t sampleIndex) {
ALOGV("seekTo(%d)", sampleIndex);
if (sampleIndex >= mTable->mNumSampleSizes) {
return ERROR_END_OF_STREAM;
}
// Check that every table has been initialized.
if (mTable->mSampleToChunkOffset < 0
|| mTable->mChunkOffsetOffset < 0
|| mTable->mSampleSizeOffset < 0
|| mTable->mTimeToSampleCount == 0) {
return ERROR_MALFORMED;
}
if (mInitialized && mCurrentSampleIndex == sampleIndex) {
return OK;
}
if (!mInitialized || sampleIndex < mFirstChunkSampleIndex) {
reset();
}
if (sampleIndex >= mStopChunkSampleIndex) {
status_t err;
if ((err = findChunkRange(sampleIndex)) != OK) {
ALOGE("findChunkRange failed");
return err;
}
}
CHECK(sampleIndex < mStopChunkSampleIndex);
uint32_t chunk =
(sampleIndex - mFirstChunkSampleIndex) / mSamplesPerChunk
+ mFirstChunk;
if (!mInitialized || chunk != mCurrentChunkIndex) {
mCurrentChunkIndex = chunk;
status_t err;
if ((err = getChunkOffset(chunk, &mCurrentChunkOffset)) != OK) {
ALOGE("getChunkOffset return error");
return err;
}
mCurrentChunkSampleSizes.clear();
uint32_t firstChunkSampleIndex =
mFirstChunkSampleIndex
+ mSamplesPerChunk * (mCurrentChunkIndex - mFirstChunk);
for (uint32_t i = 0; i < mSamplesPerChunk; ++i) {
size_t sampleSize;
if ((err = getSampleSizeDirect(
firstChunkSampleIndex + i, &sampleSize)) != OK) {
ALOGE("getSampleSizeDirect return error");
return err;
}
mCurrentChunkSampleSizes.push(sampleSize);
}
}
uint32_t chunkRelativeSampleIndex =
(sampleIndex - mFirstChunkSampleIndex) % mSamplesPerChunk;
mCurrentSampleOffset = mCurrentChunkOffset;
for (uint32_t i = 0; i < chunkRelativeSampleIndex; ++i) {
mCurrentSampleOffset += mCurrentChunkSampleSizes[i];
}
mCurrentSampleSize = mCurrentChunkSampleSizes[chunkRelativeSampleIndex];
if (sampleIndex < mTTSSampleIndex) {
mTimeToSampleIndex = 0;
mTTSSampleIndex = 0;
mTTSSampleTime = 0;
mTTSCount = 0;
mTTSDuration = 0;
}
status_t err;
if ((err = findSampleTime(sampleIndex, &mCurrentSampleTime)) != OK) {
ALOGE("findSampleTime return error");
return err;
}
mCurrentSampleIndex = sampleIndex;
mInitialized = true;
return OK;
}
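The checks at the start of seekTo validate the input and the table state. We skip them here, assume the normal case, and analyse the main logic. As noted earlier, we need three variables: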
mCurrentSampleSize
mCurrentSampleTime
mCurrentSampleOffset
Let us look at how these three variables are assigned:
Formula 1: compute the chunk that contains the current sample
uint32_t chunk =
(sampleIndex - mFirstChunkSampleIndex) / mSamplesPerChunk + mFirstChunk;
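Formula 2: look up the file offset of the current chunk, getChunkOffset(chunk, &mCurrentChunkOffset);
(taking four-byte chunk offsets as the example)
uint32_t offset32;
mTable->mDataSource->readAt(
mTable->mChunkOffsetOffset + 8 + 4 * chunk,
&offset32,
sizeof(offset32));
mCurrentChunkOffset = ntohl(offset32); // readAt returns a byte count; the value itself lands in offset32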
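To make the "8 + 4 * chunk" arithmetic concrete, here is a small sketch that performs the same lookup on an in-memory copy of the data section of a chunk offset box with four-byte entries. The names readBE32 and chunkOffset32 are hypothetical, and the buffer stands in for the DataSource; the layout assumed is 4 bytes of version and flags, 4 bytes of entry count, then one big-endian 32-bit offset per chunk.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: decodes a big-endian 32-bit value.
static uint32_t readBE32(const uint8_t* p) {
    return (static_cast<uint32_t>(p[0]) << 24) | (static_cast<uint32_t>(p[1]) << 16) |
           (static_cast<uint32_t>(p[2]) << 8) | static_cast<uint32_t>(p[3]);
}

// Hypothetical helper: looks up the offset of a zero-based chunk index in the
// data section of a chunk offset box. Returns false if the index is out of range.
bool chunkOffset32(const std::vector<uint8_t>& payload, uint32_t chunk,
                   uint64_t* offset) {
    if (payload.size() < 8) {
        return false;  // too short to hold version/flags and the entry count
    }
    uint32_t count = readBE32(payload.data() + 4);
    if (chunk >= count || (payload.size() - 8) / 4 <= chunk) {
        return false;  // index beyond the declared or the available entries
    }
    // Skip the 8-byte header, then 4 bytes per preceding chunk.
    *offset = readBE32(payload.data() + 8 + 4 * static_cast<size_t>(chunk));
    return true;
}
```

The shift-based decode does the same job as ntohl in the quoted code, without depending on the host's byte order.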
Formula 3: compute firstChunkSampleIndex, the index of the first sample in the current chunk
firstChunkSampleIndex =
mFirstChunkSampleIndex + mSamplesPerChunk * (chunk - mFirstChunk);
Formula 4: compute the size of every sample in this chunk and store the sizes, in order, in the array mCurrentChunkSampleSizes
for (uint32_t i = 0; i < mSamplesPerChunk; ++i) {
size_t sampleSize;
if ((err = getSampleSizeDirect(
firstChunkSampleIndex + i, &sampleSize)) != OK) {
ALOGE("getSampleSizeDirect return error");
return err;
}
mCurrentChunkSampleSizes.push(sampleSize);
}
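The sample sizes are computed with Formula 5.
Formula 5: compute the size of the sample at any sampleIndex
(taking one case as the example: four-byte size entries)
mTable->mDataSource->readAt(
mTable->mSampleSizeOffset + 12 + 4 * sampleIndex,
size, sizeof(*size));
*size = ntohl(*size); // readAt returns a byte count; the value itself lands in *size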
Formula 6: compute the position of the sample within the current chunk
(there are presumably several ways to do this)
uint32_t chunkRelativeSampleIndex =
(sampleIndex - mFirstChunkSampleIndex) % mSamplesPerChunk;
With these in hand, two of the three variables above can be computed.
Take the current sample's size from the array of sample sizes:
mCurrentSampleSize = mCurrentChunkSampleSizes[chunkRelativeSampleIndex];
The current sample's offset is the chunk's offset plus the sum of the sizes of the samples that precede it within the chunk:
mCurrentSampleOffset = mCurrentChunkOffset;
for (uint32_t i = 0; i < chunkRelativeSampleIndex; ++i) {
mCurrentSampleOffset += mCurrentChunkSampleSizes[i];
}
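Formulas 1 and 6 and the offset accumulation can be combined into a small worked sketch. The struct ChunkRun and the three function names are hypothetical simplifications of the iterator's member variables; the sketch assumes the run of chunks containing the sample has already been found (the job of findChunkRange) and that samplesPerChunk is non-zero.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical summary of one run of chunks that share the same samples-per-chunk value.
struct ChunkRun {
    uint32_t firstChunk;             // index of the first chunk in the run
    uint32_t firstChunkSampleIndex;  // index of the first sample in that chunk
    uint32_t samplesPerChunk;        // samples in every chunk of the run
};

// Formula 1: which chunk holds the sample.
uint32_t chunkOf(const ChunkRun& run, uint32_t sampleIndex) {
    return (sampleIndex - run.firstChunkSampleIndex) / run.samplesPerChunk + run.firstChunk;
}

// Formula 6: position of the sample inside its chunk.
uint32_t indexInChunk(const ChunkRun& run, uint32_t sampleIndex) {
    return (sampleIndex - run.firstChunkSampleIndex) % run.samplesPerChunk;
}

// Offset of the sample: the chunk offset plus the sizes of the samples before it.
// Assumes relativeIndex is smaller than chunkSampleSizes.size().
uint64_t sampleOffset(uint64_t chunkOffset, const std::vector<uint32_t>& chunkSampleSizes,
                      uint32_t relativeIndex) {
    uint64_t offset = chunkOffset;
    for (uint32_t i = 0; i < relativeIndex; ++i) {
        offset += chunkSampleSizes[i];
    }
    return offset;
}
```

As a worked case: with a run starting at chunk 2 whose first sample is 10 and with 4 samples per chunk, sample 17 lies in chunk (17 - 10) / 4 + 2 = 3 at position (17 - 10) % 4 = 3. If that chunk starts at byte 1000 and its samples are 100, 200, 50, and 75 bytes long, the sample starts at 1000 + 100 + 200 + 50 = 1350.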
The variable mCurrentSampleTime is independent of the processing above; it is obtained directly by looking up the stts table:
status_t SampleIterator::findSampleTime(
uint32_t sampleIndex, uint32_t *time) {
if (sampleIndex >= mTable->mNumSampleSizes) {
return ERROR_OUT_OF_RANGE;
}
while (sampleIndex >= mTTSSampleIndex + mTTSCount) {
if (mTimeToSampleIndex == mTable->mTimeToSampleCount) {
return ERROR_OUT_OF_RANGE;
}
mTTSSampleIndex += mTTSCount;
mTTSSampleTime += mTTSCount * mTTSDuration;
mTTSCount = mTable->mTimeToSample[2 * mTimeToSampleIndex];
mTTSDuration = mTable->mTimeToSample[2 * mTimeToSampleIndex + 1];
++mTimeToSampleIndex;
}
*time = mTTSSampleTime + mTTSDuration * (sampleIndex - mTTSSampleIndex);
*time += mTable->getCompositionTimeOffset(sampleIndex);
return OK;
}
Before analysing this function, let us look again at how the stts box is parsed:
case FOURCC('s', 't', 't', 's'):
{
status_t err =
mLastTrack->sampleTable->setTimeToSampleParams(
data_offset, chunk_data_size);
*offset += chunk_size;
break;
}
status_t SampleTable::setTimeToSampleParams(
off64_t data_offset, size_t data_size) {
// A sample table (one per track) may hold only one stts box, and its data section must be at least 8 bytes long.
if (mTimeToSample != NULL || data_size < 8) {
return ERROR_MALFORMED;
}
uint8_t header[8];
if (mDataSource->readAt(
data_offset, header, sizeof(header)) < (ssize_t)sizeof(header)) {
return ERROR_IO;
}
if (U32_AT(header) != 0) { // the first four bytes hold the version and flags of the stts box and must be zero
// Expected version = 0, flags = 0.
return ERROR_MALFORMED;
}
mTimeToSampleCount = U32_AT(&header[4]);
// Number of time-to-sample entries in the stts table. Each entry takes 8 bytes, split into two four-byte fields.
mTimeToSample = new uint32_t[mTimeToSampleCount * 2];
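// Allocate an array to hold the entries. The factor of 2 follows from the layout above:
// each entry takes eight bytes in two parts. The first four bytes give the number of samples,
// and the last four bytes give the duration of each of those samples. For example:
// ---------------------------------------------------------------------------------------
// Duration covered by the fifth entry: duration = mTimeToSample[2*4] * mTimeToSample[2*4+1]
// duration = sample count of the fifth entry (mTTSCount) * duration of each sample (mTTSDuration)
// ---------------------------------------------------------------------------------------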
size_t size = sizeof(uint32_t) * mTimeToSampleCount * 2;
if (mDataSource->readAt(
data_offset + 8, mTimeToSample, size) < (ssize_t)size) {
return ERROR_IO;
}
for (uint32_t i = 0; i < mTimeToSampleCount * 2; ++i) {
mTimeToSample[i] = ntohl(mTimeToSample[i]);
}
return OK;
}
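The parsing steps of setTimeToSampleParams can be exercised on an in-memory buffer. This is a sketch under stated assumptions rather than the libstagefright code: SttsEntry, readBE32, and parseStts are hypothetical names, the buffer stands in for the DataSource, and entries are stored as a vector of pairs instead of a flat uint32_t array.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical representation of one stts entry.
struct SttsEntry {
    uint32_t sampleCount;  // first four bytes: number of consecutive samples
    uint32_t sampleDelta;  // last four bytes: duration of each of those samples
};

// Hypothetical helper: decodes a big-endian 32-bit value.
static uint32_t readBE32(const uint8_t* p) {
    return (static_cast<uint32_t>(p[0]) << 24) | (static_cast<uint32_t>(p[1]) << 16) |
           (static_cast<uint32_t>(p[2]) << 8) | static_cast<uint32_t>(p[3]);
}

// Parses the data section of an stts box. Returns false if it is malformed.
bool parseStts(const std::vector<uint8_t>& payload, std::vector<SttsEntry>* entries) {
    if (payload.size() < 8) {
        return false;  // must hold version/flags plus the entry count
    }
    if (readBE32(payload.data()) != 0) {
        return false;  // expected version = 0, flags = 0
    }
    uint32_t count = readBE32(payload.data() + 4);
    if ((payload.size() - 8) / 8 < count) {
        return false;  // fewer bytes than the declared entries need
    }
    entries->clear();
    for (uint32_t i = 0; i < count; ++i) {
        const uint8_t* p = payload.data() + 8 + 8 * static_cast<size_t>(i);
        entries->push_back(SttsEntry{readBE32(p), readBE32(p + 4)});
    }
    return true;
}
```

Unlike the quoted function, this sketch checks that the buffer really contains the declared number of entries before reading them, which is a defensive choice of the sketch and not a claim about the original code.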
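With setTimeToSampleParams understood, findSampleTime becomes easy to follow. The algorithm is as follows.
First, the meaning of the variables involved:
mTTSSampleIndex: the total number of samples in all stts entries before the current entry, that is, the index of the first sample covered by the current entry;
mTTSCount: the number of samples in the current stts entry, stored in the entry's first four bytes;
mTTSDuration: the duration of each sample in the current stts entry, stored in the entry's last four bytes;
mTTSSampleTime: the total duration of all stts entries before the current entry, that is, the sum of count[i] * duration[i] over those entries;
mTimeToSampleIndex: a cursor into the array mTimeToSample that records which stts entry will be loaded next;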
while (sampleIndex >= mTTSSampleIndex + mTTSCount) {
if (mTimeToSampleIndex == mTable->mTimeToSampleCount) {
return ERROR_OUT_OF_RANGE;
}
mTTSSampleIndex += mTTSCount;
mTTSSampleTime += mTTSCount * mTTSDuration;
mTTSCount = mTable->mTimeToSample[2 * mTimeToSampleIndex];
mTTSDuration = mTable->mTimeToSample[2 * mTimeToSampleIndex + 1];
++mTimeToSampleIndex;
}
The loop walks the stts entries (resuming from the entry reached on the previous call) and accumulates mTTSSampleTime.
*time = mTTSSampleTime + mTTSDuration * (sampleIndex - mTTSSampleIndex);
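The timestamp of the current sample is mTTSSampleTime, the total duration of all earlier entries, plus the duration of the samples that precede it within the current stts entry. If sampleIndex equals mTTSSampleIndex when the while loop exits, that second term is 0.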
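The same computation can be written without the cached cursor, which makes the arithmetic easier to check by hand. This is a stateless sketch: SttsEntry and decodeTime are hypothetical names, and the result is the decode time in timescale ticks, before any composition time offset is added.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical representation of one stts entry.
struct SttsEntry {
    uint32_t sampleCount;  // number of consecutive samples in the entry
    uint32_t sampleDelta;  // duration of each of those samples
};

// Computes the decode time of sampleIndex. Returns false when the index lies
// beyond the samples covered by the table.
bool decodeTime(const std::vector<SttsEntry>& entries, uint32_t sampleIndex,
                uint64_t* time) {
    uint64_t firstSample = 0;  // plays the role of mTTSSampleIndex
    uint64_t elapsed = 0;      // plays the role of mTTSSampleTime
    for (const SttsEntry& e : entries) {
        if (sampleIndex < firstSample + e.sampleCount) {
            // Earlier entries plus the samples before this one in the current entry.
            *time = elapsed + static_cast<uint64_t>(e.sampleDelta) * (sampleIndex - firstSample);
            return true;
        }
        firstSample += e.sampleCount;
        elapsed += static_cast<uint64_t>(e.sampleCount) * e.sampleDelta;
    }
    return false;
}
```

With the entries (3 samples of 1024 ticks) and (2 samples of 512 ticks), sample 4 gets 3 * 1024 + 512 * (4 - 3) = 3584 ticks. The quoted findSampleTime avoids rescanning from the first entry on every call by keeping its position in member variables, which pays off during sequential playback.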
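Finally, the composition time offset is added to give the final timestamp. I have not studied composition time in detail; it appears to be a table built from the ctts box that records the delta, that is, the time offset, to apply to each sample. The ctts table is built and read in almost exactly the same way as the stts table, so the two can be studied side by side. The code is reproduced below with brief notes: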
case FOURCC('c', 't', 't', 's'):
{
status_t err =
mLastTrack->sampleTable->setCompositionTimeToSampleParams(
data_offset, chunk_data_size);
*offset += chunk_size;
break;
}
status_t SampleTable::setCompositionTimeToSampleParams(
off64_t data_offset, size_t data_size) {
ALOGI("There are reordered frames present.");
if (mCompositionTimeDeltaEntries != NULL || data_size < 8) {
return ERROR_MALFORMED;
}
uint8_t header[8];
if (mDataSource->readAt(
data_offset, header, sizeof(header))
< (ssize_t)sizeof(header)) {
return ERROR_IO;
}
if (U32_AT(header) != 0) {
// Expected version = 0, flags = 0.
return ERROR_MALFORMED;
}
size_t numEntries = U32_AT(&header[4]);
if (data_size != (numEntries + 1) * 8) {
return ERROR_MALFORMED;
}
mNumCompositionTimeDeltaEntries = numEntries;
mCompositionTimeDeltaEntries = new uint32_t[2 * numEntries];
if (mDataSource->readAt(
data_offset + 8, mCompositionTimeDeltaEntries, numEntries * 8)
< (ssize_t)numEntries * 8) {
delete[] mCompositionTimeDeltaEntries;
mCompositionTimeDeltaEntries = NULL;
return ERROR_IO;
}
for (size_t i = 0; i < 2 * numEntries; ++i) {
mCompositionTimeDeltaEntries[i] = ntohl(mCompositionTimeDeltaEntries[i]);
}
mCompositionDeltaLookup->setEntries(
mCompositionTimeDeltaEntries, mNumCompositionTimeDeltaEntries);
return OK;
}
void SampleTable::CompositionDeltaLookup::setEntries(
const uint32_t *deltaEntries, size_t numDeltaEntries) {
Mutex::Autolock autolock(mLock);
mDeltaEntries = deltaEntries;
mNumDeltaEntries = numDeltaEntries;
mCurrentDeltaEntry = 0;
mCurrentEntrySampleIndex = 0;
}
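The entries handed to setEntries form a run-length list: each entry says that the next sampleCount samples share one offset. A stateless sketch of how such a list maps a sample index to its offset is given here for orientation. CttsEntry and compositionOffset are hypothetical names, and the sketch returns 0 for samples beyond the table, matching the fallback in the quoted code.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical representation of one ctts entry.
struct CttsEntry {
    uint32_t sampleCount;  // first four bytes: number of consecutive samples
    uint32_t offset;       // last four bytes: composition time offset of each
};

// Returns the composition time offset for sampleIndex, or 0 if the table
// does not cover that sample.
uint32_t compositionOffset(const std::vector<CttsEntry>& entries, uint32_t sampleIndex) {
    uint64_t firstSample = 0;  // index of the first sample covered by the current entry
    for (const CttsEntry& e : entries) {
        if (sampleIndex < firstSample + e.sampleCount) {
            return e.offset;
        }
        firstSample += e.sampleCount;
    }
    return 0;
}
```

The difference from the stts lookup is that nothing is accumulated: the offset of the matching entry is returned as is and then added to the decode time.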
For any sampleIndex, the composition time offset is computed as follows:
uint32_t SampleTable::getCompositionTimeOffset(uint32_t sampleIndex) {
return mCompositionDeltaLookup->getCompositionTimeOffset(sampleIndex);
}
uint32_t SampleTable::CompositionDeltaLookup::getCompositionTimeOffset(
uint32_t sampleIndex) {
Mutex::Autolock autolock(mLock);
if (mDeltaEntries == NULL) {
return 0;
}
if (sampleIndex < mCurrentEntrySampleIndex) {
mCurrentDeltaEntry = 0;
mCurrentEntrySampleIndex = 0;
}
while (mCurrentDeltaEntry < mNumDeltaEntries) {
uint32_t sampleCount = mDeltaEntries[2 * mCurrentDeltaEntry]; // the first four bytes of an entry hold the sample count
if (sampleIndex < mCurrentEntrySampleIndex + sampleCount) {
return mDeltaEntries[2 * mCurrentDeltaEntry + 1]; // the last four bytes of an entry hold the time offset
}
mCurrentEntrySampleIndex += sampleCount; // sampleIndex of the first sample covered by the next entry
++mCurrentDeltaEntry; // cursor into mDeltaEntries, used to walk all the entries
}
return 0;