live555 Study Notes: H264VideoStreamParser Explained

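live555 Study Notes 17: H264VideoStreamParser Explained
Let's start with a question:
What role does H264VideoStreamFramer play? According to the code of H264VideoFileServerMediaSubsession, H264VideoStreamFramer is the object that really represents the source: it is the Source that the Sink faces. Yet it is itself connected to a ByteStreamFileSource. Take a look at this part of the code: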

FramedSource* H264VideoFileServerMediaSubsession::createNewStreamSource(unsigned ,  
        unsigned& estBitrate)  
{  
    estBitrate = 500; // kbps, estimate  
  
    // Create the video source:  
    ByteStreamFileSource* fileSource = ByteStreamFileSource::createNew(envir(),  
            fFileName);  
    if (fileSource == NULL)  
        return NULL;  
    fFileSize = fileSource->fileSize();  
  
    // Create a framer for the Video Elementary Stream:  
    return H264VideoStreamFramer::createNew(envir(), fileSource);  
}

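ByteStreamFileSource gets its data from a file. It does not care what the media format is; it simply reads the file. So clearly H264VideoStreamFramer uses ByteStreamFileSource to fetch data from the file and then analyzes that data, for example locating each NALU and passing it on to the Sink. But H264VideoStreamFramer does not do the analysis itself; it delegates to a Parser, which is why the chain contains one more link: H264VideoStreamParser.
H264VideoStreamParser holds two source pointers: FramedSource* fInputSource and H264VideoStreamFramer* fUsingSource. In other words, H264VideoStreamParser ties fInputSource and fUsingSource together, and fInputSource is the ByteStreamFileSource.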

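Picture what H264VideoStreamParser does: H264VideoStreamFramer hands its own buffer (which is really the sink's) to H264VideoStreamParser. Whenever H264VideoStreamFramer needs a NALU, it asks H264VideoStreamParser for one. H264VideoStreamParser reads a chunk of data from ByteStreamFileSource, parses it, and once it has a complete NALU, passes it to H264VideoStreamFramer.
The actual flow: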

//The Sink calls getNextFrame() on the Source (H264VideoStreamFramer) to get data.
//H264VideoStreamFramer derives from MPEGVideoStreamFramer, so the following function gets called:
void MPEGVideoStreamFramer::doGetNextFrame()  
{  
    fParser->registerReadInterest(fTo, fMaxSize);  
    continueReadProcessing();  
}  
  
void MPEGVideoStreamFramer::continueReadProcessing(void* clientData,  
        unsigned char* ,  
        unsigned ,  
        struct timeval )  
{  
    MPEGVideoStreamFramer* framer = (MPEGVideoStreamFramer*) clientData;  
    framer->continueReadProcessing();  
}

void MPEGVideoStreamFramer::continueReadProcessing()  
{  
    //Call the parser's parse() to extract one NALU. If a NALU was obtained,
    //hand it to the Sink via afterGetting(this).
    unsigned acquiredFrameSize = fParser->parse();  
    if (acquiredFrameSize > 0)  
    {  
        // We were able to acquire a frame from the input.  
        // It has already been copied to the reader's space.  
        fFrameSize = acquiredFrameSize;  
        fNumTruncatedBytes = fParser->numTruncatedBytes();  
  
        // "fPresentationTime" should have already been computed.  
  
        // Compute "fDurationInMicroseconds" now:  
        fDurationInMicroseconds =  
                (fFrameRate == 0.0 || ((int) fPictureCount) < 0) ?  
                        0 : (unsigned) ((fPictureCount * 1000000) / fFrameRate);  
        fPictureCount = 0;  
  
        // Call our own 'after getting' function.  Because we're not a 'leaf'  
        // source, we can call this directly, without risking infinite recursion.  
        afterGetting(this);  
    }  
    else  
    {  
        //Reaching this branch does not mean that parse() obtained no data!
        // We were unable to parse a complete frame from the input, because:  
        // - we had to read more data from the source stream, or  
        // - the source stream has ended.  
    }  
}
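
The duration expression in continueReadProcessing() is easy to misread, so here is a minimal standalone sketch of the same arithmetic. The function name durationInMicroseconds is hypothetical (it is not part of live555), and the sketch multiplies in floating point, which is an assumption made to sidestep unsigned overflow rather than a copy of the original expression.

```cpp
#include <cassert>

// Hypothetical helper mirroring the expression in continueReadProcessing():
// the time covered by `pictureCount` pictures at `frameRate` frames per second,
// in microseconds. An unknown frame rate (0.0) or an implausible picture count
// (one that is negative when viewed as an int) yields 0.
unsigned durationInMicroseconds(double frameRate, unsigned pictureCount) {
    assert(frameRate >= 0.0);
    if (frameRate == 0.0 || static_cast<int>(pictureCount) < 0) {
        return 0;
    }
    // Assumption: multiply in double so a large picture count cannot overflow.
    return static_cast<unsigned>((pictureCount * 1000000.0) / frameRate);
}
```

With a 25 fps stream, a NAL unit that completes one picture gets a duration of 40000 microseconds, while a NAL unit that completes no picture (pictureCount of 0) gets a duration of 0.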

Inside parse(), reads of new data are triggered by functions such as test4Bytes() and skipBytes(), all of which eventually call ensureValidBytes1():

//numBytesNeeded: the number of bytes that need to be read
//fCurParserIndex: the current read position
void StreamParser::ensureValidBytes1(unsigned numBytesNeeded)  
{  
    // We need to read some more bytes from the input source.  
    // First, clarify how much data to ask for:  
    unsigned maxInputFrameSize = fInputSource->maxFrameSize();  
    if (maxInputFrameSize > numBytesNeeded)  
        numBytesNeeded = maxInputFrameSize;  
  
    // First, check whether these new bytes would overflow the current  
    // bank.  If so, start using a new bank now.  
    //When the requested bytes would run past the end of the current bank, the still-needed bytes are moved into the other bank
    if (fCurParserIndex + numBytesNeeded > BANK_SIZE)  
    {  
        // Swap banks, but save any still-needed bytes from the old bank:  
        unsigned numBytesToSave = fTotNumValidBytes - fSavedParserIndex;  
        unsigned char const* from = &curBank()[fSavedParserIndex];  
  
        fCurBankNum = (fCurBankNum + 1) % 2;  
        fCurBank = fBank[fCurBankNum];  
        memmove(curBank(), from, numBytesToSave);  
        fCurParserIndex = fCurParserIndex - fSavedParserIndex;  
        fSavedParserIndex = 0;  
        fTotNumValidBytes = numBytesToSave;  
    }  
  
    // ASSERT: fCurParserIndex + numBytesNeeded > fTotNumValidBytes  
    //      && fCurParserIndex + numBytesNeeded <= BANK_SIZE  
    if (fCurParserIndex + numBytesNeeded > BANK_SIZE)  
    {  
        // If this happens, it means that we have too much saved parser state.  
        // To fix this, increase BANK_SIZE as appropriate.  
        fInputSource->envir() << "StreamParser internal error ("  
                << fCurParserIndex << "+ " << numBytesNeeded << " > "  
                << BANK_SIZE << ")\n";  
        fInputSource->envir().internalError();  
    }  
  
    // Try to read as many new bytes as will fit in the current bank:  
    unsigned maxNumBytesToRead = BANK_SIZE - fTotNumValidBytes; 
    //Read a chunk of data from the ByteStreamFileSource
    fInputSource->getNextFrame(&curBank()[fTotNumValidBytes], maxNumBytesToRead,  
            afterGettingBytes, this, onInputClosure, this);  
  
    throw NO_MORE_BUFFERED_INPUT;  
}
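
The bank-switching step in the middle of ensureValidBytes1() can be studied in isolation. The following is a toy model, not live555 code: TwoBankBuffer, makeRoom() and append() are hypothetical names, and the bank size is shrunk to 16 bytes so the index arithmetic is easy to follow by hand.

```cpp
#include <cassert>
#include <cstring>

// Toy model of the parser's two banks (illustrative names, tiny bank size).
struct TwoBankBuffer {
    static const unsigned BANK = 16;
    unsigned char bank[2][BANK];
    unsigned curBankNum = 0;
    unsigned savedIndex = 0;  // start of the bytes that may need re-parsing
    unsigned curIndex = 0;    // next byte the parser will look at
    unsigned totValid = 0;    // number of valid bytes in the current bank

    unsigned char* cur() { return bank[curBankNum]; }

    // Append freshly read bytes after the valid ones.
    void append(const unsigned char* data, unsigned n) {
        assert(totValid + n <= BANK);
        std::memcpy(cur() + totValid, data, n);
        totValid += n;
    }

    // Make room for `needed` bytes after curIndex. Returns how many bytes can
    // now be read into the current bank, or 0 if the saved state is too large.
    unsigned makeRoom(unsigned needed) {
        if (curIndex + needed > BANK) {
            // Swap banks, carrying over only the bytes from savedIndex onward.
            unsigned toSave = totValid - savedIndex;
            const unsigned char* from = &cur()[savedIndex];
            curBankNum = (curBankNum + 1) % 2;
            std::memmove(cur(), from, toSave);
            curIndex -= savedIndex;
            savedIndex = 0;
            totValid = toSave;
        }
        if (curIndex + needed > BANK) {
            return 0;  // corresponds to the "internal error" branch
        }
        return BANK - totValid;
    }
};
```

For example, with 14 valid bytes, a saved index of 10 and a current index of 12, asking for 8 more bytes moves the last 4 bytes to the start of the other bank and rebases the current index to 2. Everything before the saved index is discarded, which is why the size of the saved state, not the size of the stream, is what has to fit in one bank.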

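One odd thing about ensureValidBytes1() stands out: it has no return value, yet it ends by throwing an exception, and it throws every single time it runs.
Let's first look at what the function actually does:
It first checks whether the parser's own buffer can hold the requested amount of data, switching banks if the current one is too full; if the data still cannot fit, all it can do is report an internal error. Finally it asks the ByteStreamFileSource for a chunk of data. curBank() returns the parser's own buffer. The afterGettingBytes callback belongs to StreamParser; once the data has arrived, it invokes the continue function that the framer registered with the parser, so after a couple of hops the function that ends up running is the void MPEGVideoStreamFramer::continueReadProcessing() shown above. And here a problem appears: when the source delivers its data synchronously, from inside getNextFrame(), parse() runs nested inside parse()! When the second parse() finishes, control returns to ensureValidBytes1(), which then exits by throwing. Where does it exit to? Back to the first parse() call, because parse() wraps its body in try{}catch{}. The code in catch{} is: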

   catch (int )  
    {  
#ifdef DEBUG  
        fprintf(stderr, "H264VideoStreamParser::parse() EXCEPTION (This is normal behavior - *not* an error)\n");  
#endif  
        return 0; // the parsing got interrupted  
    }
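
To see why throwing an int is a workable way to interrupt parsing, here is a minimal sketch of the same pattern outside live555. ToyParser, feed() and the length-prefixed record format are all hypothetical, and the sketch rewinds to the saved position at the start of parse(), which is a simplification chosen for brevity rather than a description of where live555 restores its state.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

const int kNoMoreBufferedInput = 1;  // thrown as a plain int, caught with catch (int)

// Toy parser for records of the form <length byte><payload bytes>.
class ToyParser {
public:
    // Stands in for data arriving from the input source.
    void feed(const std::string& bytes) {
        buf_.insert(buf_.end(), bytes.begin(), bytes.end());
    }

    // Returns the payload size, or 0 if parsing was interrupted for lack of
    // data. (A zero-length record would look like an interruption in this toy.)
    unsigned parse(std::string& out) {
        try {
            cur_ = saved_;  // resume from the last committed position
            std::string payload;
            unsigned len = get1Byte();
            for (unsigned i = 0; i < len; ++i) {
                payload.push_back(static_cast<char>(get1Byte()));
            }
            saved_ = cur_;  // commit: these bytes never need re-parsing
            out = payload;
            return len;
        } catch (int) {
            return 0;  // interrupted: normal behavior, not an error
        }
    }

private:
    unsigned char get1Byte() {
        if (cur_ >= buf_.size()) throw kNoMoreBufferedInput;
        return buf_[cur_++];
    }

    std::vector<unsigned char> buf_;
    std::size_t saved_ = 0;
    std::size_t cur_ = 0;
};
```

Feeding the bytes 0x03 'a' 'b' makes parse() return 0, because the third payload byte is missing; feeding 'c' afterwards makes the next parse() return 3 with the payload "abc". The design choice is that the byte-level helpers never need to report "not enough data" through their return values, so the parsing code can be written as if input were always available.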

unsigned H264or5VideoStreamParser::parse() {
  try {
    // The stream must start with a 0x00000001: 
    //H264 NALU start code
    if (!fHaveSeenFirstStartCode) {
      // Skip over any input bytes that precede the first 0x00000001:
      u_int32_t first4Bytes;
      while ((first4Bytes = test4Bytes()) != 0x00000001) {
      // Look for the NAL unit start code
	get1Byte(); setParseState(); // ensures that we progress over bad data
      }
      skipBytes(4); // skip this initial code
      
      setParseState();
      fHaveSeenFirstStartCode = True; // from now on
    }
    
    if (fOutputStartCodeSize > 0 && curFrameSize() == 0 && !haveSeenEOF()) {
      // Include a start code in the output:
      // Save the NAL unit start code
      save4Bytes(0x00000001);
    }

    // Then save everything up until the next 0x00000001 (4 bytes) or 0x000001 (3 bytes), or we hit EOF.
    // Also make note of the first byte, because it contains the "nal_unit_type": 
    if (haveSeenEOF()) {
      // We hit EOF the last time that we tried to parse this data, so we know that any remaining unparsed data
      // forms a complete NAL unit, and that there's no 'start code' at the end:
      unsigned remainingDataSize = totNumValidBytes() - curOffset();
#ifdef DEBUG
      unsigned const trailingNALUnitSize = remainingDataSize;
#endif
      while (remainingDataSize > 0) {
	u_int8_t nextByte = get1Byte();
	if (!fHaveSeenFirstByteOfNALUnit) {
	  fFirstByteOfNALUnit = nextByte;
	  fHaveSeenFirstByteOfNALUnit = True;
	}
	saveByte(nextByte);
	--remainingDataSize;
      }

#ifdef DEBUG
      if (fHNumber == 264) {
	u_int8_t nal_ref_idc = (fFirstByteOfNALUnit&0x60)>>5;
	u_int8_t nal_unit_type = fFirstByteOfNALUnit&0x1F;
	fprintf(stderr, "Parsed trailing %d-byte NAL-unit (nal_ref_idc: %d, nal_unit_type: %d (\"%s\"))\n",
		trailingNALUnitSize, nal_ref_idc, nal_unit_type, nal_unit_type_description_h264[nal_unit_type]);
      } else { // 265
	u_int8_t nal_unit_type = (fFirstByteOfNALUnit&0x7E)>>1;
	fprintf(stderr, "Parsed trailing %d-byte NAL-unit (nal_unit_type: %d (\"%s\"))\n",
		trailingNALUnitSize, nal_unit_type, nal_unit_type_description_h265[nal_unit_type]);
      }
#endif

      (void)get1Byte(); // forces another read, which will cause EOF to get handled for real this time
      return 0;
    } else {
      u_int32_t next4Bytes = test4Bytes();
      if (!fHaveSeenFirstByteOfNALUnit) {
      // Record the first byte after the start code (the NAL unit header)
	fFirstByteOfNALUnit = next4Bytes>>24;
	fHaveSeenFirstByteOfNALUnit = True;
      }
      while (next4Bytes != 0x00000001 && (next4Bytes&0xFFFFFF00) != 0x00000100) {
	// We save at least some of "next4Bytes".
	if ((unsigned)(next4Bytes&0xFF) > 1) {
	  // Common case: 0x00000001 or 0x000001 definitely doesn't begin anywhere in "next4Bytes", so we save all of it:
	  save4Bytes(next4Bytes);
	  skipBytes(4);
	} else {
	  // Save the first byte, and continue testing the rest:
	  saveByte(next4Bytes>>24);
	  skipBytes(1);
	}
	setParseState(); // ensures forward progress
	next4Bytes = test4Bytes();
      }
      // Assert: next4Bytes starts with 0x00000001 or 0x000001, and we've saved all previous bytes (forming a complete NAL unit).
      // Skip over these remaining bytes, up until the start of the next NAL unit:
      if (next4Bytes == 0x00000001) {
	skipBytes(4);
      } else {
	skipBytes(3);
      }
    }

    fHaveSeenFirstByteOfNALUnit = False; // for the next NAL unit that we'll parse
    u_int8_t nal_unit_type;
    if (fHNumber == 264) {
      nal_unit_type = fFirstByteOfNALUnit&0x1F;
#ifdef DEBUG
      u_int8_t nal_ref_idc = (fFirstByteOfNALUnit&0x60)>>5;
      fprintf(stderr, "Parsed %d-byte NAL-unit (nal_ref_idc: %d, nal_unit_type: %d (\"%s\"))\n",
	      curFrameSize()-fOutputStartCodeSize, nal_ref_idc, nal_unit_type, nal_unit_type_description_h264[nal_unit_type]);
#endif
    } else { // 265
      nal_unit_type = (fFirstByteOfNALUnit&0x7E)>>1;
#ifdef DEBUG
      fprintf(stderr, "Parsed %d-byte NAL-unit (nal_unit_type: %d (\"%s\"))\n",
	      curFrameSize()-fOutputStartCodeSize, nal_unit_type, nal_unit_type_description_h265[nal_unit_type]);
#endif
    }

    // Now that we have found (& copied) a NAL unit, process it if it's of special interest to us:
    if (isVPS(nal_unit_type)) { // Video parameter set
      // First, save a copy of this NAL unit, in case the downstream object wants to see it:
      usingSource()->saveCopyOfVPS(fStartOfFrame + fOutputStartCodeSize, curFrameSize() - fOutputStartCodeSize);

      if (fParsedFrameRate == 0.0) {
	// We haven't yet parsed a frame rate from the stream.
	// So parse this NAL unit to check whether frame rate information is present:
	unsigned num_units_in_tick, time_scale;
	analyze_video_parameter_set_data(num_units_in_tick, time_scale);
	if (time_scale > 0 && num_units_in_tick > 0) {
	  usingSource()->fFrameRate = fParsedFrameRate
	    = time_scale/(DeltaTfiDivisor*num_units_in_tick);
#ifdef DEBUG
	  fprintf(stderr, "Set frame rate to %f fps\n", usingSource()->fFrameRate);
#endif
	} else {
#ifdef DEBUG
	  fprintf(stderr, "\tThis \"Video Parameter Set\" NAL unit contained no frame rate information, so we use a default frame rate of %f fps\n", usingSource()->fFrameRate);
#endif
	}
      }
    } else if (isSPS(nal_unit_type)) { // Sequence parameter set
      // First, save a copy of this NAL unit, in case the downstream object wants to see it:
      usingSource()->saveCopyOfSPS(fStartOfFrame + fOutputStartCodeSize, curFrameSize() - fOutputStartCodeSize);

      if (fParsedFrameRate == 0.0) {
	// We haven't yet parsed a frame rate from the stream.
	// So parse this NAL unit to check whether frame rate information is present:
	unsigned num_units_in_tick, time_scale;
	analyze_seq_parameter_set_data(num_units_in_tick, time_scale);
	if (time_scale > 0 && num_units_in_tick > 0) {
	  usingSource()->fFrameRate = fParsedFrameRate
	    = time_scale/(DeltaTfiDivisor*num_units_in_tick);
#ifdef DEBUG
	  fprintf(stderr, "Set frame rate to %f fps\n", usingSource()->fFrameRate);
#endif
	} else {
#ifdef DEBUG
	  fprintf(stderr, "\tThis \"Sequence Parameter Set\" NAL unit contained no frame rate information, so we use a default frame rate of %f fps\n", usingSource()->fFrameRate);
#endif
	}
      }
    } else if (isPPS(nal_unit_type)) { // Picture parameter set
      // Save a copy of this NAL unit, in case the downstream object wants to see it:
      usingSource()->saveCopyOfPPS(fStartOfFrame + fOutputStartCodeSize, curFrameSize() - fOutputStartCodeSize);
    } else if (isSEI(nal_unit_type)) { // Supplemental enhancement information (SEI)
      analyze_sei_data(nal_unit_type);
      // Later, perhaps adjust "fPresentationTime" if we saw a "pic_timing" SEI payload??? #####
    }

    usingSource()->setPresentationTime();
#ifdef DEBUG
    unsigned long secs = (unsigned long)usingSource()->fPresentationTime.tv_sec;
    unsigned uSecs = (unsigned)usingSource()->fPresentationTime.tv_usec;
    fprintf(stderr, "\tPresentation time: %lu.%06u\n", secs, uSecs);
#endif

    // Now, check whether this NAL unit ends an 'access unit'.
    // (RTP streamers need to know this in order to figure out whether or not to set the "M" bit.)
    Boolean thisNALUnitEndsAccessUnit;
    if (haveSeenEOF() || isEOF(nal_unit_type)) {
      // There is no next NAL unit, so we assume that this one ends the current 'access unit':
      thisNALUnitEndsAccessUnit = True;
    } else if (usuallyBeginsAccessUnit(nal_unit_type)) {
      // These NAL units usually *begin* an access unit, so assume that they don't end one here:
      thisNALUnitEndsAccessUnit = False;
    } else {
      // We need to check the *next* NAL unit to figure out whether
      // the current NAL unit ends an 'access unit':
      u_int8_t firstBytesOfNextNALUnit[3];
      testBytes(firstBytesOfNextNALUnit, 3);

      u_int8_t const& next_nal_unit_type = fHNumber == 264
	? (firstBytesOfNextNALUnit[0]&0x1F) : ((firstBytesOfNextNALUnit[0]&0x7E)>>1);
      if (isVCL(next_nal_unit_type)) {
	// The high-order bit of the byte after the "nal_unit_header" tells us whether it's
	// the start of a new 'access unit' (and thus the current NAL unit ends an 'access unit'):
	u_int8_t const byteAfter_nal_unit_header
	  = fHNumber == 264 ? firstBytesOfNextNALUnit[1] : firstBytesOfNextNALUnit[2];
	thisNALUnitEndsAccessUnit = (byteAfter_nal_unit_header&0x80) != 0;
      } else if (usuallyBeginsAccessUnit(next_nal_unit_type)) {
	// The next NAL unit's type is one that usually appears at the start of an 'access unit',
	// so we assume that the current NAL unit ends an 'access unit':
	thisNALUnitEndsAccessUnit = True;
      } else {
	// The next NAL unit definitely doesn't start a new 'access unit',
	// which means that the current NAL unit doesn't end one:
	thisNALUnitEndsAccessUnit = False;
      }
    }
	
    if (thisNALUnitEndsAccessUnit) {
#ifdef DEBUG
      fprintf(stderr, "*****This NAL unit ends the current access unit*****\n");
#endif
      usingSource()->fPictureEndMarker = True;
      ++usingSource()->fPictureCount;

      // Note that the presentation time for the next NAL unit will be different:
      struct timeval& nextPT = usingSource()->fNextPresentationTime; // alias
      nextPT = usingSource()->fPresentationTime;
      double nextFraction = nextPT.tv_usec/1000000.0 + 1/usingSource()->fFrameRate;
      unsigned nextSecsIncrement = (long)nextFraction;
      nextPT.tv_sec += (long)nextSecsIncrement;
      nextPT.tv_usec = (long)((nextFraction - nextSecsIncrement)*1000000);
    }
    setParseState();

    return curFrameSize();
  } catch (int /*e*/) {
#ifdef DEBUG
    fprintf(stderr, "H264or5VideoStreamParser::parse() EXCEPTION (This is normal behavior - *not* an error)\n");
#endif
    return 0;  // the parsing got interrupted
  }
}
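
Stripped of buffering, frame rate handling and access unit detection, the core of parse() is splitting a byte stream at start codes and reading the NAL unit type from the first byte. The following is a minimal sketch of just that step on a complete in-memory buffer. NalUnit, startCodeAt() and splitAnnexB() are hypothetical names, and only the H.264 interpretation of the first byte (low 5 bits) is shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct NalUnit {
    std::size_t offset;  // index of the first byte after the start code
    std::size_t size;    // size without the start code
    uint8_t h264Type;    // nal_unit_type: low 5 bits of the first byte
};

// True if the 3-byte start code 00 00 01 begins at position i.
static bool startCodeAt(const std::vector<uint8_t>& d, std::size_t i) {
    return i + 3 <= d.size() && d[i] == 0 && d[i + 1] == 0 && d[i + 2] == 1;
}

std::vector<NalUnit> splitAnnexB(const std::vector<uint8_t>& d) {
    std::vector<NalUnit> units;
    std::size_t i = 0;
    // Skip any bytes that precede the first start code.
    while (i < d.size() && !startCodeAt(d, i)) ++i;
    while (i < d.size()) {
        std::size_t begin = i + 3;  // first byte of this NAL unit
        std::size_t j = begin;
        while (j < d.size() && !startCodeAt(d, j)) ++j;
        std::size_t end = j;
        // A 4-byte start code is a 3-byte one preceded by a zero byte, so drop
        // that zero. Assumption: a NAL unit itself does not end in 0x00.
        if (j < d.size() && end > begin && d[end - 1] == 0) --end;
        assert(end >= begin);
        if (end > begin) {
            NalUnit u = {begin, end - begin, static_cast<uint8_t>(d[begin] & 0x1F)};
            units.push_back(u);
        }
        i = j;  // either the next start code or the end of the data
    }
    return units;
}
```

On a buffer holding an SPS, a PPS and an IDR slice (first bytes 0x67, 0x68 and 0x65), this yields three units with types 7, 8 and 5. The real parser cannot work this way because it sees the stream in pieces, which is exactly why it needs the saved parse state and the exception mechanism.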

Putting it all together:
1. The sink asks for data, which leads to MPEGVideoStreamFramer::continueReadProcessing().
2. continueReadProcessing() calls parse(). parse() needs bytes and finds none, so ensureValidBytes1() is called to fetch data from the ByteStreamFileSource.
3. When the data arrives, StreamParser::afterGettingBytes() is called and forwards to MPEGVideoStreamFramer::continueReadProcessing(). continueReadProcessing() is now running nested!
4. The nested continueReadProcessing() calls parse() again. This time parse() finds data, so it parses it, extracts one NALU, and returns to continueReadProcessing().
5. continueReadProcessing() calls afterGetting(this) to hand the data to the sink. When the sink has processed it, control returns to continueReadProcessing(), which in turn returns to ensureValidBytes1().
6. ensureValidBytes1() throws, landing in the catch{} of the first parse() call. That parse() returns to the first continueReadProcessing().
7. The first continueReadProcessing() sees that parse() produced no NALU, so it does nothing and returns to the sink. The sink then fetches the next NALU through source->getNextFrame() -> MPEGVideoStreamFramer::continueReadProcessing() ... and the cycle repeats.

Quite a convoluted story, but that is the end of it!
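
The nested call sequence described above can be reproduced with a toy model that records each step. ToyPipeline and its member names are hypothetical; the model assumes the source calls back synchronously from inside its read request, and it leaves out the sink asking for the next frame, so it shows only one round of the nesting.

```cpp
#include <cassert>
#include <string>
#include <vector>

const int kNoMoreBufferedInput = 1;

struct ToyPipeline {
    std::vector<std::string> trace;
    bool haveData = false;
    int framesDelivered = 0;

    // Stands in for the input source. Assumption: it delivers synchronously,
    // so the "after getting" path runs before this function returns.
    void sourceGetNextFrame() {
        trace.push_back("source delivers bytes");
        haveData = true;
        continueReadProcessing();  // nested call
    }

    unsigned parse() {
        try {
            if (!haveData) {
                trace.push_back("parse: needs bytes");
                sourceGetNextFrame();        // plays the role of ensureValidBytes1()
                throw kNoMoreBufferedInput;  // always thrown after the read request
            }
            haveData = false;
            trace.push_back("parse: got NAL unit");
            return 1;
        } catch (int) {
            trace.push_back("parse: interrupted, returns 0");
            return 0;
        }
    }

    void continueReadProcessing() {
        if (parse() > 0) {
            trace.push_back("frame handed to sink");
            ++framesDelivered;
        } else {
            trace.push_back("outer call: nothing to deliver");
        }
    }
};
```

Calling continueReadProcessing() once on a fresh ToyPipeline records six steps: the frame is handed to the sink by the nested call, and only afterwards does the outer parse() get interrupted and return 0. If the source instead delivered its data later from an event loop, the outer call would unwind first and no nesting would occur, but the frame would still be delivered exactly once.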

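As you can see, the parser has its own buffer, and its size is fixed:
#define BANK_SIZE 150000
When you write your own Source, each delivery is one frame of data, which may contain several NALUs. So as long as a frame never exceeds 150000 bytes, you can copy it into fTo without worry; if your frames are larger, change this macro.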
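
When copying a frame into fTo from your own Source, it is still worth guarding against the destination being smaller than the frame. Below is a minimal sketch of that guard. DeliveryTarget is a hypothetical stand-in for the FramedSource members involved (fTo, fMaxSize, fFrameSize, fNumTruncatedBytes), and deliverFrame() is an illustrative free function, not a live555 API.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for the members a custom source fills in.
struct DeliveryTarget {
    unsigned char* fTo;           // destination buffer supplied by the reader
    unsigned fMaxSize;            // capacity of fTo
    unsigned fFrameSize;          // bytes actually delivered
    unsigned fNumTruncatedBytes;  // bytes that did not fit
};

// Copy one frame into the target, truncating instead of overflowing.
void deliverFrame(DeliveryTarget& t, const unsigned char* frame, unsigned frameSize) {
    assert(t.fTo != nullptr && frame != nullptr);
    if (frameSize > t.fMaxSize) {
        t.fFrameSize = t.fMaxSize;
        t.fNumTruncatedBytes = frameSize - t.fMaxSize;
    } else {
        t.fFrameSize = frameSize;
        t.fNumTruncatedBytes = 0;
    }
    std::memcpy(t.fTo, frame, t.fFrameSize);
}
```

With a 4-byte destination and a 6-byte frame, 4 bytes are copied and 2 are reported as truncated. A nonzero truncation count is a hint that the buffer on the receiving side, such as BANK_SIZE here, is too small for the frames being produced.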