Overview
live555 has been developed for more than 20 years. It is written in an older C++ style and makes heavy use of callback functions, which makes the code somewhat hard to follow. Without enough patience and a clear head, you can easily be driven mad by the callbacks layered through inheritance and lose track of where the code has gone.
First, a note: this analysis is based on live555 version 2022.04.26, so it may differ slightly from the code you are reading, although the overall flow is basically the same. If you want your code to match mine, see my article on building the latest live555 with Visual Studio 2019 on Windows. The client used in this analysis is testRTSPClient and the server is testOnDemandRTSPServer. If you are not familiar with the client/server communication flow, see my blog post on the RTSP client/server communication flow. To save space, only the key code is quoted, and an H.264 test video is used throughout.
Classes and Objects Involved
To make the walkthrough easier to follow, let's first describe the classes involved in producing the H.264 SDP description.
ServerMediaSession
The server-side object representing a media session. For file streaming there is usually a one-to-one mapping: the server typically creates one ServerMediaSession object per file.
H264VideoFileServerMediaSubsession
H264VideoFileServerMediaSubsession describes the media subsession for an H.264 video file, i.e. the H.264 stream information. It inherits from ServerMediaSubsession. Typically the server holds multiple ServerMediaSession objects, and each ServerMediaSession in turn contains multiple ServerMediaSubsession objects. The server-side initialization looks like this:
char const* streamName = "h264ESVideoTest"; // stream name (media name)
char const* inputFileName = "test.h264"; // file name; when the client requests the stream "h264ESVideoTest", the file actually opened is test.h264
// 4. Create the media session.
// When a client requests playback it supplies the stream name streamName, telling the RTSP server which stream it wants.
// The mapping between stream name and file name is established by adding subsessions (streamName is not the file name inputFileName).
// The media session manages session-related information such as the session description, duration, and stream name.
// 2nd parameter: media name; 3rd: media info; 4th: media description
ServerMediaSession* sms
    = ServerMediaSession::createNew(*env, streamName, streamName,
                                    descriptionString);
// 5. Add the H.264 subsession; the file name here is what actually gets opened.
// reuseFirstSource: if True, multiple clients share a single input source.
// H264VideoFileServerMediaSubsession derives from FileServerMediaSubsession, which derives from OnDemandServerMediaSubsession;
// OnDemandServerMediaSubsession and PassiveServerMediaSubsession both derive from ServerMediaSubsession.
// File reading and the like are implemented in this class; to turn on-demand streaming into live streaming, subclass it and add new methods.
sms->addSubsession(H264VideoFileServerMediaSubsession
                   ::createNew(*env, inputFileName, reuseFirstSource));
rtspServer->addServerMediaSession(sms); // 6. Add the session to the RTSP server
ByteStreamFileSource
This class reads data from a file. It inherits from FramedSource and implements doGetNextFrame. An upstream source calls this method to read some amount of data from the file; once the upstream source has consumed that data, it calls the method again if it needs more, until no further data can be read.
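As a rough illustration (not live555 code), this is a pull-style read: the consumer repeatedly asks for the next chunk until the underlying stream is exhausted. A minimal sketch, with a hypothetical ChunkSource using std::istringstream to stand in for the file:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for ByteStreamFileSource: each call to getNextChunk()
// delivers up to maxChunk bytes, returning false once the stream is exhausted.
struct ChunkSource {
    std::istringstream in;
    size_t maxChunk;
    ChunkSource(std::string data, size_t maxChunk)
        : in(std::move(data)), maxChunk(maxChunk) {}
    bool getNextChunk(std::string& out) {
        std::vector<char> buf(maxChunk);
        in.read(buf.data(), static_cast<std::streamsize>(maxChunk));
        std::streamsize n = in.gcount();
        if (n <= 0) return false; // no more data
        out.assign(buf.data(), static_cast<size_t>(n));
        return true;
    }
};
```

An upstream parser would keep calling getNextChunk until it has accumulated enough bytes to produce one frame.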
H264or5VideoStreamParser
H264or5VideoStreamParser inherits from MPEGVideoStreamParser. Each call to parse() extracts one frame from the H.264 video data; if there is not enough data during parsing, it calls ByteStreamFileSource's getNextFrame to fetch more.
H264VideoStreamFramer
H264VideoStreamFramer is used to obtain H.264 video frames. It inherits from H264or5VideoStreamFramer, which inherits from MPEGVideoStreamFramer; MPEGVideoStreamFramer holds a reference to the parser (MPEGVideoStreamParser). A consumer calls FramedSource::getNextFrame on an H264VideoStreamFramer object, which invokes MPEGVideoStreamFramer::doGetNextFrame(), which in turn calls MPEGVideoStreamFramer::continueReadProcessing(). That function calls parser->parse(), and during parsing the parser calls ByteStreamFileSource's getNextFrame, so that a frame is ultimately read from the file.
H264or5Fragmenter
After H264VideoStreamFramer produces a frame, the data must be sent from the server to the client over the network, but each packet is limited by the MTU, and a frame obtained from H264VideoStreamFramer may exceed that limit. One of H264or5Fragmenter's jobs is therefore to split the data obtained from H264VideoStreamFramer into fragments before transmitting them over the network.
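As a back-of-the-envelope sketch of why fragmentation is needed (assumptions, not live555 code): each RTP packet spends 12 bytes on the fixed RTP header, so a frame of n bytes needs roughly ceil(n / (MTU - 12)) packets, ignoring the extra per-fragment FU header bytes that real FU packetization adds:

```cpp
// Rough sketch: how many RTP packets a frame needs if each packet can carry
// (mtu - 12) bytes of frame data. Real FU-A packetization adds per-fragment
// header bytes, which this deliberately ignores.
unsigned packetsNeeded(unsigned frameSize, unsigned mtu) {
    const unsigned rtpHeaderSize = 12;          // fixed RTP header
    unsigned maxPayload = mtu - rtpHeaderSize;  // room left for frame data
    return (frameSize + maxPayload - 1) / maxPayload; // ceiling division
}
```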
H264VideoRTPSink
Used when transmitting H.264 data over RTP; usually paired with H264VideoRTPSource on the receiving side. Both contain an RTPInterface object: the sink uses it to send data, the source uses it to receive data.
H264VideoRTPSink inherits from MediaSink. MediaSink::startPlaying is passed the source object (an H264VideoStreamFramer object); eventually H264or5VideoRTPSink::continuePlaying() replaces the source with an H264or5Fragmenter object, which packs the data it obtains into RTP packets that are then sent to the client through the RTPInterface object inside H264VideoRTPSink.
Having introduced the important classes, let's walk through the flow from the client sending the DESCRIBE request to receiving the DESCRIBE response.
The client sends the DESCRIBE request
In the client's main program, the openURL function sends the RTSP DESCRIBE request as follows:
void openURL(UsageEnvironment& env, char const* progName, char const* rtspURL) {
.....
rtspClient->sendDescribeCommand(continueAfterDESCRIBE);
}
This function creates an RTSPClient object from the client's access URL and program name and issues the DESCRIBE request. When the server receives the request, it generates the DESCRIBE response and returns it to the client; the client parses the response and finally runs the continueAfterDESCRIBE callback.
sendDescribeCommand calls sendRequest, which is used to send all the different request types. sendRequest's main work is:
1. If other requests are pending before this one and the connection to the server has not been established yet, add this request to the queue of requests awaiting connection.
2. If this request is the first one to connect to the server (the socket number is < 0), connect to the server.
3. Build the header fields with setRequestFields.
4. Send the assembled request to the server.
Here is the DESCRIBE request string:
Sending request: DESCRIBE rtsp://192.168.56.1:8554/h264ESVideoTest RTSP/1.0
CSeq: 2
User-Agent: testRTSPClient (LIVE555 Streaming Media v2022.04.26)
Accept: application/sdp
How the server generates the response (mainly the SDP)
The server handles bytes sent by the client in RTSPClientConnection::handleRequestBytes(int newBytesRead). Its main logic is:
1. Check whether a network error occurred or the receive buffer is exhausted; in either case, close the connection directly.
if (newBytesRead < 0 || (unsigned)newBytesRead >= fRequestBufferBytesLeft) {
// Either the client socket has died, or the request was too big for us.
// Terminate this connection:
#ifdef DEBUG
fprintf(stderr, "RTSPClientConnection[%p]::handleRequestBytes() read %d new bytes (of %d); terminating connection!\n", this, newBytesRead, fRequestBufferBytesLeft);
#endif
fIsActive = False;
break;
}
2. If step 1 passes, check whether the request is complete (i.e. the full header has arrived). The test is whether \r\n\r\n has been read: the first \r\n ends the last header line, and the second \r\n is a blank line.
if (fBase64RemainderCount == 0) { // no more Base-64 bytes remain to be read/decoded
// Look for the end of the message: <CR><LF><CR><LF>
if (tmpPtr < fRequestBuffer) tmpPtr = fRequestBuffer;
while (tmpPtr < &ptr[newBytesRead - 1]) {
if (*tmpPtr == '\r' && *(tmpPtr + 1) == '\n') {
// This is it: when this \r\n is only 2 bytes past the previously seen \r\n, the line at tmpPtr is blank, i.e. the headers have ended
if (tmpPtr - fLastCRLF == 2) {
endOfMsg = True;
break;
}
fLastCRLF = tmpPtr;
}
++tmpPtr;
}
}
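The end-of-headers test above can be reduced to a one-liner over a buffered string. A minimal sketch (not live555 code, which scans incrementally as bytes arrive and remembers the last CRLF seen):

```cpp
#include <string>

// Sketch of the same end-of-message test: an RTSP request's headers are
// complete once the buffer contains the blank-line terminator CRLF CRLF.
bool headersComplete(const std::string& buf) {
    return buf.find("\r\n\r\n") != std::string::npos;
}
```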
3. parseRTSPRequestString parses the request header, mainly extracting the URL, CSeq, session ID, Content-Length, and so on.
4. After step 3, the server knows from the command name that it received a DESCRIBE request and handles it with handleCmd_DESCRIBE.
// Using the DESCRIBE request above as an example: urlPreSuffix = "", urlSuffix = "h264ESVideoTest"
void RTSPServer::RTSPClientConnection
::handleCmd_DESCRIBE(char const* urlPreSuffix, char const* urlSuffix, char const* fullRequestStr) {
char urlTotalSuffix[2 * RTSP_PARAM_STRING_MAX];
// enough space for urlPreSuffix/urlSuffix'\0'
urlTotalSuffix[0] = '\0';
if (urlPreSuffix[0] != '\0') {
strcat(urlTotalSuffix, urlPreSuffix);
strcat(urlTotalSuffix, "/");
}
strcat(urlTotalSuffix, urlSuffix);
//authentication is required; return if it fails
if (!authenticationOK("DESCRIBE", urlTotalSuffix, fullRequestStr)) return;
// We should really check that the request contains an "Accept:" #####
// for "application/sdp", because that's what we're sending back #####
// Begin by looking up the "ServerMediaSession" object for the specified "urlTotalSuffix":
fOurServer.lookupServerMediaSession(urlTotalSuffix, DESCRIBELookupCompletionFunction, this);
}
5. Find the ServerMediaSession object corresponding to urlTotalSuffix (the stream name) and run the following functions to obtain the SDP and generate the DESCRIBE response:
void RTSPServer::RTSPClientConnection
::DESCRIBELookupCompletionFunction(void* clientData, ServerMediaSession* sessionLookedUp) {
RTSPServer::RTSPClientConnection* connection = (RTSPServer::RTSPClientConnection*)clientData;
connection->handleCmd_DESCRIBE_afterLookup(sessionLookedUp);
}
void RTSPServer::RTSPClientConnection
::handleCmd_DESCRIBE_afterLookup(ServerMediaSession* session) {
char* sdpDescription = NULL;
char* rtspURL = NULL;
do {
if (session == NULL) {
handleCmd_notFound();
break;
}
// Increment the "ServerMediaSession" object's reference count, in case someone removes it
// while we're using it:
session->incrementReferenceCount();
// Then, assemble a SDP description for this session:
sdpDescription = session->generateSDPDescription(fAddressFamily);
if (sdpDescription == NULL) {
// This usually means that a file name that was specified for a
// "ServerMediaSubsession" does not exist.
setRTSPResponse("404 File Not Found, Or In Incorrect Format");
break;
}
unsigned sdpDescriptionSize = strlen(sdpDescription);
// Also, generate our RTSP URL, for the "Content-Base:" header
// (which is necessary to ensure that the correct URL gets used in subsequent "SETUP" requests).
rtspURL = fOurRTSPServer.rtspURL(session, fClientInputSocket);
snprintf((char*)fResponseBuffer, sizeof fResponseBuffer,
"RTSP/1.0 200 OK\r\nCSeq: %s\r\n"
"%s"
"Content-Base: %s/\r\n"
"Content-Type: application/sdp\r\n"
"Content-Length: %d\r\n\r\n"
"%s",
fCurrentCSeq,
dateHeader(),
rtspURL,
sdpDescriptionSize,
sdpDescription);
} while (0);
if (session != NULL) {
// Decrement its reference count, now that we're done using it:
session->decrementReferenceCount();
if (session->referenceCount() == 0 && session->deleteWhenUnreferenced()) {
fOurServer.removeServerMediaSession(session);
}
}
delete[] sdpDescription;
delete[] rtspURL;
}
6. The code above is straightforward; the interesting part is how generateSDPDescription produces the SDP description, which consists of session-level SDP and media-level SDP. The media-level SDP is obtained via sdpLines, and the subsession object is the H264VideoFileServerMediaSubsession object.
char* ServerMediaSession::generateSDPDescription(int addressFamily) {
......
do {
......
// Then, add the (media-level) lines for each subsession:
char* mediaSDP = sdp;
for (subsession = fSubsessionsHead; subsession != NULL;
subsession = subsession->fNext) {
unsigned mediaSDPLength = strlen(mediaSDP);
mediaSDP += mediaSDPLength;
sdpLength -= mediaSDPLength;
if (sdpLength <= 1) break; // the SDP has somehow become too long
//sdpLines is implemented both in OnDemandServerMediaSubsession (for unicast)
//and in PassiveServerMediaSubsession (for multicast); the former is called here
char const* sdpLines = subsession->sdpLines(addressFamily);
if (sdpLines != NULL) snprintf(mediaSDP, sdpLength, "%s", sdpLines);
}
} while (0);
}
6.1. How sdpLines constructs the media-level SDP:
char const*
OnDemandServerMediaSubsession::sdpLines(int addressFamily) {
if (fSDPLines == NULL) {
// We need to construct a set of SDP lines that describe this
// subsession (as a unicast stream). To do so, we first create
// dummy (unused) source and "RTPSink" objects,
// whose parameters we use for the SDP lines:
unsigned estBitrate;
//For an H.264 media file, this calls H264VideoFileServerMediaSubsession::createNewStreamSource,
//which returns an H264VideoStreamFramer object containing a ByteStreamFileSource object;
//the ByteStreamFileSource object reads data from the file
FramedSource* inputSource = createNewStreamSource(0, estBitrate);
if (inputSource == NULL) return NULL; // file not found
Groupsock* dummyGroupsock = createGroupsock(nullAddress(addressFamily), 0);
unsigned char rtpPayloadType = 96 + trackNumber() - 1; // if dynamic
//For H.264 this calls H264VideoFileServerMediaSubsession::createNewRTPSink,
//which returns an H264VideoRTPSink object
RTPSink* dummyRTPSink = createNewRTPSink(dummyGroupsock, rtpPayloadType, inputSource);
if (dummyRTPSink != NULL) {
if (fParentSession->streamingUsesSRTP) {
fMIKEYStateMessage = dummyRTPSink->setupForSRTP(fParentSession->streamingIsEncrypted,
fMIKEYStateMessageSize);
}
if (dummyRTPSink->estimatedBitrate() > 0) estBitrate = dummyRTPSink->estimatedBitrate();
//generate the media-level SDP description
setSDPLinesFromRTPSink(dummyRTPSink, inputSource, estBitrate);
Medium::close(dummyRTPSink);
}
delete dummyGroupsock;
closeStreamSource(inputSource);
}
return fSDPLines;
}
In step 6.1, inputSource is an H264VideoStreamFramer object containing a ByteStreamFileSource object that reads data from the underlying file; createNewRTPSink creates an H264VideoRTPSink object, dummyRTPSink. Both dummyRTPSink and inputSource are passed to setSDPLinesFromRTPSink to obtain the media-level SDP.
6.2. Building the media SDP with setSDPLinesFromRTPSink
void OnDemandServerMediaSubsession
::setSDPLinesFromRTPSink(RTPSink* rtpSink, FramedSource* inputSource, unsigned estBitrate) {
......
char const* auxSDPLine = getAuxSDPLine(rtpSink, inputSource);
......
}
The key part of step 6.2 is obtaining the format-specific SDP description for each medium; getAuxSDPLine here resolves to the following function:
char const* H264VideoFileServerMediaSubsession::getAuxSDPLine(RTPSink* rtpSink, FramedSource* inputSource) {
//Non-NULL here usually means another client already requested this resource, so return the previously computed SDP line
if (fAuxSDPLine != NULL) return fAuxSDPLine; // it's already been set up (for a previous client)
if (fDummyRTPSink == NULL) { // we're not already setting it up for another, concurrent stream
// Note: For H264 video files, the 'config' information ("profile-level-id" and "sprop-parameter-sets") isn't known
// until we start reading the file. This means that "rtpSink"s "auxSDPLine()" will be NULL initially,
// and we need to start reading data from our file until this changes.
fDummyRTPSink = rtpSink;
// Start reading the file (the H.264 SDP description is obtained by reading the file):
fDummyRTPSink->startPlaying(*inputSource, afterPlayingDummy, this);
// Check whether the sink's 'auxSDPLine()' is ready:
//this function reschedules itself until the SDP line has been obtained
checkForAuxSDPLine(this);
}
//loop until the SDP line has been read; the event loop exits once fDoneFlag is set
envir().taskScheduler().doEventLoop(&fDoneFlag);
return fAuxSDPLine;
}
6.3. Obtaining the SDP description with startPlaying
Boolean MediaSink::startPlaying(MediaSource& source,
afterPlayingFunc* afterFunc,
void* afterClientData) {
// Make sure we're not already being played:
if (fSource != NULL) {
envir().setResultMsg("This sink is already being played");
return False;
}
// Make sure our source is compatible:
if (!sourceIsCompatibleWithUs(source)) {
envir().setResultMsg("MediaSink::startPlaying(): source is not compatible!");
return False;
}
//assign the sink's data source: on the client side this source fetches data from the server;
//on the server side it is usually the source that reads the data
fSource = (FramedSource*)&source;
fAfterFunc = afterFunc;
fAfterClientData = afterClientData;
return continuePlaying();//subclasses implement this differently, but all of them repeatedly fetch and consume data
}
startPlaying itself is simple: it assigns the source (here the H264VideoStreamFramer object created earlier) and calls continuePlaying to continue:
Boolean H264or5VideoRTPSink::continuePlaying() {
// First, check whether we have a 'fragmenter' class set up yet.
// If not, create it now:
//created on the first call: it is not set when the H264or5VideoRTPSink is constructed, because the fragmenter depends on the fSource object
if (fOurFragmenter == NULL) {
fOurFragmenter = new H264or5Fragmenter(fHNumber, envir(), fSource, OutPacketBuffer::maxSize,
ourMaxPacketSize() - 12/*RTP hdr size*/);
} else {
fOurFragmenter->reassignInputSource(fSource);
}
fSource = fOurFragmenter;
// Then call the parent class's implementation:
return MultiFramedRTPSink::continuePlaying();
}
continuePlaying's main work is to create fOurFragmenter (a filter that does the fragmenting) and reset fSource to that fOurFragmenter object.
6.4. MultiFramedRTPSink::continuePlaying() calls buildAndSendPacket to build and send packets
void MultiFramedRTPSink::buildAndSendPacket(Boolean isFirstPacket) {
nextTask() = NULL;
fIsFirstPacket = isFirstPacket;
// Set up the RTP header (see RFC 3550 if you are unfamiliar with the RTP protocol):
unsigned rtpHdr = 0x80000000; // RTP version 2; marker ('M') bit not set (by default; it can be set later)
rtpHdr |= (fRTPPayloadType << 16);
rtpHdr |= fSeqNo; // sequence number
fOutBuf->enqueueWord(rtpHdr);
// Note where the RTP timestamp will go.
// (We can't fill this in until we start packing payload frames.)
fTimestampPosition = fOutBuf->curPacketSize();
fOutBuf->skipBytes(4); // leave a hole for the timestamp
fOutBuf->enqueueWord(SSRC());
// Allow for a special, payload-format-specific header following the
// RTP header:
fSpecialHeaderPosition = fOutBuf->curPacketSize();
fSpecialHeaderSize = specialHeaderSize();
fOutBuf->skipBytes(fSpecialHeaderSize);
// Begin packing as many (complete) frames into the packet as we can:
fTotalFrameSpecificHeaderSizes = 0;
fNoFramesLeft = False;
fNumFramesUsedSoFar = 0;
packFrame();
}
packFrame() builds the H.264 payload data:
void MultiFramedRTPSink::packFrame() {
// Get the next frame.
// First, skip over the space we'll use for any frame-specific header:
fCurFrameSpecificHeaderPosition = fOutBuf->curPacketSize();
fCurFrameSpecificHeaderSize = frameSpecificHeaderSize();
fOutBuf->skipBytes(fCurFrameSpecificHeaderSize);
fTotalFrameSpecificHeaderSizes += fCurFrameSpecificHeaderSize;
// See if we have an overflow frame that was too big for the last pkt
// (this path is not usually taken)
if (fOutBuf->haveOverflowData()) {
// Use this frame before reading a new one from the source
unsigned frameSize = fOutBuf->overflowDataSize();
struct timeval presentationTime = fOutBuf->overflowPresentationTime();
unsigned durationInMicroseconds = fOutBuf->overflowDurationInMicroseconds();
fOutBuf->useOverflowData();
afterGettingFrame1(frameSize, 0, presentationTime, durationInMicroseconds);
}
else {
// Normal case: we need to read a new frame from the source
//fSource here is the H264or5Fragmenter; once it delivers data (step 6.5 below),
//MultiFramedRTPSink::afterGettingFrame will be executed
if (fSource == NULL) return;
fSource->getNextFrame(fOutBuf->curPtr(), fOutBuf->totalBytesAvailable(),
afterGettingFrame, this, ourHandleClosure, this);
}
}
6.5. Next, H264or5Fragmenter::doGetNextFrame() produces one fragment for the sink object.
void H264or5Fragmenter::doGetNextFrame() {
if (fNumValidDataBytes == 1) {
// We have no NAL unit data currently in the buffer. Read a new one:
//fInputSource is the H264VideoStreamFramer object
fInputSource->getNextFrame(&fInputBuffer[1], fInputBufferSize - 1,
afterGettingFrame, this,
FramedSource::handleClosure, this);
}
else {
// We have NAL unit data in the buffer. There are three cases to consider:
// 1. There is a new NAL unit in the buffer, and it's small enough to deliver
// to the RTP sink (as is).
// 2. There is a new NAL unit in the buffer, but it's too large to deliver to
// the RTP sink in its entirety. Deliver the first fragment of this data,
// as a FU packet, with one extra preceding header byte (for the "FU header").
// 3. There is a NAL unit in the buffer, and we've already delivered some
// fragment(s) of this. Deliver the next fragment of this data,
// as a FU packet, with two (H.264) or three (H.265) extra preceding header bytes
// (for the "NAL header" and the "FU header").
if (fMaxSize < fMaxOutputPacketSize) { // shouldn't happen
envir() << "H264or5Fragmenter::doGetNextFrame(): fMaxSize ("
<< fMaxSize << ") is smaller than expected\n";
}
else {
fMaxSize = fMaxOutputPacketSize;
}
fLastFragmentCompletedNALUnit = True; // by default
if (fCurDataOffset == 1) { // case 1 or 2
if (fNumValidDataBytes - 1 <= fMaxSize) { // case 1
memmove(fTo, &fInputBuffer[1], fNumValidDataBytes - 1);
fFrameSize = fNumValidDataBytes - 1;
fCurDataOffset = fNumValidDataBytes;
}
else { // case 2
// We need to send the NAL unit data as FU packets. Deliver the first
// packet now. Note that we add "NAL header" and "FU header" bytes to the front
// of the packet (overwriting the existing "NAL header").
if (fHNumber == 264) {
fInputBuffer[0] = (fInputBuffer[1] & 0xE0) | 28; // FU indicator
fInputBuffer[1] = 0x80 | (fInputBuffer[1] & 0x1F); // FU header (with S bit)
}
else { // 265
u_int8_t nal_unit_type = (fInputBuffer[1] & 0x7E) >> 1;
fInputBuffer[0] = (fInputBuffer[1] & 0x81) | (49 << 1); // Payload header (1st byte)
fInputBuffer[1] = fInputBuffer[2]; // Payload header (2nd byte)
fInputBuffer[2] = 0x80 | nal_unit_type; // FU header (with S bit)
}
memmove(fTo, fInputBuffer, fMaxSize);
fFrameSize = fMaxSize;
fCurDataOffset += fMaxSize - 1;
fLastFragmentCompletedNALUnit = False;
}
}
else { // case 3
// We are sending this NAL unit data as FU packets. We've already sent the
// first packet (fragment). Now, send the next fragment. Note that we add
// "NAL header" and "FU header" bytes to the front. (We reuse these bytes that
// we already sent for the first fragment, but clear the S bit, and add the E
// bit if this is the last fragment.)
unsigned numExtraHeaderBytes;
if (fHNumber == 264) {
fInputBuffer[fCurDataOffset - 2] = fInputBuffer[0]; // FU indicator
fInputBuffer[fCurDataOffset - 1] = fInputBuffer[1] & ~0x80; // FU header (no S bit)
numExtraHeaderBytes = 2;
}
else { // 265
fInputBuffer[fCurDataOffset - 3] = fInputBuffer[0]; // Payload header (1st byte)
fInputBuffer[fCurDataOffset - 2] = fInputBuffer[1]; // Payload header (2nd byte)
fInputBuffer[fCurDataOffset - 1] = fInputBuffer[2] & ~0x80; // FU header (no S bit)
numExtraHeaderBytes = 3;
}
unsigned numBytesToSend = numExtraHeaderBytes + (fNumValidDataBytes - fCurDataOffset);
if (numBytesToSend > fMaxSize) {
// We can't send all of the remaining data this time:
numBytesToSend = fMaxSize;
fLastFragmentCompletedNALUnit = False;
}
else {
// This is the last fragment:
fInputBuffer[fCurDataOffset - 1] |= 0x40; // set the E bit in the FU header
fNumTruncatedBytes = fSaveNumTruncatedBytes;
}
memmove(fTo, &fInputBuffer[fCurDataOffset - numExtraHeaderBytes], numBytesToSend);
fFrameSize = numBytesToSend;
fCurDataOffset += numBytesToSend - numExtraHeaderBytes;
}
if (fCurDataOffset >= fNumValidDataBytes) {
// We're done with this data. Reset the pointers for receiving new data:
fNumValidDataBytes = fCurDataOffset = 1;
}
// Complete delivery to the client:
//this ends up calling MultiFramedRTPSink::afterGettingFrame
FramedSource::afterGetting(this);
}
}
H264or5Fragmenter::doGetNextFrame() does just two things, but the code deserves careful study, mainly for the roles of the various buffers. For how H.264 is packed into RTP and sent, see the packetization chapter of RFC 3984 (since superseded by RFC 6184).
1. If the buffer holds no data, fetch more through the H264VideoStreamFramer object's getNextFrame method.
2. If data is present, set up the frame's output buffer. There are three cases:
(1) A new NAL unit that is small enough: send the whole unit.
(2) A new NAL unit too large to send at once: send only the first part.
(3) A NAL unit already in progress (its first part has been sent): send the remaining part.
3. Once the frame buffer is ready, call FramedSource::afterGetting(this) to run the callback,
which in turn calls MultiFramedRTPSink::afterGettingFrame.
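For case 2, the two header bytes that overwrite the NAL unit's first byte can be computed in isolation. A minimal sketch following RFC 6184's FU-A rules (an illustration, not live555 code): the FU indicator keeps the NAL header's F/NRI bits with type 28, and the first FU header sets the S (start) bit plus the original nal_unit_type:

```cpp
#include <cstdint>

// Build the two FU-A header bytes for the FIRST fragment of an H.264 NAL
// unit, from the unit's original 1-byte NAL header (RFC 6184, FU-A).
void makeFirstFuA(uint8_t nalHeader, uint8_t& fuIndicator, uint8_t& fuHeader) {
    fuIndicator = (nalHeader & 0xE0) | 28;   // keep F+NRI bits, type = 28 (FU-A)
    fuHeader    = 0x80 | (nalHeader & 0x1F); // S bit set, original nal_unit_type
}
```

Subsequent fragments clear the S bit, and the last fragment sets the E bit instead, which is exactly what cases 2 and 3 in the live555 code above do.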
6.6. If H264or5Fragmenter::doGetNextFrame()'s buffer has no data, MPEGVideoStreamFramer::doGetNextFrame() is called to fetch more, which ends up in MPEGVideoStreamFramer::continueReadProcessing():
void MPEGVideoStreamFramer::continueReadProcessing() {
//the parser returns the actual size of one frame, or 0 if no frame could be obtained
unsigned acquiredFrameSize = fParser->parse();
if (acquiredFrameSize > 0) {//a frame was acquired
// We were able to acquire a frame from the input.
// It has already been copied to the reader's space.
fFrameSize = acquiredFrameSize;
fNumTruncatedBytes = fParser->numTruncatedBytes();
// "fPresentationTime" should have already been computed.
// Compute "fDurationInMicroseconds" now:
fDurationInMicroseconds
= (fFrameRate == 0.0 || ((int)fPictureCount) < 0) ? 0
: (unsigned)((fPictureCount * 1000000) / fFrameRate);
#ifdef DEBUG
fprintf(stderr, "%d bytes @%u.%06d, fDurationInMicroseconds: %d ((%d*1000000)/%f)\n", acquiredFrameSize, fPresentationTime.tv_sec, fPresentationTime.tv_usec, fDurationInMicroseconds, fPictureCount, fFrameRate);
#endif
fPictureCount = 0;
// Call our own 'after getting' function. Because we're not a 'leaf'
// source, we can call this directly, without risking infinite recursion.
//for H.264 transport, afterGetting runs the H264or5Fragmenter::afterGettingFrame callback
afterGetting(this);
}
else {
// We were unable to parse a complete frame from the input, because:
// - we had to read more data from the source stream, or
// - the source stream has ended.
//(as the comments above say: the parser needs more data from the source stream, or the stream has ended)
  }
}
MPEGVideoStreamFramer::continueReadProcessing() calls H264or5VideoStreamParser::parse() to extract one frame of H.264 data. If parsing fails, it is retried later; if it succeeds, afterGetting(this) is called and the data is returned through H264or5Fragmenter::afterGettingFrame.
6.7. H264or5VideoStreamParser::parse() parses one frame of H.264 data
unsigned H264or5VideoStreamParser::parse() {
try {
// The stream must start with a 0x00000001:
if (!fHaveSeenFirstStartCode) {
// Skip over any input bytes that precede the first 0x00000001:
u_int32_t first4Bytes;
while ((first4Bytes = test4Bytes()) != 0x00000001) {
get1Byte(); setParseState(); // ensures that we progress over bad data
}
skipBytes(4); // skip this initial code
setParseState();
fHaveSeenFirstStartCode = True; // from now on
}
.......
//(a NAL unit has been read into the buffer)
//determine the NAL unit type:
fHaveSeenFirstByteOfNALUnit = False; // for the next NAL unit that we'll parse
u_int8_t nal_unit_type;
if (fHNumber == 264) {
nal_unit_type = fFirstByteOfNALUnit & 0x1F;
#ifdef DEBUG
u_int8_t nal_ref_idc = (fFirstByteOfNALUnit & 0x60) >> 5;
fprintf(stderr, "Parsed %d-byte NAL-unit (nal_ref_idc: %d, nal_unit_type: %d (\"%s\"))\n",
curFrameSize() - fOutputStartCodeSize, nal_ref_idc, nal_unit_type, nal_unit_type_description_h264[nal_unit_type]);
#endif
}
else { // 265
nal_unit_type = (fFirstByteOfNALUnit & 0x7E) >> 1;
#ifdef DEBUG
fprintf(stderr, "Parsed %d-byte NAL-unit (nal_unit_type: %d (\"%s\"))\n",
curFrameSize() - fOutputStartCodeSize, nal_unit_type, nal_unit_type_description_h265[nal_unit_type]);
#endif
}
// Now that we have found (& copied) a NAL unit, process it if it's of special interest to us:
if (isVPS(nal_unit_type)) { // Video parameter set
// First, save a copy of this NAL unit, in case the downstream object wants to see it:
usingSource()->saveCopyOfVPS(fStartOfFrame + fOutputStartCodeSize, curFrameSize() - fOutputStartCodeSize);
if (fParsedFrameRate == 0.0) {
// We haven't yet parsed a frame rate from the stream.
// So parse this NAL unit to check whether frame rate information is present:
unsigned num_units_in_tick, time_scale;
analyze_video_parameter_set_data(num_units_in_tick, time_scale);
if (time_scale > 0 && num_units_in_tick > 0) {
usingSource()->fFrameRate = fParsedFrameRate
= time_scale / (DeltaTfiDivisor * num_units_in_tick);
#ifdef DEBUG
fprintf(stderr, "Set frame rate to %f fps\n", usingSource()->fFrameRate);
#endif
}
else {
#ifdef DEBUG
fprintf(stderr, "\tThis \"Video Parameter Set\" NAL unit contained no frame rate information, so we use a default frame rate of %f fps\n", usingSource()->fFrameRate);
#endif
}
}
}
else if (isSPS(nal_unit_type)) { // Sequence parameter set
// First, save a copy of this NAL unit, in case the downstream object wants to see it:
usingSource()->saveCopyOfSPS(fStartOfFrame + fOutputStartCodeSize, curFrameSize() - fOutputStartCodeSize);
if (fParsedFrameRate == 0.0) {
// We haven't yet parsed a frame rate from the stream.
// So parse this NAL unit to check whether frame rate information is present:
unsigned num_units_in_tick, time_scale;
analyze_seq_parameter_set_data(num_units_in_tick, time_scale);
if (time_scale > 0 && num_units_in_tick > 0) {
usingSource()->fFrameRate = fParsedFrameRate
= time_scale / (DeltaTfiDivisor * num_units_in_tick);
#ifdef DEBUG
fprintf(stderr, "Set frame rate to %f fps\n", usingSource()->fFrameRate);
#endif
}
else {
#ifdef DEBUG
fprintf(stderr, "\tThis \"Sequence Parameter Set\" NAL unit contained no frame rate information, so we use a default frame rate of %f fps\n", usingSource()->fFrameRate);
#endif
}
}
}
else if (isPPS(nal_unit_type)) { // Picture parameter set
// Save a copy of this NAL unit, in case the downstream object wants to see it:
usingSource()->saveCopyOfPPS(fStartOfFrame + fOutputStartCodeSize, curFrameSize() - fOutputStartCodeSize);
}
else if (isSEI(nal_unit_type)) { // Supplemental enhancement information (SEI)
analyze_sei_data(nal_unit_type);
// Later, perhaps adjust "fPresentationTime" if we saw a "pic_timing" SEI payload??? #####
}
......
}
catch (int /*e*/) {
#ifdef DEBUG
fprintf(stderr, "H264or5VideoStreamParser::parse() EXCEPTION (This is normal behavior - *not* an error)\n");
#endif
return 0; // the parsing got interrupted: the parser needs to read more data from the file
}
}
The first time parse() executes, it goes through the code shown (not elided) above: the call to test4Bytes() finds no data in the buffer and calls StreamParser::ensureValidBytes1() to fetch some:
void StreamParser::ensureValidBytes1(unsigned numBytesNeeded) {
......
fInputSource->getNextFrame(&curBank()[fTotNumValidBytes],
maxNumBytesToRead,
afterGettingBytes, this,
onInputClosure, this);
//throw immediately: data must first be fetched from the file
throw NO_MORE_BUFFERED_INPUT;
}
fInputSource is a ByteStreamFileSource object. After getNextFrame is called, ByteStreamFileSource::doGetNextFrame() eventually reads data from the file and returns it through StreamParser::afterGettingBytes. Before afterGettingBytes is invoked, StreamParser::ensureValidBytes1 throws the exception, which H264or5VideoStreamParser::parse() immediately catches; it then returns 0, indicating that no data was available for the parser this time.
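This throw/catch dance can be illustrated in isolation: parse() throws when the buffer runs dry, the caller treats that as "no frame yet", and parsing is retried once more input has arrived. A minimal sketch (hypothetical MiniParser, not live555's StreamParser):

```cpp
#include <string>

// Sketch of the parser's "throw to abort, retry later" pattern: parse()
// throws when the buffer runs dry; the catch turns that into a 0 return,
// and the caller retries once more input has been appended.
struct MiniParser {
    std::string buffer;
    size_t frameLen; // fixed frame size, for illustration only
    explicit MiniParser(size_t frameLen) : frameLen(frameLen) {}
    // Returns the size of one parsed frame, or 0 if input ran out.
    size_t parse() {
        try {
            if (buffer.size() < frameLen) throw 1; // cf. NO_MORE_BUFFERED_INPUT
            buffer.erase(0, frameLen);             // consume one frame
            return frameLen;
        } catch (int) {
            return 0; // interrupted: caller must supply more data and retry
        }
    }
};
```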
ByteStreamFileSource::doGetNextFrame() calls doReadFromFile():
void ByteStreamFileSource::doReadFromFile() {
......
// Inform the reader that he has data:
#ifdef READ_FROM_FILES_SYNCHRONOUSLY
// To avoid possible infinite recursion, we need to return to the event loop to do this:
nextTask() = envir().taskScheduler().scheduleDelayedTask(0,
(TaskFunc*)FramedSource::afterGetting, this);
#else
// Because the file read was done from the event loop, we can call the
// 'after getting' function directly, without risk of infinite recursion:
FramedSource::afterGetting(this);
#endif
}
Finally, after all these layers of wrapping, we know that obtaining one H.264 frame in live555 goes down the chain sink -> filter -> source, and the data then travels back up source -> filter -> sink through callbacks.
Let's look at the definition of FramedSource::afterGetting:
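The sink -> filter -> source pull and the source -> filter -> sink callback return can be sketched with three tiny stand-in classes (hypothetical names; synchronous, unlike live555's event-driven version):

```cpp
#include <functional>
#include <string>
#include <vector>

// MiniSource stands in for ByteStreamFileSource/H264VideoStreamFramer:
// it delivers the next frame through a completion callback.
struct MiniSource {
    std::vector<std::string> frames{"frameA", "frameB"};
    size_t next = 0;
    bool getNextFrame(const std::function<void(const std::string&)>& afterGetting) {
        if (next >= frames.size()) return false; // no more data
        afterGetting(frames[next++]);
        return true;
    }
};

// MiniFilter stands in for H264or5Fragmenter: it forwards the request
// downstream, then transforms the data on the way back up the callback chain.
struct MiniFilter {
    MiniSource* input;
    bool getNextFrame(const std::function<void(const std::string&)>& afterGetting) {
        return input->getNextFrame([&](const std::string& f) {
            afterGetting("[" + f + "]"); // placeholder for "fragmentation"
        });
    }
};

// MiniSink stands in for MultiFramedRTPSink: it keeps pulling until the
// chain reports that no more data is available.
struct MiniSink {
    MiniFilter* source;
    std::vector<std::string> sent;
    void startPlaying() {
        while (source->getNextFrame([&](const std::string& f) { sent.push_back(f); })) {}
    }
};
```

In real live555 the "return trip" happens asynchronously via FramedSource::afterGetting and the event loop, but the data path is the same shape.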
void FramedSource::afterGetting(FramedSource* source) {
source->nextTask() = NULL;
source->fIsCurrentlyAwaitingData = False;
// indicates that we can be read again
// Note that this needs to be done here, in case the "fAfterFunc"
// called below tries to read another frame (which it usually will)
//the callback below usually calls getNextFrame again after consuming the data, until all data has been read
if (source->fAfterGettingFunc != NULL) {
(*(source->fAfterGettingFunc))(source->fAfterGettingClientData,
source->fFrameSize, source->fNumTruncatedBytes,
source->fPresentationTime,
source->fDurationInMicroseconds);
}
}
If fIsCurrentlyAwaitingData is true, the data has not been returned yet and the caller must not issue another request for data. fAfterGettingFunc is assigned in getNextFrame, which here was called from StreamParser::ensureValidBytes1, so once data is obtained from the ByteStreamFileSource, StreamParser::afterGettingBytes runs, followed by StreamParser::afterGettingBytes1:
void StreamParser::afterGettingBytes1(unsigned numBytesRead, struct timeval presentationTime) {
........
fClientContinueFunc(fClientContinueClientData, ptr, numBytesRead, presentationTime);
}
fClientContinueFunc is the continuation callback run after the StreamParser has obtained data; it is assigned in StreamParser's constructor. For H.264 parsing, the parser is the H264or5VideoStreamParser object created when the H264VideoStreamFramer was built, and since H264or5VideoStreamParser inherits from MPEGVideoStreamParser, fClientContinueFunc turns out to be MPEGVideoStreamFramer::continueReadProcessing. That takes us back to step 6.6: this time parser->parse() succeeds in parsing a frame, saves the SPS and PPS data that the DESCRIBE response needs via saveCopyOfSPS and saveCopyOfPPS, and then calls afterGetting(this). That invokes H264or5Fragmenter::afterGettingFrame, which eventually reaches the last line of the else branch in step 6.5, FramedSource::afterGetting(this), and finally MultiFramedRTPSink::afterGettingFrame(). Since this article is not about the PLAY flow, MultiFramedRTPSink::afterGettingFrame() is not covered here.
Once the PPS and SPS needed for the H.264 SDP have been obtained, let's return to step 6.2: its
checkForAuxSDPLine checks whether the PPS and SPS data have arrived.
static void checkForAuxSDPLine(void* clientData) {
H264VideoFileServerMediaSubsession* subsess = (H264VideoFileServerMediaSubsession*)clientData;
subsess->checkForAuxSDPLine1();
}
void H264VideoFileServerMediaSubsession::checkForAuxSDPLine1() {
nextTask() = NULL;
char const* dasl;
if (fAuxSDPLine != NULL) {//the SDP line already exists
// Signal the event loop that we're done:
setDoneFlag();
} else if (fDummyRTPSink != NULL && (dasl = fDummyRTPSink->auxSDPLine()) != NULL) {//the SDP line was just obtained
fAuxSDPLine = strDup(dasl);
fDummyRTPSink = NULL;
// Signal the event loop that we're done:
setDoneFlag();
} else if (!fDoneFlag) {
// try again after a brief delay:
int uSecsToDelay = 100000; // 100 ms
nextTask() = envir().taskScheduler().scheduleDelayedTask(uSecsToDelay,
(TaskFunc*)checkForAuxSDPLine, this);
}
}
setDoneFlag() sets the fDoneFlag flag so that the event loop envir().taskScheduler().doEventLoop(&fDoneFlag) can stop and the next step can proceed. fDummyRTPSink->auxSDPLine() obtains the SDP line; its implementation is:
char const* H264VideoRTPSink::auxSDPLine() {
// Generate a new "a=fmtp:" line each time, using our SPS and PPS (if we have them),
// otherwise parameters from our framer source (in case they've changed since the last time that
// we were called):
H264or5VideoStreamFramer* framerSource = NULL;
u_int8_t* vpsDummy = NULL; unsigned vpsDummySize = 0;
u_int8_t* sps = fSPS; unsigned spsSize = fSPSSize;
u_int8_t* pps = fPPS; unsigned ppsSize = fPPSSize;
if (sps == NULL || pps == NULL) {
// We need to get SPS and PPS from our framer source:
if (fOurFragmenter == NULL) return NULL; // we don't yet have a fragmenter (and therefore not a source)
framerSource = (H264or5VideoStreamFramer*)(fOurFragmenter->inputSource());
if (framerSource == NULL) return NULL; // we don't yet have a source
//(these were saved during parse())
framerSource->getVPSandSPSandPPS(vpsDummy, vpsDummySize, sps, spsSize, pps, ppsSize);
if (sps == NULL || pps == NULL) return NULL; // our source isn't ready
}
// Set up the "a=fmtp:" SDP line for this stream: first strip the emulation-prevention bytes (the 03 in 0x000003) inserted by the encoder
u_int8_t* spsWEB = new u_int8_t[spsSize]; // "WEB" means "Without Emulation Bytes"
unsigned spsWEBSize = removeH264or5EmulationBytes(spsWEB, spsSize, sps, spsSize);
if (spsWEBSize < 4) { // Bad SPS size => assume our source isn't ready
delete[] spsWEB;
return NULL;
}
u_int32_t profileLevelId = (spsWEB[1]<<16) | (spsWEB[2]<<8) | spsWEB[3];
delete[] spsWEB;
char* sps_base64 = base64Encode((char*)sps, spsSize);
char* pps_base64 = base64Encode((char*)pps, ppsSize);
char const* fmtpFmt =
"a=fmtp:%d packetization-mode=1"
";profile-level-id=%06X"
";sprop-parameter-sets=%s,%s\r\n";
unsigned fmtpFmtSize = strlen(fmtpFmt)
+ 3 /* max char len */
+ 6 /* 3 bytes in hex */
+ strlen(sps_base64) + strlen(pps_base64);
char* fmtp = new char[fmtpFmtSize];
sprintf(fmtp, fmtpFmt,
rtpPayloadType(),
profileLevelId,
sps_base64, pps_base64);
delete[] sps_base64;
delete[] pps_base64;
delete[] fFmtpSDPLine; fFmtpSDPLine = fmtp;
return fFmtpSDPLine;
}
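The profile-level-id assembled above simply packs the 2nd through 4th SPS bytes (after emulation-byte removal) into 24 bits. Splitting it back into its components, e.g. for the 64001F value that appears in the response below (a sketch, not live555 code):

```cpp
#include <cstdint>

// Split an H.264 profile-level-id (3 bytes, as carried in the "a=fmtp:"
// line) back into its components.
struct ProfileLevelId { uint8_t profile_idc, constraint_flags, level_idc; };

ProfileLevelId splitProfileLevelId(uint32_t id) {
    return { static_cast<uint8_t>((id >> 16) & 0xFF),  // profile_idc
             static_cast<uint8_t>((id >> 8) & 0xFF),   // constraint_set flags
             static_cast<uint8_t>(id & 0xFF) };        // level_idc
}
```

For 64001F this gives profile_idc 0x64 (100, High profile) and level_idc 0x1F (31, level 3.1).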
7. Send the DESCRIBE response to the client
The response content is as follows:
RTSP/1.0 200 OK
CSeq: 2
Date: Fri, May 20 2022 02:24:43 GMT
Content-Base: rtsp://192.168.56.1:8554/h264ESVideoTest/
Content-Type: application/sdp
Content-Length: 530
v=0
o=- 1653013481911459 1 IN IP4 192.168.56.1
s=Session streamed by "testOnDemandRTSPServer"
i=h264ESVideoTest
t=0 0
a=tool:LIVE555 Streaming Media v2022.04.26
a=type:broadcast
a=control:*
a=range:npt=now-
a=x-qt-text-nam:Session streamed by "testOnDemandRTSPServer"
a=x-qt-text-inf:h264ESVideoTest
m=video 0 RTP/AVP 96
c=IN IP4 0.0.0.0
b=AS:500
a=rtpmap:96 H264/90000
a=fmtp:96 packetization-mode=1;profile-level-id=64001F;sprop-parameter-sets=Z2QAH6yyAeBr8v/gIgAiIgAAAwACAAADAHgeMGSQ,aOvDyyLA
a=control:track1
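As a quick sanity check on the sprop-parameter-sets values above: the first decoded byte of each base64 string is the NAL header, whose low 5 bits give nal_unit_type (7 = SPS, 8 = PPS). A small sketch of a partial base64 decode, only enough to recover that first byte (assumes at least two valid base64 characters; not live555 code):

```cpp
#include <cstdint>
#include <string>

// Decode only the first byte of a base64 string: 6 bits from char 0 plus
// the top 2 bits of char 1. Enough to identify an H.264 NAL unit type.
uint8_t firstBase64Byte(const std::string& b64) {
    auto val = [](char c) -> uint8_t {
        if (c >= 'A' && c <= 'Z') return c - 'A';
        if (c >= 'a' && c <= 'z') return c - 'a' + 26;
        if (c >= '0' && c <= '9') return c - '0' + 52;
        if (c == '+') return 62;
        return 63; // '/'
    };
    return (val(b64[0]) << 2) | (val(b64[1]) >> 4);
}
```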
Client handling of the DESCRIBE response
The client processes the server's response in handleResponseBytes:
1. Check for errors. If the receive buffer is full, remove the most recent pending request and invoke its error callback; if a network error occurred, remove all pending requests and invoke their error callbacks; otherwise go to step 2.
2. Append the received data to the buffer.
3. Check whether a complete response has arrived (all headers present, ending with a blank line). If not, return and wait for more data; if so, go to step 4.
4. Parse the headers, starting with the status-code line, then the remaining header lines.
4.1 If parsing the status code fails, check whether the data is actually a request sent by the server; if so, simply reply that the method is unimplemented, otherwise do nothing, then go to step 6. If the status code parses successfully, go to 4.2.
4.2 Loop over the remaining lines until all headers are processed, mainly: CSeq, Content-Length, Content-Base, Session,
Transport, Scale, Speed, Range, RTP-Info, WWW-Authenticate, Public, Allow, Location, com.ses.streamID, and Connection.
The following headers need extra handling; if an error occurs while parsing any header, go to step 7.
(1) For the CSeq header, extract the request sequence number and use it to find the matching request (call it foundRequest) in the queue of requests awaiting responses.
(2) For the Content-Base or Location header, reset the URL the client uses to reach the server.
(3) For the Connection header, a value of Close means the connection will be closed once the response completes, so the client should close its connection to the server.
5. Parse the response body.
5.1 If the full body has not yet arrived: if the number of bytes still needed exceeds the remaining buffer space, go to step 7; otherwise put foundRequest back at the head of the queue and process it together with the rest of the body once it arrives.
5.2 Check whether there are leftover bytes (numExtraBytesAfterResponse > 0).
6. If foundRequest is not null, act on the response code:
6.1 If the code is 200, use the parsed headers to set parameters on foundRequest's subsession and session.
6.2 If the code is 401, 301, or 302, resend the request (filling in authentication information, or the new Location address).
7. If there is extra response data, copy it to the start of the buffer; otherwise reset the buffer.
8. If foundRequest and its handler are not null, invoke the success or error callback according to the code. If there is leftover response data, go back to step 3 and continue.
When step 8 runs, the call (*foundRequest->handler())(this, resultCode, resultString) invokes testRTSPClient's continueAfterDESCRIBE callback, where resultString is the SDP description:
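Steps 3 and 4 of the list above can be sketched as a small self-contained parser. This is a simplification, not live555's actual handleResponseBytes (which parses incrementally, in place, inside the receive buffer):

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Sketch of steps 3-4: a response is "complete" once the blank line after
// the headers has arrived; then the status line and headers such as CSeq
// and Content-Length are parsed out.
struct ParsedResponse {
    bool complete = false;
    int statusCode = 0;
    std::map<std::string, std::string> headers;
};

ParsedResponse parseRtspResponse(const std::string& buf) {
    ParsedResponse r;
    size_t end = buf.find("\r\n\r\n");
    if (end == std::string::npos) return r;   // step 3: not yet a full response
    r.complete = true;
    std::istringstream in(buf.substr(0, end));
    std::string line;
    std::getline(in, line);                   // status line: "RTSP/1.0 200 OK"
    if (line.size() >= 12) r.statusCode = std::stoi(line.substr(9, 3));
    while (std::getline(in, line)) {          // step 4.2: remaining headers
        if (!line.empty() && line.back() == '\r') line.pop_back();
        size_t colon = line.find(':');
        if (colon == std::string::npos) continue;
        std::string name = line.substr(0, colon);
        size_t v = line.find_first_not_of(' ', colon + 1);
        r.headers[name] = (v == std::string::npos) ? "" : line.substr(v);
    }
    return r;
}
```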
void continueAfterDESCRIBE(RTSPClient* rtspClient, int resultCode, char* resultString) {
do {
UsageEnvironment& env = rtspClient->envir(); // alias
StreamClientState& scs = ((ourRTSPClient*)rtspClient)->scs; // alias
if (resultCode != 0) {
env << *rtspClient << "Failed to get a SDP description: " << resultString << "\n";
delete[] resultString;
break;
}
char* const sdpDescription = resultString;
env << *rtspClient << "Got a SDP description:\n" << sdpDescription << "\n";
// Create a media session object from this SDP description:
scs.session = MediaSession::createNew(env, sdpDescription);
delete[] sdpDescription; // because we don't need it anymore
if (scs.session == NULL) {
env << *rtspClient << "Failed to create a MediaSession object from the SDP description: " << env.getResultMsg() << "\n";
break;
}
else if (!scs.session->hasSubsessions()) {
env << *rtspClient << "This session has no media subsessions (i.e., no \"m=\" lines)\n";
break;
}
// Then, create and set up our data source objects for the session. We do this by iterating over the session's 'subsessions',
// calling "MediaSubsession::initiate()", and then sending a RTSP "SETUP" command, on each one.
// (Each 'subsession' will have its own data source.)
scs.iter = new MediaSubsessionIterator(*scs.session);
setupNextSubsession(rtspClient);
return;
} while (0);
// An unrecoverable error occurred with this stream.
shutdownStream(rtspClient);
}
There is little to add to this code: it creates the MediaSession and MediaSubsession objects from the SDP, then calls setupNextSubsession to send the SETUP request.
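The overall control flow that setupNextSubsession drives can be sketched without live555 (runSetupSequence is a hypothetical stand-in, not testRTSPClient code):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Self-contained sketch of testRTSPClient's setup phase: each
// continueAfterSETUP callback calls setupNextSubsession again, so every
// subsession gets its own SETUP, and once the iterator is exhausted a
// single PLAY is sent for the whole session.
std::vector<std::string> runSetupSequence(const std::vector<std::string>& tracks) {
    std::vector<std::string> commands;
    size_t i = 0;                                    // MediaSubsessionIterator position
    while (i < tracks.size()) {                      // scs.iter->next() != NULL
        commands.push_back("SETUP " + tracks[i]);    // sendSetupCommand(...)
        ++i;                                         // continueAfterSETUP -> setupNextSubsession
    }
    commands.push_back("PLAY");                      // no more subsessions: sendPlayCommand(...)
    return commands;
}
```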
H264: the startPlaying call sequence for obtaining the SDP
1. checkForAuxSDPLine
2. MediaSink::startPlaying
3. H264or5VideoRTPSink::continuePlaying() // creates fOurFragmenter
4. MultiFramedRTPSink::continuePlaying()
5. MultiFramedRTPSink::buildAndSendPacket(Boolean isFirstPacket)
6. MultiFramedRTPSink::packFrame()
7. FramedSource::getNextFrame() // the H264or5Fragmenter object fOurFragmenter
8. H264or5Fragmenter::doGetNextFrame()
9. FramedSource::getNextFrame() // the H264VideoStreamFramer object
10. MPEGVideoStreamFramer::doGetNextFrame()
11. MPEGVideoStreamFramer::continueReadProcessing()
12. H264or5VideoStreamParser::parse()
13. StreamParser::ensureValidBytes1(unsigned numBytesNeeded)
14. FramedSource::getNextFrame() // the ByteStreamFileSource object
15. ByteStreamFileSource::doGetNextFrame()
16. ByteStreamFileSource::doReadFromFile()
// the callback chain follows
17. FramedSource::afterGetting() // ByteStreamFileSource
18. StreamParser::afterGettingBytes()
19. MPEGVideoStreamFramer::continueReadProcessing()
20. H264or5VideoStreamParser::parse()
21. FramedSource::afterGetting() // the H264VideoStreamFramer object
22. H264or5Fragmenter::afterGettingFrame()
23. FramedSource::afterGetting //H264or5Fragmenter
24. MultiFramedRTPSink::afterGettingFrame
Once step 20 is reached, the H264 SDP description is available. checkForAuxSDPLine can then fetch the corresponding SDP line, set the fDoneFlag flag to exit the event loop, build the response, and finally send it to the client.
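The fDoneFlag mechanism can be sketched as a self-contained simulation (FakeFramer and getAuxSDPLineSketch are hypothetical; in live555 the loop is envir().taskScheduler().doEventLoop(&fDoneFlag), driven by a scheduled checkForAuxSDPLine task):

```cpp
#include <cassert>

// Miniature of live555's checkForAuxSDPLine pattern: keep "playing"
// frames until the parser has seen SPS/PPS, then set the done flag so
// getAuxSDPLine can return the "a=fmtp:" line.
struct FakeFramer {
    int framesParsed = 0;
    const char* auxLine = nullptr;
    void parseOneFrame() {               // stands in for H264or5VideoStreamParser::parse()
        if (++framesParsed >= 2)         // assume SPS+PPS are seen by the 2nd NAL unit
            auxLine = "a=fmtp:96 packetization-mode=1;...";
    }
};

// Stands in for the subsession's polling loop around doEventLoop(&fDoneFlag).
const char* getAuxSDPLineSketch(FakeFramer& framer) {
    char doneFlag = 0;
    while (!doneFlag) {                  // simplified doEventLoop(&fDoneFlag)
        if (framer.auxLine != nullptr) {
            doneFlag = ~0;               // checkForAuxSDPLine: setDoneFlag()
        } else {
            framer.parseOneFrame();      // otherwise keep reading/parsing frames
        }
    }
    return framer.auxLine;
}
```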
PS: If this article helped you understand live555, a like is what keeps the series going.