RFC 3550 RTP/RTCP: Chinese-English Parallel Text of the First Four Chapters

abcd552191868

Last modified: 2022-04-22 13:49:45

First published: 2022-04-22 12:09:48

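Note

This article is a Chinese-English parallel text of RFC 3550 with some of my own interpretations added. Those interpretations may be incorrect, and readers are free to ignore them.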

Abstract 摘要

This memorandum describes RTP, the real-time transport protocol. RTP provides end-to-end network transport functions suitable for applications transmitting real-time data, such as audio, video or simulation data, over multicast or unicast network services. RTP does not address resource reservation and does not guarantee quality-of-service for real-time services. The data transport is augmented by a control protocol (RTCP) to allow monitoring of the data delivery in a manner scalable to large multicast networks, and to provide minimal control and identification functionality. RTP and RTCP are designed to be independent of the underlying transport and network layers. The protocol supports the use of RTP-level translators and mixers.
本文档描述了实时传输协议RTP。RTP提供端到端网络传输功能，适用于通过多播或单播网络服务传输实时数据（如音频、视频或仿真数据）的应用程序。RTP不处理资源预留，也不保证实时服务的服务质量。数据传输由一个控制协议（RTCP）加以扩充，从而能够以可扩展到大型多播网络的方式监视数据传送，并提供最低限度的控制和标识功能。RTP和RTCP被设计成独立于底层传输层和网络层。该协议支持使用RTP级翻译器和混频器。

1.Introduction 简介

This memorandum specifies the real-time transport protocol (RTP), which provides end-to-end delivery services for data with real-time characteristics, such as interactive audio and video. Those services include payload type identification, sequence numbering, timestamping and delivery monitoring. Applications typically run RTP on top of UDP to make use of its multiplexing and checksum services; both protocols contribute parts of the transport protocol functionality. However, RTP may be used with other suitable underlying network or transport protocols (see Section 11). RTP supports data transfer to multiple destinations using multicast distribution if provided by the underlying network.
本文详细说明实时传输协议RTP。RTP为具有实时特性的数据（如交互式音频和视频）提供端到端的传输服务。这些服务包括有效载荷类型标识、序列编号、时间戳和传输监测。应用程序通常在UDP之上运行RTP，以利用UDP的多路复用和校验和服务；两种协议各自提供传输协议的部分功能。不过，RTP也可以与其他合适的下层网络或传输协议一起使用（见第11节）。如果下层网络提供多播分发，RTP支持将数据传送到多个目的地。

Note that RTP itself does not provide any mechanism to ensure timely delivery or provide other quality-of-service guarantees, but relies on lower-layer services to do so. It does not guarantee delivery or prevent out-of-order delivery, nor does it assume that the underlying network is reliable and delivers packets in sequence. The sequence numbers included in RTP allow the receiver to reconstruct the sender’s packet sequence, but sequence numbers might also be used to determine the proper location of a packet, for example in video decoding, without necessarily decoding packets in sequence.
注意，RTP本身没有提供任何机制来确保及时传输或提供其他服务质量保证，而是依赖低层服务来完成。它不保证送达，也不防止乱序送达，也不假定下层网络可靠并按顺序传送数据包。RTP包含的序列号允许接收方重构发送方的数据包顺序，但序列号也可以用来确定一个数据包的正确位置，例如在视频解码时，不必按顺序对数据包进行解码。(Note: the receiver uses the sequence numbers to put the sender's data back in order.)

While RTP is primarily designed to satisfy the needs of multi-participant multimedia conferences, it is not limited to that particular application. Storage of continuous data, interactive distributed simulation, active badge, and control and measurement applications may also find RTP applicable.
虽然RTP起初的设计是用来满足多参与者的多媒体会议的需要，但是它没有限定于专门的应用。连续数据的储存，交互分布式仿真，动态标记，以及控制和测量应用程序也可能会适合使用RTP。

This document defines RTP, consisting of two closely-linked parts:
本文档定义了RTP由两个紧密关联的部分组成：

the real-time transport protocol (RTP), to carry data that has real-time properties.
实时传输协议（RTP），以传输具有实时属性的数据。

the RTP control protocol (RTCP), to monitor the quality of service and to convey information about the participants in an on-going session. The latter aspect of RTCP may be sufficient for “loosely controlled” sessions, i.e., where there is no explicit membership control and set-up, but it is not necessarily intended to support all of an application’s control communication requirements. This functionality may be fully or partially subsumed by a separate session control protocol, which is beyond the scope of this document.
RTP控制协议（RTCP），用于监控服务质量，并传达正在进行的会话中参与者的信息。RTCP的后一项功能对于“松散控制”的会话（即没有显式成员控制和建立过程的会话）可能已经足够，但它并不一定要支持应用程序的全部控制通信需求。这项功能可能全部或部分地由一个单独的会话控制协议承担，这超出了本文档的范围。

RTP represents a new style of protocol following the principles of application level framing and integrated layer processing proposed by Clark and Tennenhouse [10]. That is, RTP is intended to be malleable to provide the information required by a particular application and will often be integrated into the application processing rather than being implemented as a separate layer. RTP is a protocol framework that is deliberately not complete. This document specifies those functions expected to be common across all the applications for which RTP would be appropriate. Unlike conventional protocols in which additional functions might be accommodated by making the protocol more general or by adding an option mechanism that would require parsing, RTP is intended to be tailored through modifications and/or additions to the headers as needed. Examples are given in Sections 5.3 and 6.4.3.
RTP代表了一种新的协议风格，它遵循Clark和Tennenhouse提出的应用层成帧（application level framing）和集成层处理（integrated layer processing）原则[10]。也就是说，RTP被设计为可塑的，以提供特定应用程序所需的信息，并且通常会被集成到应用程序的处理中，而不是作为一个单独的层来实现。RTP是一个有意保持不完整的协议框架。本文档规定了那些预期在所有适合使用RTP的应用程序中通用的功能。常规协议可能通过使协议更通用或增加需要解析的选项机制来容纳额外功能；与此不同，RTP旨在根据需要通过修改和/或增加报头来进行定制。具体的例子见5.3和6.4.3节。

Therefore, in addition to this document, a complete specification of RTP for a particular application will require one or more companion documents (see Section 13):
因此，除了本文档之外，针对特定应用的完整RTP规范还需要一个或多个配套文档（参见第13节）：

a profile specification document, which defines a set of payload type codes and their mapping to payload formats (e.g., media encodings). A profile may also define extensions or modifications to RTP that are specific to a particular class of applications. Typically an application will operate under only one profile. A profile for audio and video data may be found in the companion RFC 3551 [1].
• 配置文件（profile）规范文档，它定义了一组有效负载类型代码及其到有效负载格式（例如媒体编码）的映射。配置文件还可以定义特定于某一类应用程序的RTP扩展或修改。通常一个应用程序只在一个配置文件下运行。音频和视频数据的配置文件可以在配套的RFC 3551[1]中找到。

payload format specification documents, which define how a particular payload, such as an audio or video encoding, is to be carried in RTP.
• 有效载荷格式规范文件，其中定义了如何在RTP中携带特定的有效载荷，例如音频或视频编码。

A discussion of real-time services and algorithms for their implementation as well as background discussion on some of the RTP design decisions can be found in [11].
关于实时服务及其实现算法的讨论，以及一些RTP设计决策的背景讨论，见[11]。

1.1 Terminology 术语

The key words “must”, “must not”, “required”, “shall”, “shall not”, “should”, “should not”, “recommended”, “may”, and “optional” in this document are to be interpreted as described in BCP 14, RFC 2119 [2] and indicate requirement levels for compliant RTP implementations.
本文件中的关键词“必须”、“一定不能”、“必需的”、“会”、“不会”、“应该”、“不应该”、“推荐”、“可能”和“可选”应按照BCP14、RFC2119[2]所述进行解释，并说明符合RTP实施的要求级别。

2.RTP Use Scenarios 使用场景

The following sections describe some aspects of the use of RTP. The examples were chosen to illustrate the basic operation of applications using RTP, not to limit what RTP may be used for. In these examples, RTP is carried on top of IP and UDP, and follows the conventions established by the profile for audio and video specified in the companion RFC 3551.
以下章节描述了用到RTP的一些方面.所举例子用来说明RTP应用的基本操作,但RTP的应用不限于此,在这些例子中,RTP运行于IP和UDP之上,并且遵循RFC3551所描述的音频和视频的配置文件中的约定.

2.1 Simple Multicast Audio Conference 简单多播音频会议

A working group of the IETF meets to discuss the latest protocol document, using the IP multicast services of the Internet for voice communications. Through some allocation mechanism the working group chair obtains a multicast group address and pair of ports. One port is used for audio data, and the other is used for control (RTCP) packets. This address and port information is distributed to the intended participants. If privacy is desired, the data and control packets may be encrypted as specified in Section 9.1, in which case an encryption key must also be generated and distributed. The exact details of these allocation and distribution mechanisms are beyond the scope of RTP.
IETF的一个工作组开会讨论最新的协议文档，使用Internet的IP多播服务进行语音通信。通过某种分配机制，工作组主席获得了一个多播组地址和一对端口。一个端口用于音频数据，另一个端口用于控制（RTCP）数据包。此地址和端口信息将被分发给预期的参与者。如果需要保密，则可以按照9.1节的规定对数据和控制包进行加密，在这种情况下，还必须生成并分发加密密钥。这些分配和分发机制的确切细节超出了RTP的范围。

The audio conferencing application used by each conference participant sends audio data in small chunks of, say, 20 ms duration. Each chunk of audio data is preceded by an RTP header; RTP header and data are in turn contained in a UDP packet. The RTP header indicates what type of audio encoding (such as PCM, ADPCM or LPC) is contained in each packet so that senders can change the encoding during a conference, for example, to accommodate a new participant that is connected through a low-bandwidth link or react to indications of network congestion.
每个与会者所使用的音频会议应用程序，都以小块形式（比方说持续20毫秒时间）来发送音频数据。每个音频数据块前面都有RTP头；RTP头和数据依次包含在UDP数据包中。RTP头指示每个分组中包含什么类型的音频编码（例如PCM、ADPCM或LPC），以便发送方可以在会议期间更改编码，例如，要加进一个低带宽接入的参与者，或是要应付网络拥塞。

The Internet, like other packet networks, occasionally loses and reorders packets and delays them by variable amounts of time. To cope with these impairments, the RTP header contains timing information and a sequence number that allow the receivers to reconstruct the timing produced by the source, so that in this example, chunks of audio are contiguously played out the speaker every 20 ms. This timing reconstruction is performed separately for each source of RTP packets in the conference. The sequence number can also be used by the receiver to estimate how many packets are being lost.
互联网和其他分组网络一样，偶尔会丢失和重新排序分组，造成时长不等的延迟。为了弥补这些不足，RTP报头里包含了计时信息和一个序列号，允许接收方重建来自数据源的计时信息，比如上面提及的音频块以20ms的间隔在扬声器中连续播放。会议中，对每个RTP包的源,单独地实施计时重建。序列号还被接收方用来评估丢失包数目。
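
The loss estimate mentioned above can be sketched in a few lines. This is a minimal illustration, not the RFC's reference algorithm: the function name `estimate_loss` is hypothetical, the 16-bit width of the sequence number comes from Section 5.1 (outside this excerpt), and duplicates are simply counted as received packets.

```python
def estimate_loss(received_seqs):
    """Return (expected, lost) for a list of received 16-bit sequence numbers.

    Assumes the list is in arrival order and that reordering never
    exceeds half the sequence space (32768 packets).
    """
    if not received_seqs:
        return 0, 0
    base_seq = received_seqs[0]   # first sequence number seen
    max_seq = base_seq            # highest sequence number seen so far
    cycles = 0                    # number of 16-bit wraparounds observed
    for seq in received_seqs[1:]:
        # Forward distance from max_seq, computed modulo 2**16.
        delta = (seq - max_seq) & 0xFFFF
        if 0 < delta < 0x8000:    # packet is ahead of max_seq
            if seq < max_seq:     # numerically smaller: the counter wrapped
                cycles += 1
            max_seq = seq
        # Otherwise the packet is a duplicate or arrived late; ignore it here.
    extended_max = cycles * 0x10000 + max_seq
    expected = extended_max - base_seq + 1
    lost = expected - len(received_seqs)
    return expected, lost
```

The count of expected packets comes from the span between the first and the highest sequence number, so a late packet that eventually arrives lowers the loss figure without changing the expected count.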

Since members of the working group join and leave during the conference, it is useful to know who is participating at any moment and how well they are receiving the audio data. For that purpose, each instance of the audio application in the conference periodically multicasts a reception report plus the name of its user on the RTCP (control) port. The reception report indicates how well the current speaker is being received and may be used to control adaptive encodings. In addition to the user name, other identifying information may also be included subject to control bandwidth limits. A site sends the RTCP BYE packet (Section 6.6) when it leaves the conference.
由于会议期间不断有工作组成员加入或者离开，因此有必要知道任一时刻的实际参与者以及他们接收音频数据的状况好坏。出于这个目的，会议中音频应用程序的每个实例都在RTCP（控制）端口上周期性地多播一个接收报告（reception report），并附上其用户的名字。接收报告指明了当前说话者被接收的状况，可用于控制自适应编码。除了用户名外，在控制带宽限制允许的范围内，还可以包含其他标识信息。一个站点在离开会议时发送RTCP BYE包（第6.6节）。

2.2 Audio and Video Conference 音视频会议

If both audio and video media are used in a conference, they are transmitted as separate RTP sessions. That is, separate RTP and RTCP packets are transmitted for each medium using two different UDP port pairs and/or multicast addresses. There is no direct coupling at the RTP level between the audio and video sessions, except that a user participating in both sessions should use the same distinguished (canonical) name in the RTCP packets for both so that the sessions can be associated.
如果在会议中同时使用音频和视频媒体，它们将作为单独的RTP会话进行传输。也就是说，使用两个不同的UDP端口对和/或多播地址为每种媒体传输单独的RTP和RTCP包。音频和视频会话之间在RTP级别没有直接耦合，唯一的例外是，同时参与这两个会话的用户应该在两个会话的RTCP包中使用相同的可分辨（规范）名称，以便这两个会话可以关联起来。

One motivation for this separation is to allow some participants in the conference to receive only one medium if they choose. Further explanation is given in Section 5.2. Despite the separation, synchronized playback of a source’s audio and video can be achieved using timing information carried in the RTCP packets for both sessions.
这种分离的一个动机是，允许会议的某些参与者按自己的选择只接收一种媒体（音频或者视频）。第5.2节给出了进一步的解释。尽管存在分离，但是可以使用两个会话的RTCP包中携带的定时信息来实现同一数据源的音视频同步回放。(Note: how is synchronization achieved? Through timestamps. Which clock do those timestamps use?)
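
One way to picture the synchronization mentioned above: RTCP sender reports (described in Section 6.4.1, outside this excerpt) pair a wallclock time with the RTP timestamp for the same instant, so a receiver can place each stream's RTP timestamps on a shared time line. The sketch below uses hypothetical names (`rtp_to_wallclock` and its parameters) and example clock rates of 8000 Hz for audio and 90000 Hz for video. It is an illustration under those assumptions, not a complete lip-sync implementation.

```python
def rtp_to_wallclock(rtp_ts, sr_rtp_ts, sr_wallclock, clock_rate):
    """Map a 32-bit RTP timestamp to wallclock seconds.

    sr_rtp_ts and sr_wallclock are the RTP timestamp and the wallclock
    time (in seconds) taken from the same sender report; clock_rate is
    the media clock in Hz.
    """
    # Signed difference modulo 2**32, so timestamp wraparound is harmless.
    delta = (rtp_ts - sr_rtp_ts) & 0xFFFFFFFF
    if delta >= 0x80000000:
        delta -= 0x100000000
    return sr_wallclock + delta / clock_rate


# Example values (assumed): one audio and one video packet from the same source.
audio_time = rtp_to_wallclock(9000, 1000, 100.0, 8000)
video_time = rtp_to_wallclock(122000, 50000, 100.2, 90000)
```

Packets from the two sessions whose mapped wallclock times match were captured at the same moment and can be played out together.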

2.3 Mixers and Translators 混频器和翻译器

So far, we have assumed that all sites want to receive media data in the same format. However, this may not always be appropriate. Consider the case where participants in one area are connected through a low-speed link to the majority of the conference participants who enjoy high-speed network access. Instead of forcing everyone to use a lower-bandwidth, reduced-quality audio encoding, an RTP-level relay called a mixer may be placed near the low-bandwidth area. This mixer resynchronizes incoming audio packets to reconstruct the constant 20 ms spacing generated by the sender, mixes these reconstructed audio streams into a single stream, translates the audio encoding to a lower-bandwidth one and forwards the lower-bandwidth packet stream across the low-speed link. These packets might be unicast to a single recipient or multicast on a different address to multiple recipients. The RTP header includes a means for mixers to identify the sources that contributed to a mixed packet so that correct talker indication can be provided at the receivers.
到目前为止，我们假设所有站点(sites)都希望接收相同格式的媒体数据。然而这并不总是合适的。考虑这样一种情况，即一个区域的参与者通过低速链路连接到大多数享受高速网络访问的会议参与者。不必强迫每个人使用较低带宽、质量较低的音频编码，而是在低带宽区域附近放置一个称为混频器的RTP级中继。该混频器重新同步传入的音频包，以重建发送方生成的恒定20毫秒间隔，将这些重建的音频流混合为单个流，将音频编码转换为较低带宽的编码，并通过低速链路转发较低带宽的数据包流。这些数据包可以单播到单个接收方，也可以在不同的地址上多播到多个接收方。RTP报头包含一种手段，使混频器能够标识对混合包有贡献的源，从而可以在接收方提供正确的说话者指示。(Note: the RTP header carries a field listing which senders' data the mixer combined, so the receiver can tell who contributed.)

Some of the intended participants in the audio conference may be connected with high bandwidth links but might not be directly reachable via IP multicast. For example, they might be behind an application-level firewall that will not let any IP packets pass. For these sites, mixing may not be necessary, in which case another type of RTP-level relay called a translator may be used.
音频会议中的某些预期参与者可能通过高带宽链路连接，但可能无法通过IP多播直接到达。例如，他们可能位于不允许任何IP数据包通过的应用级防火墙后面。对于这些站点，可能不需要混频，在这种情况下，可以使用另一种称为翻译器的RTP级中继。(Interpretation: when receivers sit behind an application-level firewall that blocks IP packets, a translator is what gets the stream through.)

Two translators are installed, one on either side of the firewall, with the outside one funneling all multicast packets received through a secure connection to the translator inside the firewall. The translator inside the firewall sends them again as multicast packets to a multicast group restricted to the site’s internal network.
在防火墙的两侧各安装一个翻译器，外侧的翻译器把收到的所有多播数据包通过一条安全连接汇集传送给防火墙内侧的翻译器。防火墙内侧的翻译器再把它们作为多播数据包发送到仅限于站点内部网络的多播组（multicast group）。

Mixers and translators may be designed for a variety of purposes. An example is a video mixer that scales the images of individual people in separate video streams and composites them into one video stream to simulate a group scene. Other examples of translation include the connection of a group of hosts speaking only IP/UDP to a group of hosts that understand only ST-II, or the packet-by-packet encoding translation of video streams from individual sources without resynchronization or mixing. Details of the operation of mixers and translators are given in Section 7.
混频器和翻译器可以设计为多种用途。例如一个视频混频器，它将各个单独视频流中的人物图像缩放，并合成为一个视频流，以模拟群体场景。翻译的其他示例包括：将只使用IP/UDP的一组主机连接到只理解ST-II的一组主机 (note: here the translator acts as middleware)，或者对来自各个源的视频流进行逐包的编码转换，而不进行重新同步或混合。混频器和翻译器的操作详情见第7节。

2.4 Layered Encodings 分层编码

Multimedia applications should be able to adjust the transmission rate to match the capacity of the receiver or to adapt to network congestion. Many implementations place the responsibility of rate-adaptivity at the source. This does not work well with multicast transmission because of the conflicting bandwidth requirements of heterogeneous receivers. The result is often a least-common denominator scenario, where the smallest pipe in the network mesh dictates the quality and fidelity of the overall live multimedia “broadcast”.
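多媒体应用程序应该能够调整传输速率以匹配接收端的容量或适应网络拥塞。许多实现都将速率自适应的责任放在了源端(发送端)。由于异构接收方的带宽要求相互冲突，这不利于多播传输。其结果通常是一个最小公分母的场景，其中网络网格中最小的管道决定了整个实时多媒体“广播”的质量和保真度。(My understanding: the transmission rate should not be controlled by the sender alone; if it is, the sender may end up transmitting at the rate the slowest receiver can accept.)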

Instead, responsibility for rate-adaptation can be placed at the receivers by combining a layered encoding with a layered transmission system. In the context of RTP over IP multicast, the source can stripe the progressive layers of a hierarchically represented signal across multiple RTP sessions each carried on its own multicast group. Receivers can then adapt to network heterogeneity and control their reception bandwidth by joining only the appropriate subset of the multicast groups. Details of the use of RTP with layered encodings are given in Sections 6.3.9, 8.3 and 11.
相反，通过将分层编码与分层传输系统相结合，速率自适应的责任可以交给接收方。在基于IP多播的RTP环境中，源可以把分层表示的信号的各个渐进层分散到多个RTP会话上，每个会话都在其自己的多播组上传送。接收方随后只需加入多播组的适当子集，就可以适应网络的异构性并控制其接收带宽。使用具有分层编码的RTP的详细情况见第6.3.9、第8.3和第11节。
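
The receiver-side choice described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `layers_to_join` and the bit rates are hypothetical, layers are assumed to be cumulative (layer 0 is the base layer and each further layer only helps if all lower ones are also received), and the actual joining of multicast groups is left out.

```python
def layers_to_join(layer_rates_kbps, available_kbps):
    """Return the indices of the layers a receiver should subscribe to.

    layer_rates_kbps lists the bit rate of each layer, base layer first.
    The receiver adds layers in order until the next one would exceed
    its available bandwidth.
    """
    joined = []
    total = 0
    for index, rate in enumerate(layer_rates_kbps):
        if total + rate > available_kbps:
            break  # higher layers depend on this one, so stop here
        total += rate
        joined.append(index)
    return joined
```

A receiver on a 200 kbit/s link facing layers of 64, 128 and 256 kbit/s would join the first two groups only, while a receiver with more capacity joins all three. The sender transmits the same layers in either case.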

3.Definitions 定义

RTP payload: The data transported by RTP in a packet, for example audio samples or compressed video data. The payload format and interpretation are beyond the scope of this document.
RTP有效载荷：RTP在一个包中传输的数据，例如音频样本或压缩的视频数据。有效载荷的格式和解释超出了本文件的范围。
Related page: RTP payload types and media types

RTP packet: A data packet consisting of the fixed RTP header, a possibly empty list of contributing sources (see below), and the payload data. Some underlying protocols may require an encapsulation of the RTP packet to be defined. Typically one packet of the underlying protocol contains a single RTP packet, but several RTP packets may be contained if permitted by the encapsulation method (see Section 11).
RTP数据包：由固定的RTP头、可能为空的贡献源列表（见下文）以及负载数据（RTP payload）组成的数据包。某些下层协议可能要求对RTP数据包的封装进行定义。通常，下层协议的一个包包含单个RTP包，但如果封装方法允许，也可以包含多个RTP包（参见第11节）。

RTCP packet: A control packet consisting of a fixed header part similar to that of RTP data packets, followed by structured elements that vary depending upon the RTCP packet type. The formats are defined in Section 6. Typically, multiple RTCP packets are sent together as a compound RTCP packet in a single packet of the underlying protocol; this is enabled by the length field in the fixed header of each RTCP packet.
RTCP包：一种控制数据包，由类似于RTP数据包的固定报头部分以及其后随RTCP包类型而变化的结构化元素组成。这些格式在第6节中定义。通常，多个RTCP包会作为一个复合RTCP包，放在下层协议的单个包中一起发送；这是通过每个RTCP包固定头中的长度字段来实现的。

Port: The “abstraction that transport protocols use to distinguish among multiple destinations within a given host computer. TCP/IP protocols identify ports using small positive integers.” [12] The transport selectors (TSEL) used by the OSI transport layer are equivalent to ports. RTP depends upon the lower-layer protocol to provide some mechanism such as ports to multiplex the RTP and RTCP packets of a session.
端口：“传输协议用来区分给定主机内多个目的地的抽象。TCP/IP协议使用小的正整数来标识端口。”[12] OSI传输层使用的传输选择器（TSEL）相当于端口。RTP依赖较低层协议提供某种机制（例如端口）来复用一个会话的RTP和RTCP包。(Note: ports are how the RTP and RTCP packets of one session are told apart; in the conference example of Section 2.1 a pair of ports is used, one for data and one for RTCP.)

Transport address: The combination of a network address and port that identifies a transport-level endpoint, for example an IP address and a UDP port. Packets are transmitted from a source transport address to a destination transport address.
传输地址：网络地址和端口的组合，用来标识一个传输层次的终端，例如，一个IP地址和一个UDP端口。数据包从源传输地址传输到目标传输地址。

RTP media type: An RTP media type is the collection of payload types which can be carried within a single RTP session. The RTP Profile assigns RTP media types to RTP payload types.
RTP媒体类型：RTP媒体类型是可以在单个RTP会话中承载的有效负载类型的集合。RTP配置文件将RTP媒体类型分配给RTP有效负载类型。(Note: one media type corresponds to several payload types, and the profile defines that mapping.)
Related page: RTP payload types and media types

Multimedia session: A set of concurrent RTP sessions among a common group of participants. For example, a videoconference (which is a multimedia session) may contain an audio RTP session and a video RTP session.
多媒体会话：一组公共参与者之间的一组并发的RTP会话。例如，视频会议（即多媒体会话）可以包含音频RTP会话和视频RTP会话。

RTP session: An association among a set of participants communicating with RTP. A participant may be involved in multiple RTP sessions at the same time. In a multimedia session, each medium is typically carried in a separate RTP session with its own RTCP packets unless the encoding itself multiplexes multiple media into a single data stream. A participant distinguishes multiple RTP sessions by reception of different sessions using different pairs of destination transport addresses, where a pair of transport addresses comprises one network address plus a pair of ports for RTP and RTCP. All participants in an RTP session may share a common destination transport address pair, as in the case of IP multicast, or the pairs may be different for each participant, as in the case of individual unicast network addresses and port pairs. In the unicast case, a participant may receive from all other participants in the session using the same pair of ports, or may use a distinct pair of ports for each.
RTP会话：一组使用RTP通信的参与者之间的关联。一个参与者可以同时参与多个RTP会话。在多媒体会话中，除非编码本身将多个媒体复用到单个数据流中，否则每种媒体通常在单独的RTP会话中传送，并带有其自己的RTCP包。参与者通过使用不同的目的传输地址对接收不同的会话来区分多个RTP会话，其中一对传输地址包括一个网络地址加上一对用于RTP和RTCP的端口。(Note: different media are carried in different sessions; a session includes both RTP and RTCP packets; a participant may be in several sessions at once and tells them apart by the destination transport address pair, which consists of one network address plus a pair of ports.) RTP会话中的所有参与者可以共享公共的目的传输地址对，如在IP多播的情况下；或者每个参与者的地址对可以不同，如在各自使用单播网络地址和端口对的情况下。在单播情况下，参与者可以使用相同的端口对从会话中的所有其他参与者接收，也可以为每个参与者使用不同的端口对。

The distinguishing feature of an RTP session is that each maintains a full, separate space of SSRC identifiers (defined next). The set of participants included in one RTP session consists of those that can receive an SSRC identifier transmitted by any one of the participants either in RTP as the SSRC or a CSRC (also defined below) or in RTCP. For example, consider a three-party conference implemented using unicast UDP with each participant receiving from the other two on separate port pairs. If each participant sends RTCP feedback about data received from one other participant only back to that participant, then the conference is composed of three separate point-to-point RTP sessions. If each participant provides RTCP feedback about its reception of one other participant to both of the other participants, then the conference is composed of one multi-party RTP session. The latter case simulates the behavior that would occur with IP multicast communication among the three participants. (Very important: this is the key to understanding RTP sessions.)
RTP会话的显著特征是，每个会话都维护一个完整的、单独的SSRC标识符空间（定义见下文）。一个RTP会话所包含的参与者集合，由那些能够接收到任一参与者所发送的SSRC标识符的参与者组成，该标识符可以在RTP中作为SSRC或CSRC（定义也见下文）传输，也可以在RTCP中传输。例如，考虑使用单播UDP实现的三方会议，每个参与者在单独的端口对上接收来自其他两个参与者的数据。如果每个参与者仅将关于从某一个参与者收到的数据的RTCP反馈发送回该参与者，则会议由三个独立的点对点RTP会话组成。如果每个参与者把关于其接收某一个参与者的RTCP反馈同时提供给另外两个参与者，则会议由一个多方RTP会话组成。后一种情况模拟了三个参与者之间使用IP多播通信时的行为。(Note: the participants of an RTP session are those who can receive the SSRC identifiers sent by any participant, whether as SSRC or CSRC in RTP or in RTCP. The key point is the multicast-like behavior: when feedback goes to all other participants, everyone can see who is taking part in the session.)

The RTP framework allows the variations defined here, but a particular control protocol or application design will usually impose constraints on these variations.
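RTP框架允许此处定义的变体，但是特定的控制协议或应用程序设计通常会对这些变体施加约束。(My understanding: the RTP framework is deliberately incomplete and leaves room for variation, but a concrete application constrains those variations rather than changing things arbitrarily.)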

Synchronization source (SSRC): The source of a stream of RTP packets, identified by a 32-bit numeric SSRC identifier carried in the RTP header so as not to be dependent upon the network address. All packets from a synchronization source form part of the same timing and sequence number space, so a receiver groups packets by synchronization source for playback. Examples of synchronization sources include the sender of a stream of packets derived from a signal source such as a microphone or a camera, or an RTP mixer (see below). A synchronization source may change its data format, e.g., audio encoding, over time. The SSRC identifier is a randomly chosen value meant to be globally unique within a particular RTP session (see Section 8). A participant need not use the same SSRC identifier for all the RTP sessions in a multimedia session; the binding of the SSRC identifiers is provided through RTCP (see Section 6.5.1). If a participant generates multiple streams in one RTP session, for example from separate video cameras, each must be identified as a different SSRC.
同步源(SSRC，Synchronization source)：RTP包流的源，用RTP报头中32位数值的 SSRC 标识符进行标识，使其不依赖于网络地址。一个同步源的所有包构成了相同计时（timing）和序列号空间的一部分（所有包构成了一个顺序流），这样接收方就可以把一个同步源的包组织在一起进行回放。举些同步源的例子，像来自同一信号源的包流的发送方，如麦克风、摄影机、RTP混频器（见下文）就是同步源。一个同步源可能随着时间变化而改变其数据格式，如音频编码。SSRC 标识符是一个随机选取的值，它在特定的 RTP 会话中是全局唯一（globally unique）的（见章节 8）。参与者并不需要在一个多媒体会议的所有 RTP 会话中，使用相同的 SSRC 标识符；通过RTCP绑定SSRC的标识符（见章节 6.5.1）。如果参与者在一个 RTP 会话中生成了多个流，例如来自多个摄影机，则每个摄影机都必须标识成单独的同步源。

Contributing source (CSRC): A source of a stream of RTP packets that has contributed to the combined stream produced by an RTP mixer (see below). The mixer inserts a list of the SSRC identifiers of the sources that contributed to the generation of a particular packet into the RTP header of that packet. This list is called the CSRC list. An example application is audio conferencing where a mixer indicates all the talkers whose speech was combined to produce the outgoing packet, allowing the receiver to indicate the current talker, even though all the audio packets contain the same SSRC identifier (that of the mixer).
贡献源（CSRC）：对RTP混频器产生的组合流有所贡献的RTP数据包流的源（见下文）。混频器把对某个特定数据包的生成有贡献的各个源的SSRC标识符列表插入该数据包的RTP头中。此列表称为CSRC列表。一个示例应用是音频会议，其中混频器指出其语音被合并到传出数据包中的所有说话者，使接收方能够指示当前的说话者，即使所有音频包都包含相同的SSRC标识符（即混频器的标识符）。(Note: the CSRC list only applies to mixers; it records the original sources behind a mixed packet so the receiver knows who the senders were.)
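
To make the SSRC and CSRC roles concrete, here is a minimal sketch of how a mixer might write them into a packet header. The field layout (version, 4-bit CSRC count, payload type, sequence number, timestamp, SSRC, then up to 15 CSRC entries) is taken from Section 5.1 of the RFC, which is outside this excerpt; the function names and example identifiers are hypothetical.

```python
import struct

def build_mixed_header(mixer_ssrc, contributing_ssrcs, seq, timestamp, payload_type):
    """Build an RTP header whose SSRC is the mixer and whose CSRC list
    names the original sources of the mixed packet."""
    csrcs = list(contributing_ssrcs)[:15]   # the CSRC count field is 4 bits wide
    byte0 = (2 << 6) | len(csrcs)           # version 2, no padding, no extension
    byte1 = payload_type & 0x7F             # marker bit left at 0
    header = struct.pack("!BBHII", byte0, byte1, seq & 0xFFFF,
                         timestamp & 0xFFFFFFFF, mixer_ssrc)
    # Each CSRC entry is the 32-bit SSRC of one contributing source.
    return header + b"".join(struct.pack("!I", c) for c in csrcs)

def parse_csrcs(header):
    """Return the CSRC list carried in a header built as above."""
    count = header[0] & 0x0F
    return list(struct.unpack("!%dI" % count, header[12:12 + 4 * count]))
```

A receiver calling `parse_csrcs` on a mixed packet gets the identifiers of the talkers, while the SSRC field at bytes 8 to 11 still names the mixer.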

End system: An application that generates the content to be sent in RTP packets and/or consumes the content of received RTP packets. An end system can act as one or more synchronization sources in a particular RTP session, but typically only one.
终端系统（End system)：生成要在RTP数据包中发送的内容和/或使用（消费）所接收RTP数据包内容的应用程序。一个终端系统可以在一个特定的RTP会话中充当一个或多个同步源，但通常只有一个。

Mixer: An intermediate system that receives RTP packets from one or more sources, possibly changes the data format, combines the packets in some manner and then forwards a new RTP packet. Since the timing among multiple input sources will not generally be synchronized, the mixer will make timing adjustments among the streams and generate its own timing for the combined stream. Thus, all data packets originating from a mixer will be identified as having the mixer as their synchronization source.
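混频器：一种中间系统，它从一个或多个源接收RTP数据包，可能改变数据格式，以某种方式组合数据包，然后转发一个新的RTP数据包。由于多个输入源之间的定时通常不会同步，因此混频器将在流之间进行定时调整，并为组合流生成其自己的定时。因此，来自混频器的所有数据包都将被标识为以该混频器作为其同步源。(Note: so is the mixer itself the synchronization source of the mixed packets? Yes, per the sentence above; the original senders are listed as contributing sources.)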

Translator: An intermediate system that forwards RTP packets with their synchronization source identifier intact. Examples of translators include devices that convert encodings without mixing, replicators from multicast to unicast, and application-level filters in firewalls.
翻译器（Translator)：一种中间系统，它转发 RTP包而不改变各包的同步源标识符。翻译器的例子如下：不作混频地转变编码的设备，把多播复制到单播的重复装置，以及防火墙里应用层次的过滤器。

Monitor: An application that receives RTCP packets sent by participants in an RTP session, in particular the reception reports, and estimates the current quality of service for distribution monitoring, fault diagnosis and long-term statistics. The monitor function is likely to be built into the application(s) participating in the session, but may also be a separate application that does not otherwise participate and does not send or receive the RTP data packets (since they are on a separate port). These are called third-party monitors. It is also acceptable for a third-party monitor to receive the RTP data packets but not send RTCP packets or otherwise be counted in the session.
监视器(Monitor)：一种应用程序，它接收RTP会话中参与者发送的RTCP包，特别是接收报告，并估计当前的服务质量，以便进行分发监视、故障诊断和长期统计。监视功能可能内置于参与会话的应用程序中，但也可能是一个单独的应用程序，它不以其他方式参与会话，也不发送或接收RTP数据包（因为它们位于单独的端口上）。这些被称为第三方监视器。第三方监视器也可以接收RTP数据包，但不发送RTCP包，也不以其他方式被计入会话。

Non-RTP means: Protocols and mechanisms that may be needed in addition to RTP to provide a usable service. In particular, for multimedia conferences, a control protocol may distribute multicast addresses and keys for encryption, negotiate the encryption algorithm to be used, and define dynamic mappings between RTP payload type values and the payload formats they represent for formats that do not have a predefined payload type value. Examples of such protocols include the Session Initiation Protocol (SIP) (RFC 3261 [13]), ITU Recommendation H.323 [14] and applications using SDP (RFC 2327 [15]), such as RTSP (RFC 2326 [16]). For simple applications, electronic mail or a conference database may also be used. The specification of such protocols and mechanisms is outside the scope of this document.
非 RTP 途径（Non-RTP means)：除了RTP之外，提供可用服务所需的协议和机制。特别地，对于多媒体会议，控制协议可以分发用于加密的多播地址和密钥，协商要使用的加密算法，并且定义RTP有效负载类型值和它们所表示的有效负载格式之间的动态映射，用于不具有预定义有效负载类型值的格式。此类协议的示例包括会话发起协议（SIP）（RFC 3261[13]）、ITU建议H.323[14]和使用SDP的应用（RFC 2327[15]），例如RTSP（RFC 2326[16]）。对于简单的应用程序，也可以使用电子邮件或会议数据库。此类协议和机制的规范不在本文件的范围内。

4.Byte Order, Alignment, and Time Format 字节顺序、对齐方式和时间格式

All integer fields are carried in network byte order, that is, most significant byte (octet) first. This byte order is commonly known as big-endian. The transmission order is described in detail in [3, Appendix A]. Unless otherwise noted, numeric constants are in decimal (base 10).
所有整数字段都是按网络字节顺序进行的，即首先是最高有效字节（octet）。这种字节顺序通常称为big endian。传输顺序在[3，附录A]中有详细描述。除非另有说明，数字常量是十进制的（以10为基数）。

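Extension:
Big-endian: the low-order byte of a value is stored at the higher memory address, and the high-order byte at the lower address.
Little-endian: the low-order byte of a value is stored at the lower memory address, and the high-order byte at the higher address.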
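
The difference between the two byte orders is easy to see with Python's `struct` module, where `!` selects network (big-endian) order and `<` selects little-endian order. The helper names below are hypothetical and exist only for this illustration.

```python
import struct

def pack_u32_network(value):
    """Encode a 32-bit unsigned integer in network byte order (big-endian)."""
    return struct.pack("!I", value)

def pack_u32_little(value):
    """Encode the same integer in little-endian order, for comparison."""
    return struct.pack("<I", value)

def read_u16_network(buf, offset):
    """Read a 16-bit unsigned field stored in network byte order."""
    return struct.unpack_from("!H", buf, offset)[0]
```

For the value 0x12345678, the network-order encoding starts with byte 0x12 and the little-endian encoding starts with byte 0x78. RTP fields always use the first form, regardless of the host CPU.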

All header data is aligned to its natural length, i.e., 16-bit fields are aligned on even offsets, 32-bit fields are aligned at offsets divisible by four, etc. Octets designated as padding have the value zero.
所有头数据都按其自然长度对齐，即16位字段按偶数偏移对齐，32位字段按可被4整除的偏移对齐，等等。需要填充的字节的值设置为0。

Wallclock time (absolute date and time) is represented using the timestamp format of the Network Time Protocol (NTP), which is in seconds relative to 0h UTC on 1 January 1900 [4]. The full resolution NTP timestamp is a 64-bit unsigned fixed-point number with the integer part in the first 32 bits and the fractional part in the last 32 bits. In some fields where a more compact representation is appropriate, only the middle 32 bits are used; that is, the low 16 bits of the integer part and the high 16 bits of the fractional part. The high 16 bits of the integer part must be determined independently.
时钟(Wallclock)时间（绝对日期和时间）使用网络时间协议（NTP）的时间戳格式表示，相对于1900年1月1日的0h UTC以秒为单位[4]。完整的(The full resolution)NTP时间戳是一个64位无符号定点数，整数部分在前32位，小数部分在后32位。在一些更紧凑的表示法适用的字段中，只使用中间的32位，即整数部分的低16位和小数部分的高16位。整数部分的高16位必须独立确定。
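
The 64-bit format and its compact 32-bit form can be sketched as below. The function names are hypothetical, the input is assumed to be a non-negative Unix time in seconds, and the constant 2208988800 is the number of seconds between 1 January 1900 and 1 January 1970.

```python
NTP_UNIX_OFFSET = 2208988800  # seconds from 1900-01-01 to 1970-01-01

def unix_to_ntp64(unix_seconds):
    """Convert non-negative Unix seconds (float) to a 64-bit NTP timestamp.

    The upper 32 bits hold whole seconds since 1900 (modulo 2**32),
    the lower 32 bits hold the fraction of a second.
    """
    whole = int(unix_seconds)
    seconds = (whole + NTP_UNIX_OFFSET) & 0xFFFFFFFF
    fraction = int((unix_seconds - whole) * (1 << 32)) & 0xFFFFFFFF
    return (seconds << 32) | fraction

def ntp64_to_middle32(ntp64):
    """Keep the low 16 bits of the seconds and the high 16 bits of the fraction."""
    return (ntp64 >> 16) & 0xFFFFFFFF
```

The compact form has a resolution of 1/65536 of a second and repeats every 65536 seconds (about 18 hours), which is why the high 16 bits of the integer part have to be determined by other means.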

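Extension: the Network Time Protocol (NTP) is a protocol for synchronizing computer clocks. It lets a computer synchronize with a server or a clock source (such as a quartz clock or GPS) and provides high-accuracy time correction (within 1 ms of the reference on a LAN, tens of milliseconds on a WAN); it can also use cryptographic authentication to guard against malicious protocol attacks. NTP aims to provide accurate and robust time service in the unordered Internet environment. Providing accurate time requires an accurate time source, which should be Coordinated Universal Time (UTC). NTP can obtain UTC from atomic clocks, observatories, satellites, or over the Internet. Time is then propagated through the hierarchy of NTP servers. The time carried in NTP synchronization messages is UTC, expressed as seconds counted from 1900.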

An implementation is not required to run the Network Time Protocol in order to use RTP. Other time sources, or none at all, may be used (see the description of the NTP timestamp field in Section 6.4.1). However, running NTP may be useful for synchronizing streams transmitted from separate hosts.
实现RTP不一定非要使用NTP( Network Time Protocol)。可以使用其他时间源，或者根本不使用（请参见第6.4.1节中对NTP时间戳字段的说明）。然而，运行NTP对于同步从不同主机传输的流可能是有用的。

The NTP timestamp will wrap around to zero some time in the year 2036, but for RTP purposes, only differences between pairs of NTP timestamps are used. So long as the pairs of timestamps can be assumed to be within 68 years of each other, using modular arithmetic for subtractions and comparisons makes the wraparound irrelevant.

在2036年的某个时候，NTP时间戳将回绕为零（时间戳溢出），但出于RTP目的，只使用NTP时间戳对之间的差异。只要时间戳对之间的距离可以假定在68年以内，那么使用模运算进行减法和比较就不会受到这时间戳回绕的影响。
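
The modular subtraction mentioned above can be sketched for the 32-bit seconds field as follows. The function name is hypothetical; the result is correct as long as the two timestamps are less than 2^31 seconds (about 68 years) apart.

```python
def ntp_seconds_diff(later, earlier):
    """Signed difference between two 32-bit NTP seconds values.

    The subtraction is done modulo 2**32 and the result is then
    interpreted as a signed 32-bit number, so a wrap to zero between
    the two readings does not affect the answer.
    """
    diff = (later - earlier) & 0xFFFFFFFF
    if diff >= 0x80000000:
        diff -= 0x100000000
    return diff
```

For example, a reading of 5 taken shortly after the wrap and a reading of 0xFFFFFFFB taken shortly before it are 10 seconds apart, even though a plain subtraction would give a huge negative number.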

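Extension: the seconds part of an NTP timestamp is a 32-bit integer, similar to 32-bit Unix time. Unix time is signed and counts from 1970, which gives it a range of about 68 years; NTP uses an unsigned integer counted from 1900, so its range is about 136 years and the seconds field wraps on 7 February 2036. Software that does not account for the wrap may misread the time after that date.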
The remaining chapters are available on my Baidu Netdisk: RFC3550 translation, access code: rfic