Media Source Extensions W3C Editor's Draft 1 October 2012
https://rawgit.com/w3c/media-source/fa8c6f5/media-source.html
Media Source Extensions W3C Editor's Draft 09 January 2015
https://w3c.github.io/media-source/
http://www.w3.org/TR/media-source/
WebKit536
https://dvcs.w3.org/hg/html-media/raw-file/e433598d22a7/media-source/media-source.html
W3C Media Source Extensions v0.5 specification
WebKit537
https://dvcs.w3.org/hg/html-media/raw-file/7bab66368f2c/media-source/media-source.html
W3C Media Source Extensions v0.6 specification
Abstract
This proposal extends HTMLMediaElement to allow JavaScript to generate media streams for playback. Allowing JavaScript to generate streams facilitates a variety of use cases like adaptive streaming and time shifting live streams.
1. Introduction
This proposal allows JavaScript to dynamically construct media streams for <audio> and <video>. It defines objects that allow JavaScript to pass media segments to an HTMLMediaElement. A buffering model is also included to describe how the user agent should act when different media segments are appended at different times. Byte stream specifications for WebM & ISO Base Media File Format are given to specify the expected format of media segments used with these extensions.
1.1. Goals
This proposal was designed with the following goals in mind:
- Allow JavaScript to construct media streams independent of how the media is fetched.
- Define a splicing and buffering model that facilitates use cases like adaptive streaming, ad-insertion, time-shifting, and video editing.
- Minimize the need for media parsing in JavaScript.
- Leverage the browser cache as much as possible.
- Provide byte stream definitions for WebM & the ISO Base Media File Format.
- Not require support for any particular media format or codec.
1.2. Definitions
1.2.1. Initialization Segment
A sequence of bytes that contains all of the initialization information required to decode a sequence of media segments. This includes codec initialization data, Track ID mappings for multiplexed segments, and timestamp offsets (e.g. edit lists).
Container specific examples of initialization segments:
- ISO Base Media File Format: A moov box.
- WebM: The concatenation of the EBML Header, Segment Header, Info element, and Tracks element.
1.2.2. Media Segment
A sequence of bytes that contains packetized & timestamped media data for a portion of the presentation timeline. Media segments are always associated with the most recently appended initialization segment.
Container specific examples of media segments:
- ISO Base Media File Format: A moof box followed by one or more mdat boxes.
- WebM: A Cluster element.
1.2.3. Source Buffer
A hypothetical buffer that contains a distinct sequence of initialization segments & media segments. When media segments are passed to append(), they update the state of this buffer. The source buffer only allows a single media segment to cover a specific point in the presentation timeline of each track. If a media segment gets appended that contains media data overlapping (in presentation time) with media data from an existing segment, then the new media data will override the old media data. Since media segments depend on initialization segments, the source buffer is also responsible for maintaining these associations. During playback, the media element pulls segment data out of the source buffers, demultiplexes it if necessary, and enqueues it into track buffers so it will get decoded and displayed. The buffered attribute describes the time ranges that are covered by media segments in the source buffer.
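The override rule and the segment-to-initialization association described above can be illustrated with a toy model. This is only a sketch and not how a user agent implements a source buffer: the ToySourceBuffer class, its appendInit()/appendMedia() methods, and the start/end arguments are hypothetical names chosen for illustration, and the model covers a single track with times given as plain numbers.

```javascript
// Toy model of a single-track source buffer (illustrative assumption,
// not the API defined by this proposal).
class ToySourceBuffer {
  constructor() {
    this.currentInit = null; // most recently appended initialization segment
    this.ranges = [];        // non-overlapping {start, end, init, label}, sorted by start
  }

  // Remember the initialization segment that later media segments depend on.
  appendInit(init) {
    this.currentInit = init;
  }

  // Append media data covering [start, end); newer data overrides older data.
  appendMedia(start, end, label) {
    if (this.currentInit === null) {
      throw new Error('media segment appended before any initialization segment');
    }
    const kept = [];
    for (const r of this.ranges) {
      // Keep only the parts of the old range that the new data does not cover.
      if (r.start < start) kept.push({ ...r, end: Math.min(r.end, start) });
      if (r.end > end) kept.push({ ...r, start: Math.max(r.start, end) });
    }
    kept.push({ start, end, init: this.currentInit, label });
    kept.sort((a, b) => a.start - b.start);
    this.ranges = kept;
  }

  // Merge adjacent ranges into [start, end] pairs, loosely mirroring "buffered".
  buffered() {
    const out = [];
    for (const r of this.ranges) {
      const last = out[out.length - 1];
      if (last && last[1] === r.start) last[1] = r.end;
      else out.push([r.start, r.end]);
    }
    return out;
  }
}
```

Appending a segment for 0 to 10 and then one for 5 to 15 leaves the old data only for 0 to 5, while the reported buffered range is the single span 0 to 15. Each stored range keeps a reference to the initialization segment that was current when it was appended.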
1.2.4. Active Source Buffers
The set of source buffers that are providing the selected video track, the enabled audio tracks, and the "showing" or "hidden" text tracks. This is a subset of all the source buffers associated with a specific MediaSource object. See Changes to selected/enabled track state for details.
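As a rough illustration of this definition, the subset can be computed by filtering on track state. The sketch below uses plain objects as stand-ins for source buffers; the videoTracks, audioTracks, and textTracks arrays on each object are assumptions made for the example, not a statement of how this proposal exposes tracks.

```javascript
// Returns the source buffers that provide the selected video track, an
// enabled audio track, or a "showing"/"hidden" text track.
// The per-buffer track arrays are hypothetical stand-ins.
function activeSourceBuffers(sourceBuffers) {
  return sourceBuffers.filter((sb) =>
    sb.videoTracks.some((t) => t.selected) ||
    sb.audioTracks.some((t) => t.enabled) ||
    sb.textTracks.some((t) => t.mode === 'showing' || t.mode === 'hidden'));
}
```

A buffer whose tracks are all deselected, disabled, or in the "disabled" text track mode is left out of the result.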
1.2.5. Track Buffer
A hypothetical buffer that represents initialization and media data for a single AudioTrack, VideoTrack, or TextTrack that has been queued for playback. This buffer may not exist in actual implementations, but it is intended to represent media data that will be decoded no matter what media segments are appended to update the source buffer. This distinction is important when considering appends that happen close to the current playback position. See Source Buffer to Track Buffer transfer for details.
1.2.6. Random Access Point
A position in a media segment where decoding and continuous playback can begin without relying on any previous data in the segment. For video this tends to be the location of I-frames. In the case of audio, most audio frames can be treated as a random access point. Since video tracks tend to have a more sparse distribution of random access points, the locations of these points are usually considered the random access points for multiplexed streams.
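The rule of thumb above can be sketched as follows. This is a simplification under stated assumptions: frames are plain objects with hypothetical type, time, and keyframe fields, every audio frame is treated as a random access point, and a stream that contains any video uses only the video key frames.

```javascript
// Returns candidate random access point times for a list of frames.
// Frame shape ({ type, time, keyframe }) is an assumption for illustration.
function randomAccessPoints(frames) {
  const video = frames.filter((f) => f.type === 'video');
  if (video.length > 0) {
    // Multiplexed or video-only: the sparser video key frames decide.
    return video.filter((f) => f.keyframe).map((f) => f.time);
  }
  // Audio-only: treat each audio frame as a random access point.
  return frames.filter((f) => f.type === 'audio').map((f) => f.time);
}
```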
1.2.7. Presentation Start Time
The presentation start time is the earliest time point in the presentation and specifies the initial playback position and earliest possible position. All presentations created using this specification have a presentation start time of 0. Appending media segments with negative timestamps will cause playback to terminate with a MediaError.MEDIA_ERR_DECODE error unless timestampOffset is used to make the timestamps greater than or equal to 0.
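The effect of timestampOffset on negative timestamps can be shown with a small calculation. The applyTimestampOffset helper below is hypothetical, and it throws a RangeError as a stand-in for the MediaError.MEDIA_ERR_DECODE condition that a user agent would report; it only models the arithmetic, not the append path.

```javascript
// Shifts segment timestamps by timestampOffset and rejects any result that
// would fall before the presentation start time of 0.
// Helper name and error type are assumptions for illustration.
function applyTimestampOffset(timestamps, timestampOffset = 0) {
  const shifted = timestamps.map((t) => t + timestampOffset);
  const negative = shifted.find((t) => t < 0);
  if (negative !== undefined) {
    throw new RangeError(
      `timestamp ${negative} precedes the presentation start time of 0`);
  }
  return shifted;
}
```

For example, timestamps of -2, -1, and 0 are rejected with no offset, but an offset of 2 maps them to 0, 1, and 2, which are all acceptable.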
1.2.8. MediaSource object URL
A MediaSource object URL is a unique Blob URI created by createObjectURL(). It is used to attach a MediaSource object to an HTMLMediaElement.
These URLs are the same as what the File API specification calls a Blob URI, except that anything in the definition of that feature that refers to File and Blob objects is hereby extended to also apply to MediaSource objects.
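A minimal sketch of the attachment step might look like the following. The attachMediaSource helper and its urlApi parameter are hypothetical; the parameter exists only so that the object providing createObjectURL() can be substituted, and it is assumed here to default to the global URL object.

```javascript
// Creates a MediaSource object URL and assigns it to the media element.
// Helper name and the injectable urlApi parameter are assumptions.
function attachMediaSource(mediaElement, mediaSource, urlApi = globalThis.URL) {
  const url = urlApi.createObjectURL(mediaSource);
  mediaElement.src = url; // the media element now loads from the MediaSource
  return url;
}
```

In a page this would be called with a media element and a MediaSource instance, for example attachMediaSource(document.querySelector('video'), mediaSource).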
1.2.9. Track ID
A Track ID is a byte stream format specific identifier that marks sections of the byte stream as being part of a specific track. The Track ID in a track description identifies which sections of a media segment belong to that track.
1.2.10. Track Description
A byte stream format specific structure that provides the Track ID, codec configuration, and other metadata for a single track. Each track description inside a single initialization segment must have a unique Track ID.
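The uniqueness requirement can be checked with a few lines of code once track descriptions have been extracted. The sketch below assumes each track description is already available as a plain object with a hypothetical trackId field; extracting that field from a real byte stream is format specific and not shown.

```javascript
// Returns the Track IDs that appear more than once in a list of track
// descriptions taken from one initialization segment.
// The { trackId } object shape is an assumption for illustration.
function findDuplicateTrackIds(trackDescriptions) {
  const seen = new Set();
  const duplicates = new Set();
  for (const { trackId } of trackDescriptions) {
    if (seen.has(trackId)) duplicates.add(trackId);
    seen.add(trackId);
  }
  return [...duplicates];
}
```

An empty result means the initialization segment satisfies the rule; any returned ID identifies a conflict.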